Crypto – Public Key with Close Gap p and q Prime Number Vulnerability

Not long ago, I came across a crypto challenge about the generation of private and public keys using p and q that is too close. The use of p and q prime numbers with a gap that is too close turns out vulnerable because both prime values can be brute-forced. Using n (modulus) that is stored in the public key, the p and q can be obtained to calculate the d value that is being used in generation of a private key.

I am assuming you already know about how public and private keys are generated through mathematical calculation.

file: encrypt.py

from Crypto.Util.number import getPrime, bytes_to_long
from Crypto.PublicKey import RSA
from  sympy import nextprime

p = getPrime(1024)
q = nextprime(p)

cnt = 0

while cnt< 9:
    q = nextprime(q)
    cnt+=1

message = "We can't be too close or bad things may happen!"

n = p*q
e = 65537

message_long = bytes_to_long(bytes(message.encode()))

ciphertext = pow(message_long, e, n)

key = RSA.construct((n, e)).exportKey().decode()

with open("key.pem", "w") as f:
    f.write(key)

with open("ciphertext", "w") as f:
    f.write(str(ciphertext))


print("P: \n"+str(p))
print("Q: \n"+str(q))
print("N: \n"+str(n))
print("Ciphertext:\n"+str(ciphertext))

f = open("key.pem", "r")
keys = RSA.importKey(f.read())
f.close()
print("e:\n"+str(keys.e))
print("n:\n"+str(keys.n))

Consider the Python code in encrypt.py. The code chooses a random 1024-bit prime number value for p. Then chooses the next prime number after the one in p for q. After that, the code does another 9 times loops to get the next prime number after the one in q and assign it to q. Variable p now has a value of a randomly chosen prime number and q is a prime number that has a value 10 prime numbers away from p. Illustration using a small prime number, if p is equal to 13, then q will have a value of 53. If p is equal to 19, then q will have a value of 61.

After choosing the prime number values for p and q, the value for n (modulus) is calculated by multiplying p and q. The value for e (public exponent) is 65537. You can find out more about the history behind the number 65537 as a public exponent by google search. Or from this question in StackOverflow.

In the code, you can see the message “We can’t be too close or bad things may happen!”. Since the value is a string, the message is encoded and then converted to bytes. After that using the bytes_to_long function, the message bytes are converted to long for encryption.

The message is encrypted using the calculation of pow(message, e, n). The value of the message is being raised to the power of e, then mod it by n. The encryption result is being assigned to variable ciphertext.

After that, a public key is generated using n and e, then stored as key.pem. The ciphertext is also stored as a file named ciphertext.

Here are examples of files generated after running the encrypt.py:

file: ciphertext

12707629892018805300777227585889543661609135585978370872837666680608231130311120939755822759144533281817972747162986487048825609623415820470603568131694568808740937567171158429945477968142303614657011303830759291434472090398722176826477307382528174315437588159950997675035483002253815436358984749717247348430401389563846655573077769449260421015982686858970355220829289540001221152941163224171675064959664442601006400110531191338292400496991585504806140617165304033128796155225400861291136476496103048100591721171401387045754769412142469693080215886520422116785340684933025115132245299895256283307730949601520501928190

file: key.pem

So how do we solve this challenge? How do we generate d to decipher the ciphertext? This video here will help us understand how to solve this challenge.

We know the value of n is equal to p times q and both p and q have a close gap (small difference value). By using Fermat’s Factorization method the value of p and q can be calculated. Fermat’s Factorization method is based on the representation of an odd integer as the difference between two squares.

N=a² – b²

The equation a² – b² is algebraically factorable as (a+b)(a-b). Thus:

N = (a+b)(a-b)

We can say that p=(a+b) and q=(a-b). Thus:

N = p*q

Since n is a prime number, this method will perfectly help us solve the challenge. First, we need to set the value of a as the ceiling square root of n. After setting the a value, we need to find the b value using loops. Using this equation:

b² = a² – n

we will keep doing loops and increasing the value of a by one until we find b² value that is a square. Here is a part of the code that is going to be used to get the value of p and q:

from Crypto.PublicKey import RSA
from Crypto.Util.number import long_to_bytes
import gmpy2

##Read ciphertext
with open("ciphertext", "r") as f:
    ciphertext = int(f.read())

##Import key.pem
f = open("key.pem", "r")
keys = RSA.importKey(f.read())
f.close()

n = keys.n
e = keys.e

gmpy2.get_context().precision=1025
newN = gmpy2.mpz(n)
a = gmpy2.sqrt(newN)

while True:
    b2 = pow(gmpy2.mpz(a), 2) - newN
    if gmpy2.is_square(b2):
        b = gmpy2.sqrt(gmpy2.mpz(b2))
        break
    a = gmpy2.sqrt(a) + 1

p = int(a+b)
q = int(a-b)

First, import both the ciphertext and public-key key.pem. Using RSA.importKey, we can read the attributes in key.pem. Get the n and e from the key.pem and assign the values to its respective variable. The code is using library gmpy2, because square root calculation of big numbers using python math usually results in an error. Put n into newN using gympy2.mpz to support multi-precision calculation. Then calculate the a as the square root of newN using gmpy2.sqrt(newN).

The next step is to do brute-force calculation using this part:

while True:
    b2 = pow(gmpy2.mpz(a), 2) - newN
    if gmpy2.is_square(b2):
        b = gmpy2.sqrt(gmpy2.mpz(b2))
        break
    a = gmpy2.sqrt(a) + 1

Calculate the value of b2 as the difference between a² and newN. Then check if b2 is a square using gmpy2.is_square(b2). If b2 is a square, then calculate the b as the square root of b2 using gmpy2.sqrt(gmpy2.mpz(b2)) and break from the loops. If b2 is not a square, then add one to the value of a and continue to the next iteration.

After we break from the loops, p will be calculated as (a+b) and q as (a-b). The p-value obtained from the brute force:

137736997922751217765372999668393639524132213552435676101448712778223640662171429634611296687402192149028746048325757696374641634018658024955503821444197663174591429560813421116205793897659340489592455858538226311916555835662281009702190429180761240344489027163204625237253382407021712108154351081855746512543

The q-value obtained from the brute force:

137736997922751217765372999668393639524132213552435676101448712778223640662171429634611296687402192149028746048325757696374641634018658024955503821444197663174591429560813421116205793897659340489592455858538226311916555835662281009702190429180761240344489027163204625237253382407021712108154351081855746508171

You can check whether p and q are prime using gmpy2.is_prime(p) and print(gmpy2.is_prime(q)).

totientN = (p-1)*(q-1)
d = gmpy2.invert(e, totientN)

Since p and q are already obtained, now the value of d can be calculated. Calculate the totient(ϕ) value of n, then using totient(ϕ) and the value of e (65537) a modular multiplicative inverse can be done to get the value of d.

For simplicity, we are just going to use gmpy2.invert(e, totientN). You can read more about modular multiplicative inverse by google search.

At this point, all the values needed to decipher the ciphertext are already obtained. The next step is to decipher the ciphertext.

message_long = pow(ciphertext, d, n)
message = long_to_bytes(message_long)
print("Message: "+str(message))

To decipher the ciphertext, just do the calculation of pow(ciphertext, d, n). The value of ciphertext is being raised to the power of d, then mod it by n. After the calculation, the result is still in a long type so we need to convert it to bytes using Crypto.Util.number.long_to_bytes. After running the message = long_to_bytes(message_long) and printing the value of the message out, you will get a result similar to the image below.

The full source code to solve the challenge is as you can see below:

solver.py

from Crypto.PublicKey import RSA
from Crypto.Util.number import long_to_bytes
import gmpy2
import time

##Read ciphertext
with open("ciphertext", "r") as f:
    ciphertext = int(f.read())

##Import key.pem
f = open("key.pem", "r")
keys = RSA.importKey(f.read())
f.close()

n = keys.n
e = keys.e

gmpy2.get_context().precision=1025
newN = gmpy2.mpz(n)
a = gmpy2.sqrt(newN)
start_time = time.time()
while True:
    b2 = pow(gmpy2.mpz(a), 2) - newN

    if gmpy2.is_square(b2):
        b = gmpy2.sqrt(gmpy2.mpz(b2))
        break
    a = gmpy2.sqrt(a) + 1
end_time = time.time()
p = int(a+b)
q = int(a-b)
print("Exec. time: "+str(end_time-start_time))
print(gmpy2.is_prime(p))
print(gmpy2.is_prime(q))
print("P: \n"+str(p))
print("Q: \n"+str(q))
print("N: \n"+str(n))
print("N: \n"+str(gmpy2.mpz(p*q)))

totientN = (p-1)*(q-1)
d = gmpy2.invert(e, totientN)

message_long = pow(ciphertext, d, n)
message = long_to_bytes(message_long)
print("Message: "+str(message))