RSA Cipher
Created:
2017-09-25
Updated:
2017-09-25
Trivial Attempt
Let’s first try to build a scheme directly using the Fermat’s Little Theorem.
Given a prime number p
and a message m < p
, we choose two numbers e
and
d
such that e·d = 1 (mod p-1)
.
- Encryption:
c = m^e (mod p)
- Decryption:
m = c^d (mod p)
Proof. c^d = m^(e·d) ≡ m^(e·d mod p-1) = m (mod p)
∎
Attack
Assuming that the couple e
, n
is the public key.
Because e·d = 1 (mod p-1)
then 1 = e·d + (p-1)·y
With the Euclidean’s algorithm we can easily find the “secret” d
.
The weakness can be addressed by replacing the modulo p
with a composite
number n
. To apply the attack we now require to use the Euler’s theorem and
thus to find φ(n)
by factoring n
. When the prime factors of n
are big
enough this is believed to be a hard problem.
RSA Scheme
We choose two big primes p
and q
, and we set the modulo n = p·q
. The keys
should satisfy the following property:
e·d = 1 (mod φ(n))
In other words, (e, φ(n)) = 1
.
- Encryption:
c = m^e (mod n)
- Decryption:
m = c^d (mod n)
Proof.
If (m,n) = 1
then thesis immediately follows from Euler’s theorem:
c^d = m^(e·d) ≡ m^(e·d mod φ(n)) = m (mod n)
If (m,n) ≠ 1
then we should have (m,p) ≠ 1
or (m,q) ≠ 1
but not both
otherwise m = p·q·k = 0 (mod n)
.
Let’s assume m = q·x
and (m,p) = 1
, then
m^[(p-1)·(q-1)·z] ≡ 1 (mod p)
→ m^[(p-1)·(q-1)·z] = 1 + p⋅k
For the key choice criteria:
e·d = 1 (mod φ(n)) → e·d = 1 + φ(n)·z = 1 + (p-1)·(q-1)·z
Then
m^(e·d) = m^[1 + (p-1)·(q-1)·z] = m·m^[(p-1)·(q-1)·z] = m·(1 + p·k)
= m + m·p·k = m + (q·x)·(p·k) = m + (q·p)(x·k) = m + n·w
≡ m (mod n)
∎
In practice generating a message not in Zn*
is very unlikely, but as been
shown, the cipher works anyway.
The probability of having a message with (m,n) = 1
is φ(n)/n
.
The Euler’s totient is:
φ(n) = n·(1-1/p)·(1-1/q)
The more p
and q
are big the more 1/p
and 1/q
are close to 0
, thus:
φ(n) ≈ n
This is also good for security.
If a number m
is not coprime with n
, because n = p·q
and m = p·k
or
q·k
(but not both) with k
less than the other prime, then (m,n) = p
or
(m,n) = q
.
In case of a known plaintext attack scenario, to factor n
, an attacker just
has to use the Euclid’s algorithm.
Security Considerations
The security of the RSA cipher mostly relies on the hardness of integer factorization.
While there are no known efficient algorithms for factoring large integers, the security of the cipher is not guaranteed. Advances in computer hardware, mathematics, and cryptography could potentially render the RSA cipher vulnerable to attacks in the future.
If an attacker can factor the modulus, then he can easily compute the decryption exponent and decrypt any message encrypted with the public key.
Analytical Attacks
The possible attack types can be classified by the end goal and sorted by complexity (most complex first):
- Determine the factors of
p
andq
ofn
- Determine the totient
φ(n)
- Determine the secret key
d
- Determine the plaintext
m
The complexity difference between the attacks is just polynomial. Thus, for example, if we have an algorithm to solve problem 1, then we can easily solve all the other problems as well.
Since today the factorization problem is considered hard, an attacker may be inclined trying to directly solve problem 4.
Can be proven that the first three problems are equivalent.
1 solves 2
If we know p
and q
then φ(n) = (p-1)·(q-1)
2 solves 1
Given φ(n)
we can easily find p
and q
φ(n) = (p-1)·(q-1) = p·q - p - q + 1 = n - p - q + 1 (note: q = n/p)
= n - p - n/p + 1
→ φ(n)·p = p·n - p^2 - n + p
→ p^2 + p(φ(n) - n - 1) + n = 0
Solving for p we find two roots who correspond to p
and q
.
2 solves 3
First he computes φ(n) = (p-1)·(q-1)
. Then, given that e·d = 1 (mod φ(n))
and public exponent e
is known, is just a matter of applying the Extended
Euclidean Algorithm to compute the inverse d
.
3 solves 1 (probabilistic)
Given d
we use a probabilistic algorithm that gives p
and q
and thus
φ(n)
.
-
Randomly choose
x ∈ Zn
. -
If
x = gcd(x, n) > 1
then we found a non-trivial factor ofn
, we return(p=x, q=n/x)
. This event has probability2/n
. -
Decompose
e·d - 1 = s·2^r
withs
odd (e·d
is odd, see lemma below) -
Compute the sequence:
x0 = x^s mod n = x^(s·2^0) mod n
x1 = x0^2 mod n = x^(s·2^1) mod n
- …
xr = x_(r-1)^2 mod n = x^(s·2^r) mod n = x^(ed - 1) mod n
For the encryption/decryption property of RSA:
x^(ed - 1) = x^ed · x^(-1) = x⋅x^(-1) ≡ 1 (mod n)
Thus starting from a
x0 ≠ 1
there should be at some point axi = ±1
, and for allj>i
we havexj = 1
. With probability1/2
we find thatxi = 1
. If insteadxi = -1
, then we repeat the procedure with anotherx
.We found
xi ≠ ±1
such thatxi^2 ≡ 1 (mod n)
and thusn
can’t be prime (ref. to Miller-Rabin primality testing).xi^2 - 1 = (xi-1)·(xi+1) ≡ 0 (mod n) → n | (xi-1)·(xi+1)
The factors of
n
can’t all be in the first factor, because otherwisen | (xi-1)
and thusxi ≡ 1 (mod n)
.This means that some factors are in
(xi-1)
and some others in(xi+1)
. Sogcd(n, xi-1)
should return some non-trivial factor ofn
:p
orq
.
Lemma: e·d
is odd.
Proof. Given that φ(n) = (p-1)·(q-1) = (2·x)·(2·y) = 2·z
and that (e,φ(n)) = 1
then e
should be odd, same for d
. Thus, e·d
is odd.
3 solves 4
If we have d
then we can then easily find m
using the decryption procedure.
4 doesn’t solve 3 (conjecture)
Can we find d
if we know an algorithm to find m
without knowing the key?
This is known as the RSAP (RSA Problem) and in practice here we’re asking if RSA is resistant to a chosen ciphertext attack.
There is a conjecture that postulates that this problem is as hard as the factorization problem. If not, RSA would be broken.
Side Channel Attacks
These attacks are using techniques which targets physical aspects of the cipher.
Some attack vectors:
- timing
- power consumption
- injected faults
Probably, the most popular one is the timing attack (Kocher-96).
When using a naive implementation, the execution time of the algorithm already says something about the secret exponent. For example, using the fast exponentiation algorithm where a 1 bit in the exponent corresponds to a multiplication.
The assumption is that the attacker is able to choose the ciphertext to decrypt. That is, a chosen ciphertext attack is possible.
Exponent bits are disclosed one after the other.
This type of attacks are very concrete and effective, more than analytical attacks that today are assumed to be infeasible.
Implementation solutions:
- Constant time: all executions paths perform as the worst case.
- Random delays: insert random delays independent of the exponent.
- Blinding
When attacking a remote system, the attacker requires having a local copy of the remote system. Where he can try different keys.
Blinding
Decryption is not directly applied to the ciphertext.
Steps:
- Choose a random blinding factor:
r ∈ Zn*
; - Blind the ciphertext:
c' = r^e · c = (r·m)^e
- Decrypt the blinded ciphertext:
m' = c'^d = (r·m)^ed = r·m
- Remove the blinding:
m = r^-1·r·m
Because we are decrypting a value not known to the attacker, he can’t replicate the operation in its local device copy.
Blinding the message introduces a performance degradation of approximately 10%.
Another approach is to blind the exponent.
Steps:
- Choose a random blinding factor:
r ∈ Zn*
- Blind the exponent:
d' = d + r·φ(p)
- Decrypt using the blinded exponent:
m = c^d' = c^d (mod p)
Padding
The described RSA cipher is known as textbook RSA.
Textbook RSA has some critical flaws:
Malleability
If an encrypted message is multiplied by a factor k^e
then the corresponding
decrypted message will result multiplied by the factor k
:
c = m^e → c' = k^e·c = (k·m)^e → m' = c'^d = k·m
This property allows to an attacker to manipulate an encrypted message to produce predictable results on the cleartext.
Dictionary Attack
By definition in a public key cipher the attacker has always access to the encryption key.
If the attacker knows the whole set of possible plaintexts then he can then construct a dictionary of all the possible ciphertexts using the public key.
When the attacker intercepts a ciphertext he can then easily determiner what is the corresponding plaintext using a dictionary lookup.
Padding Schemes
One common solution to the above issues is to introduce some random factor as padding to the original message. The recipient should be able to remove the randoms factor after decryption.
Popular padding schemes:
- PKCS#1 v1.5: Specified in PKCS#1 v1.5 standard and probably the most popular.
- OAEP (Optimal Asymmetric Encryption Padding): Specified in PKCS#1 v2.0.
- PSS (Probabilistic Signature Scheme): similar to OAEP for digital signatures.
Failures
Vulnerabilities that emerge in particular use cases of textbook RSA.
Common Modulus Failure
Encrypt the same message m
using the two different encryption keys but with
the same modulus n
:
c1 = m^e1 mod n
c2 = m^e2 mod n
Let’s assume that with high probability (e1,e2) = 1
. An attacker can thus use
the EEA to compute x
and y
such that e1·x + e2·y = 1
.
Finally, he can recover the plaintext:
c1' = c1^x = m^(e1·x)
c2' = c2^y = m^(e2·y)
c1'·c2' = m^(e1·x) · m^(e2·y) = m^(e1·x + e2·y) ≡ m (mod n)
Note that if y < 0
, set y = -a
, with a > 0
. Then c2^y = (c2^-1)^a
and
thus we also require that (c2,n) = 1
.
Low Exponent Failure
The public exponent is small and there are three keys that differs only for the modulus
(3, n1), (3, n2), (3, n3)
c1 = m^3 mod n1
c2 = m^3 mod n2
c3 = m^3 mod n3
Assume (ni, nj) = 1
. If not we already found a factor for the modulus, and
thus we can trivially recover the message m
.
The attacker writes a system of equations:
x ≡ c1 (mod n1)
x ≡ c2 (mod n2)
x ≡ c3 (mod n3)
Using the CRT we construct the unique solution x ∈ Zn
, with n = n1·n2·n3
.
The solution to this system is x = m^3
.
Note that m < n1
, m < n2
, m < n3
, thus m^3 < n1·n2·n3 = n
→ m^3 ∈ Zn
.
Thus, m
is x^-3
: the ordinary cubic root of x
with no modulus involved.
References
- Euler’s theorem
- Chinese Remainder Theorem RSA optimization
- Generation of big primes
- Simple RSA implementation (no crt)
- Common modulus attack PoC here
- CRT signature fault injection PoC here
- Constant time