RSA Cipher

Created: 2017-09-25
Updated: 2017-09-25

Trivial Attempt

Let’s first try to build a scheme directly using the Fermat’s Little Theorem.

Given a prime number p and a message m < p, we choose two numbers e and d such that e·d = 1 (mod p-1).

  • Encryption: c = m^e (mod p)
  • Decryption: m = c^d (mod p)

Proof. c^d = m^(e·d) ≡ m^(e·d mod p-1) = m (mod p)

Attack

Assuming that the couple e, n is the public key.

Because e·d = 1 (mod p-1) then 1 = e·d + (p-1)·y

With the Euclidean’s algorithm we can easily find the “secret” d.

The weakness can be addressed by replacing the modulo p with a composite number n. To apply the attack we now require to use the Euler’s theorem and thus to find φ(n) by factoring n. When the prime factors of n are big enough this is believed to be a hard problem.

RSA Scheme

We choose two big primes p and q, and we set the modulo n = p·q. The keys should satisfy the following property:

e·d = 1 (mod φ(n))

In other words, (e, φ(n)) = 1.

  • Encryption: c = m^e (mod n)
  • Decryption: m = c^d (mod n)

Proof.

If (m,n) = 1 then thesis immediately follows from Euler’s theorem:

c^d = m^(e·d) ≡ m^(e·d mod φ(n)) = m (mod n)

If (m,n) ≠ 1 then we should have (m,p) ≠ 1 or (m,q) ≠ 1 but not both otherwise m = p·q·k = 0 (mod n).

Let’s assume m = q·x and (m,p) = 1, then

  m^[(p-1)·(q-1)·z] ≡ 1 (mod p)
→ m^[(p-1)·(q-1)·z] = 1 + p⋅k

For the key choice criteria:

e·d = 1 (mod φ(n))  →  e·d  = 1 + φ(n)·z = 1 + (p-1)·(q-1)·z

Then

m^(e·d) = m^[1 + (p-1)·(q-1)·z] = m·m^[(p-1)·(q-1)·z] = m·(1 + p·k)
        = m + m·p·k = m + (q·x)·(p·k) = m + (q·p)(x·k) = m + n·w
        ≡ m (mod n)

In practice generating a message not in Zn* is very unlikely, but as been shown, the cipher works anyway.

The probability of having a message with (m,n) = 1 is φ(n)/n.

The Euler’s totient is:

φ(n) = n·(1-1/p)·(1-1/q)

The more p and q are big the more 1/p and 1/q are close to 0, thus:

φ(n) ≈ n

This is also good for security.

If a number m is not coprime with n, because n = p·q and m = p·k or q·k (but not both) with k less than the other prime, then (m,n) = p or (m,n) = q.

In case of a known plaintext attack scenario, to factor n, an attacker just has to use the Euclid’s algorithm.

Security Considerations

The security of the RSA cipher mostly relies on the hardness of integer factorization.

While there are no known efficient algorithms for factoring large integers, the security of the cipher is not guaranteed. Advances in computer hardware, mathematics, and cryptography could potentially render the RSA cipher vulnerable to attacks in the future.

If an attacker can factor the modulus, then he can easily compute the decryption exponent and decrypt any message encrypted with the public key.

Analytical Attacks

The possible attack types can be classified by the end goal and sorted by complexity (most complex first):

  1. Determine the factors of p and q of n
  2. Determine the totient φ(n)
  3. Determine the secret key d
  4. Determine the plaintext m

The complexity difference between the attacks is just polynomial. Thus, for example, if we have an algorithm to solve problem 1, then we can easily solve all the other problems as well.

Since today the factorization problem is considered hard, an attacker may be inclined trying to directly solve problem 4.

Can be proven that the first three problems are equivalent.

1 solves 2

If we know p and q then φ(n) = (p-1)·(q-1)

2 solves 1

Given φ(n) we can easily find p and q

  φ(n) = (p-1)·(q-1) = p·q - p - q + 1 = n - p - q + 1  (note: q = n/p)
       = n - p - n/p + 1
→ φ(n)·p = p·n - p^2 - n + p
→ p^2 + p(φ(n) - n - 1) + n = 0

Solving for p we find two roots who correspond to p and q.

2 solves 3

First he computes φ(n) = (p-1)·(q-1). Then, given that e·d = 1 (mod φ(n)) and public exponent e is known, is just a matter of applying the Extended Euclidean Algorithm to compute the inverse d.

3 solves 1 (probabilistic)

Given d we use a probabilistic algorithm that gives p and q and thus φ(n).

  1. Randomly choose x ∈ Zn.

  2. If x = gcd(x, n) > 1 then we found a non-trivial factor of n, we return (p=x, q=n/x). This event has probability 2/n.

  3. Decompose e·d - 1 = s·2^r with s odd (e·d is odd, see lemma below)

  4. Compute the sequence:

    • x0 = x^s mod n = x^(s·2^0) mod n
    • x1 = x0^2 mod n = x^(s·2^1) mod n
    • xr = x_(r-1)^2 mod n = x^(s·2^r) mod n = x^(ed - 1) mod n

    For the encryption/decryption property of RSA:

    x^(ed - 1) = x^ed · x^(-1) = x⋅x^(-1) ≡ 1 (mod n)

    Thus starting from a x0 ≠ 1 there should be at some point a xi = ±1, and for all j>i we have xj = 1. With probability 1/2 we find that xi = 1. If instead xi = -1, then we repeat the procedure with another x.

    We found xi ≠ ±1 such that xi^2 ≡ 1 (mod n) and thus n can’t be prime (ref. to Miller-Rabin primality testing).

    xi^2 - 1 = (xi-1)·(xi+1) ≡ 0 (mod n) → n | (xi-1)·(xi+1)

    The factors of n can’t all be in the first factor, because otherwise n | (xi-1) and thus xi ≡ 1 (mod n).

    This means that some factors are in (xi-1) and some others in (xi+1). So gcd(n, xi-1) should return some non-trivial factor of n: p or q.

Lemma: e·d is odd.

Proof. Given that φ(n) = (p-1)·(q-1) = (2·x)·(2·y) = 2·z and that (e,φ(n)) = 1 then e should be odd, same for d. Thus, e·d is odd.

3 solves 4

If we have d then we can then easily find m using the decryption procedure.

4 doesn’t solve 3 (conjecture)

Can we find d if we know an algorithm to find m without knowing the key?

This is known as the RSAP (RSA Problem) and in practice here we’re asking if RSA is resistant to a chosen ciphertext attack.

There is a conjecture that postulates that this problem is as hard as the factorization problem. If not, RSA would be broken.

Side Channel Attacks

These attacks are using techniques which targets physical aspects of the cipher.

Some attack vectors:

  • timing
  • power consumption
  • injected faults

Probably, the most popular one is the timing attack (Kocher-96).

When using a naive implementation, the execution time of the algorithm already says something about the secret exponent. For example, using the fast exponentiation algorithm where a 1 bit in the exponent corresponds to a multiplication.

The assumption is that the attacker is able to choose the ciphertext to decrypt. That is, a chosen ciphertext attack is possible.

Exponent bits are disclosed one after the other.

This type of attacks are very concrete and effective, more than analytical attacks that today are assumed to be infeasible.

Implementation solutions:

  • Constant time: all executions paths perform as the worst case.
  • Random delays: insert random delays independent of the exponent.
  • Blinding

When attacking a remote system, the attacker requires having a local copy of the remote system. Where he can try different keys.

Blinding

Decryption is not directly applied to the ciphertext.

Steps:

  • Choose a random blinding factor: r ∈ Zn*;
  • Blind the ciphertext: c' = r^e · c = (r·m)^e
  • Decrypt the blinded ciphertext: m' = c'^d = (r·m)^ed = r·m
  • Remove the blinding: m = r^-1·r·m

Because we are decrypting a value not known to the attacker, he can’t replicate the operation in its local device copy.

Blinding the message introduces a performance degradation of approximately 10%.

Another approach is to blind the exponent.

Steps:

  • Choose a random blinding factor: r ∈ Zn*
  • Blind the exponent: d' = d + r·φ(p)
  • Decrypt using the blinded exponent: m = c^d' = c^d (mod p)

Padding

The described RSA cipher is known as textbook RSA.

Textbook RSA has some critical flaws:

Malleability

If an encrypted message is multiplied by a factor k^e then the corresponding decrypted message will result multiplied by the factor k:

c = m^e  →  c' = k^e·c = (k·m)^e  →  m' = c'^d = k·m

This property allows to an attacker to manipulate an encrypted message to produce predictable results on the cleartext.

Dictionary Attack

By definition in a public key cipher the attacker has always access to the encryption key.

If the attacker knows the whole set of possible plaintexts then he can then construct a dictionary of all the possible ciphertexts using the public key.

When the attacker intercepts a ciphertext he can then easily determiner what is the corresponding plaintext using a dictionary lookup.

Padding Schemes

One common solution to the above issues is to introduce some random factor as padding to the original message. The recipient should be able to remove the randoms factor after decryption.

Popular padding schemes:

  • PKCS#1 v1.5: Specified in PKCS#1 v1.5 standard and probably the most popular.
  • OAEP (Optimal Asymmetric Encryption Padding): Specified in PKCS#1 v2.0.
  • PSS (Probabilistic Signature Scheme): similar to OAEP for digital signatures.

Failures

Vulnerabilities that emerge in particular use cases of textbook RSA.

Common Modulus Failure

Encrypt the same message m using the two different encryption keys but with the same modulus n:

c1 = m^e1 mod n
c2 = m^e2 mod n

Let’s assume that with high probability (e1,e2) = 1. An attacker can thus use the EEA to compute x and y such that e1·x + e2·y = 1.

Finally, he can recover the plaintext:

c1' = c1^x = m^(e1·x)
c2' = c2^y = m^(e2·y)
c1'·c2' = m^(e1·x) · m^(e2·y) = m^(e1·x + e2·y) ≡ m (mod n)

Note that if y < 0, set y = -a, with a > 0. Then c2^y = (c2^-1)^aand thus we also require that (c2,n) = 1.

Low Exponent Failure

The public exponent is small and there are three keys that differs only for the modulus

(3, n1), (3, n2), (3, n3)

c1 = m^3 mod n1
c2 = m^3 mod n2
c3 = m^3 mod n3

Assume (ni, nj) = 1. If not we already found a factor for the modulus, and thus we can trivially recover the message m.

The attacker writes a system of equations:

x ≡ c1 (mod n1)
x ≡ c2 (mod n2)
x ≡ c3 (mod n3)

Using the CRT we construct the unique solution x ∈ Zn, with n = n1·n2·n3.

The solution to this system is x = m^3.

Note that m < n1, m < n2, m < n3, thus m^3 < n1·n2·n3 = nm^3 ∈ Zn.

Thus, m is x^-3: the ordinary cubic root of x with no modulus involved.

References