## D   A   T   A   W   O   K

Creation: January 16 2023
Modified: January 17 2023

# Random Primes Generation

We want to develop an algorithm that generates a prime in the range [2^(k-1), 2^k), with k the number of bits to represent the prime.

Approach:

1. random generation of a k-bits integer n in the interval
2. primality check

During the random generation we can observe that for sure the most significant and least significant bits are set to 1. The intermediate positions are randomly filled.

## Termination

The algorithms terminates and we can give an upper bound to the iterations.

``π(n) = number of primes between [1..n]``

There is a theorem that asserts that:

``π(n) ≈ n/ln(n).``

In other words the probability to choose a prime in the set of the first n numbers is:

``π(n)/n ≈ 1/ln(n)``

This probability decreases, but is always greater than 0.

With our "random" algorithm the probability to get a 2^k bits prime on first step is:

``π(2^k)/(2^k) ≈ 1/ln(2^k) = 1/[k*ln(2)]``

More exactly, since we are ignoring all even numbers, is:

``π(2^k)/[2^(k-1)] ≈ 1/ln[2^(k-1)] = 1/[(k-1)*ln(2)]``

Note that in order to simplify a bit, we are intentionally ignoring the fact that we don't pick the numbers in the range [0..2^(k-1)).

The event to extract a prime can be associated to a Bernoulli's variable that has probability of success Pr = 1/[(k-1)*ln(2)].

The average number of attempts before success is thus 1/Pr = (k-1)*ln(2)

``````k     | ≈ attempts
------|-------------
10    | 6
100   | 68
1000  | 692
10000 | 6930``````

## Primality Test

A brute force algorithm to check if a number n is prime is an exhaustive search of a prime factor up to √n.

A composite number should have at least one divisors x less than or equal √n. If this is not the case then it should have at least two divisors x and y greater than √n and follows that

``n ≥ xy > √n√n = n  →  n > n  →  impossible``

More efficient algorithms to check for primality are probabilistic (aka Montecarlo methods).

Given a probabilistic primality test algorithm A:

``A(n): true if n is probably prime, false otherwise``

If case A returns true the error probability ε (i.e. n is instead composite) is is directly related to the algorithm we've used.

If n is probably prime then the algorithm A can be re-executed and the more it returns true the more likely is that n is indeed prime. If during the iterations A returns false once then we can immediately stop and assert that the number is composite.

The algorithm "fails" if it returns true for a composite number.

``Pr(n is composite | A(n) = true) = ε``

The probability that the algorithm fails k times is a row is ε^k.

### Naive Approach

Extract a random number x in [2, n-1] and compute gcd(x, n):

``A(n): if gcd(x, n) = 1 return true else false``

A witness is a number that allow the algorithm A to decide that n is not prime.

In this case ε, i.e. the probability that gcd(x, n) = 1 when n is not prime, is equal to the prob of choosing x coprime with n, i.e. in Zn, with |Zn| = φ(n). Thus:

``ε ≈ φ(n)/(n-2) = n⋅∏(1 - 1/pi)/(n-2) ≈ ∏(1 - 1/pi)``

With pi the primes in the factorization of n.

The more ε is small the more is high the probability to find a witness.

If the prime factors are all huge then ε ≈ 1.

This means that even though n is not prime it is very unlikely to choose a number (a witness) that is not coprime with n.

In case of RSA we have two big p and q, so given n = p·q there is a very low probability we end up choosing p or q during our primality test. In this case the probability is 2/n, i.e. close to zero.

Reformulating the problem, with this test we can't fix an error probability ε below an upper bound < 1.

Probabilistically ε = φ(n)/n ≈ 1, that is φ(n) is typically very close to n. Follows that we have to find a better algorithm to find a witness.

### Fermat Test

Extends the naive approach by applying the Fermat Theorem to a candidate.

Small Fermat Theorem

Given a prime p and a number x such that (x, p) = 1, then x^(p-1) = 1 (mod p)

The idea is, given a candidate number n, to randomly choose 1 < x < n-1 and compute x^(n-1) mod n. If the result is not 1 then for sure n is not prime, thus we return false.

The result follows because if n is prime and x < n then for sure (x, n) = 1 for the Fermat's theorem.

Note that we are not going to test 1 and (n-1) bases as the Fermat expression returns 1 for all n and for all odd n, respectively. Thus doesn't add any value to test such numbers. That is:

``````* 1^(n-1) = 1 (mod n)   for all n
* if n is odd  →  n-1 is even  →  n-1 = 2k for some k
→  (n-1)^(n-1) mod n = (-1)^2k mod n = 1``````

If x^(n-1) = 1 (mod n), then x may be prime and the error probability ε is:

``prob of choosing x such that (x, n) = 1 AND x^[(n-1)%φ(n)] = 1 (mod n)``

As we'll see there is no upper bound to this probability as for Carmichael's numbers this is always 1.

Fermat Witness. x in Zn with 1 < x < n and x^(n-1) ≠ 1 (mod n).

Because the number of x such that (x,n)=1 is generally close to n, our algorithm must perform well in Zn. That is, in general we're going to search for a witnesses in Zn.

Generally, composite numbers have a fairly high number of witnesses but unfortunately there are some composite numbers that doesn't have any witness.

Note that if (x,n) ≠ 1 then x^(n-1) ≠ 1 (mod n). Proof: x not invertible mod n then x⋅x^(n-2) = 1 (mod n) doesn't have a solution

#### Carmichael Numbers

A composite number n which satisfies x^(n-1) = 1 (mod n) for all integers x which are relative prime to n.

In other words, excluded the numbers that are not relative prime to n, n doesn't have any Fermat witness.

Note that for Euler's theorem given (n,x)=1 then x^φ(n) = 1 (mod n), but here we have that the equation is true also for n-1 at the exponent as if n is prime.

A Carmichael number is thus a number for which the Fermat's theorem holds true even though is not prime.

Theorem. There are Carmichael numbers.

Proof (by construction)

``  561 = 3*11*17 is the first Carmichael number.``

For all x, such that (x, 561) = 1, then x^560 mod 561 = 1.

Using the Little Fermat theorem corollary, for each prime factors pi of 561 we have:

``````  x^560 mod pi = x^(560 mod (pi-1)) mod pi
= x^[(3·11·17 - 1) mod (pi-1)] mod pi
= 1 (... prove this by doing the math)``````

Thus, using the CRT, x^560 can be represented as (1, 1, 1). The solution is thus unique and is x^560 = 1 (mod 561).

Given that the majority of numbers less than n are relative prime to n, follows that is very hard to find a witness for Carmichael numbers. That is, we have to be lucky enough to pick an x such that (x,n) ≠ 1.

Carmichael numbers can be tabulated up to an upper bound. But this is limited and indeed there are small pre-built tables.

Note that if we are unlucky to test for primality a Carmichael number since is very probable that we will find ourself mostly checking random numbers in Zn*. The Fermat test will always return true for all of them, thus is completely wasted time to perform the Fermat test for them.

### Miller-Rabin Test

Extends the Fermat test by appliying the quadratic reminders property to a candidate.

Theorem QR. Given a prime p and x < p. Then x^2 = 1 (mod p) iff x = ±1 (mod p)

Proof: (←) if x of ±1 then obviously x^2 = 1 (→) if x^2 = 1 (mod p) → (x^2 - 1) = 0 (mod p) → (x - 1)(x + 1) = 0 (mod p) → p | (x - 1)(x + 1) -(because p is prime)→ p | (x - 1) or p | (x + 1) → if p | (x - 1) → x - 1 = 0 (mod p) → x = +1 (mod p) → if p | (x + 1) → x + 1 = 0 (mod p) → x = -1 (mod p)

The theorem says that if p is prime, x is its self inverse only iff x = ±1.

Reformulating the theorem for our primality test: if x ≠ ±1 and x^2 = 1 (mod p) then p is not prime.

Example

``````Z8 = {0, 1, 2, 3, 4, 5, 6, 7}
The number 3^2 = 1 (mod 8) and 3 ≠ ±1 (mod 8)
Thus if 8 can't be prime.``````

Note, not all numbers that respect the theorem are primes. For example in Z10 the square roots are 1 and 9 and the property hold. But 10 is not prime.

We search for witness in a smarter way.

``A(x, n) = true iff Miller-Rabin predicate returns true``

### Miller-Rabin predicate.

Given the decomposition of an odd prime candidate n

``n - 1 = m⋅2^r``

Where m is odd.

(In practice we take out all the 2 factors from the factorization of n-1).

We compute a sequence {xi} for i=i..r x0 = x^m mod n x1 = x0^2 mod n = x^(m⋅2) mod n x2 = x1^2 mod n = x^(m⋅4) mod n = x^(m⋅2^2) mod n x3 = x2^2 mod n = x^(m⋅8) mod n = x^(m⋅2^3) mod n ... xr = x^(m⋅2^r) mod n = x^(n-1) mod n

Note that the last element of the sequence (xr) is equal to x^(n-1). and if n is prime then for sure we have that xr = x^(n-1) = 1.

If n is prime, then for the theorem QR we have that if xi = 1 then x(i-1) = ±1.

If x0 is 1 then we can immediately return that it is probably prime. Indeed if x0 = ±1 then for any i > 0 we have xi = 1. But none of the xi values can be used as a witness since the previous one was already ±1. Furthermore xr also doesn't break the Fermat theorem (x^r = x^(n-1) = 1 mod n).

Because the case where we start with a 1 has been early addressed at x0, then if at some point we have xi = 1 then it should be that x(i-1) = -1 (again for theorem QR).

The predicate thus returns false (composite) if:

``x0 ≠ 1 and ∀i in [0,1,..,r-1] xi ≠ -1``

As usual the error probability ε is the probability to choose x that is not a Rabin witness for a composite number n.

Theorem. Every odd composite number n has at least (3/4)⋅n Rabin witnesses. Thus the probability of not choosing a witness x during the test ε < n/4.

Note. there is a companion theorem, easier to prove, that shows that there are at least n/2 Rabin witness.

The algorithm gives a proof of compositeness as well (if required). Represented by the sequence x0...xk

### Factorization attempt

The factorization problem is an independent problem.

Thus primality testing doesn't help a lot.

There is an exception anyway, where the primality test helps with factorization.

If I find a non trivial square of 1 modulo n. In this case we know for The QR that n can't be prime

``````x ≠ ±1
x^2 = 1 (mod n) → x^2-1 = (x-1)(x+1) = 0 (mod n)
→ n | (x-1)(x+1)``````

The factors of n should thus be in the product of (x-1)(x+1). They cannot all be in the first factor, otherwise n | (x-1) → x = 1 (mod n) And this can't be true for hypothesis. It can't be neither that n| (x + 1) → x = -1 (mod n)

This means that some factors of n are in (x-1) and some others are in (x+1).

Thus (n, x-1) should give us a non trivial factor of n. The same holds for (n, x+1).

The number of Carmichael are the ones that are vulnerable. Since are the ones that are catched by the Theorem QR of Miller Rabin.

Quadratic Sieving algorithm is a factorization method that tries to leverage this weakness.

### RSA Note

Can be proven that Carmichael numbers should have at least three elements, so RSA is not affected.

### References

• Fast Rust implementation using `num-bigint` and `rayon` as a gist

Proudly self-hosted on a cheap Raspberry Pi