Breaking Enigma

Created: ๏—ฌ 2018-11-07
Updated: ๏—ฌ 2023-11-18

During World War II, the Enigma machine presented a formidable challenge to the Allied forces as they struggled to decipher encrypted German messages.

However, a team of brilliant codebreakers, including Alan Turing, at Britain’s Bletchley Park successfully cracked Enigma’s codes, turning the tide of the war.

In this post I’m going to describe the core components of the machine and what are the keypoints that allowed the cryptanalysts to break one of the most famous codes of the history.

Construction

Main components:

  • Keyboard: allows the operator to input the plaintext by pressing the corresponding keys.
  • Plugboard: a series of interchangeable cables that connect pairs of letters.
  • Rotors: a set of rotating cipher wheels which are wired together. Each rotor had a unique configuration of wiring inside which cause the input signal to be scrambled before reaching the next rotor.
  • Reflector: a specialized rotor that reflected the signal back through the rotors.
  • Lampboard: display for the encrypted output. When a key is pressed on the keyboard, the corresponding encrypted letter illuminates on the lampboard.

The Enigma machine had different variations and models, each with a specific configuration. These variations primarily involved the number of rotors and the number of cables used in the plugboard.

In these notes I’ll describe one of the first models, where the configuration allowed to choose 3 out of 5 possible rotors and a plugboard with 6 cables.

Operation Steps:

  1. x: keyboard input given by the operator
  2. ๐œŽ: plugboard constant substitution
  3. ๐›ผ: first rotor substitution
  4. ๐›ฝ: second rotor substitution
  5. ๐›พ: third rotor substitution
  6. ๐œ‹: reflector
  7. ๐›พโปยน: third rotor inverse substitution
  8. ๐›ฝโปยน: second rotor inverse substitution
  9. ๐›ผโปยน: first rotor inverse substitution
  10. ๐œŽ: another plugboard constant substitution
  11. output sent to the lampboard

As a formula:

๐œŒ(x) = ๐œŽยท๐›ผโปยนยท๐›ฝโปยนยท๐›พโปยนยท๐œ‹ยท๐›พยท๐›ฝยท๐›ผยท๐œŽ(x) = ๐œŽยท๐œยท๐œŽ(x)

With ๐œ a permutation function that changes after each input character due to the rotors. In particular:

  • 1st rotor advances by one place after each character
  • 2nd rotor advances by one place after 26 characters
  • 3rd rotor advances by one place after 26ยฒ characters

Keyspace Size

The operator is allowed to choose:

  • Three out of five possible ordered rotors: 5ยท4ยท3 = 60
  • Rotors initial configuration: 26ยณ = 17567
  • Plugboard cabling with 6 cables: 100,391,791,500 โ‰ˆ 10ยนยน

Total keyspace size is thus |K| โ‰ˆ 105,869,167,644,240,000 โ‰ˆ 10ยนโท.

Plugboard configurations

We count the ways we can choose 6 different couples from a set of 26 elements:

Couple1 = binom(26,2)
Couple2 = binom(24,2)
...
Couple6 = binom(16,2)

TotOrd = Couple1 ยท Couple2 ยท ... ยท Couple6 = 72,282,089,880,000

Alternatively, we can first count the ways we can choose 12 elements:

N = binom(26, 12)

Then the ways to choose different couples from the set of 12 elements:

Couple1 = binom(12,2)
Couple2 = binom(10,2)
...
Couple6 = binom(2,2)

TotOrd = N ยท Couple1 ยท Couple2 ยท ... ยท Couple6 = 72,282,089,880,000

Finally, since we don’t care about the order of the 6 couples, divide by 6!.

Tot = TotOrd / 6! = 100,391,791,500

Components Properties

Reflector

The reflector function ๐œ‹

  • is an involution (๐œ‹ยท๐œ‹(x) = x);
  • doesn’t have fixed points (โˆ€ x, ๐œ‹(x) โ‰  x).

Plugboard

The plugboard function ๐œŽ

  • is an involution;
  • generally ๐œŽ(x) = x, except for 12 “unstakered” letters where ๐œŽ(x) โ‰  x.

Rotors

The product of rotors function and their inverses is the identity function:

๐›ผโปยนยท๐›ฝโปยนยท๐›พโปยนยท๐›พยท๐›ฝยท๐›ผ(x) = x

Since the rotors’ configuration changes after each input we identify a rotor ๐›พ configuration at step k with the notation ๐›พโ‚– and the configuration of the whole machine at step k as ๐œŒโ‚–

๐œŒโ‚–(x) = ๐œŽยท๐›ผโ‚–โปยนยท๐›ฝโ‚–โปยนยท๐›พโ‚–โปยนยท๐œ‹ยท๐›พโ‚–ยท๐›ฝโ‚–ยท๐›ผโ‚–ยท๐œŽ(x) = ๐œŽยท๐œโ‚–ยท๐œŽ(x)

General properties

Given the properties of the single components:

  • ๐œŒ(x) is an involution (๐œŒยท๐œŒ(x) = x)
  • ๐œŒ(x) doesn’t have fixed points (โˆ€ x, ๐œŒ(x) โ‰  x)

Proof.

Involution property can be trivially proven by expanding ๐œŒยท๐œŒ(x).

Property about fixed points property can be proven by contradiction.

Assume that ๐œŒ has a fixed point x. Then ๐œŽยท๐›ผโปยนยท๐›ฝโปยนยท๐›พโปยนยท๐œ‹ยท๐›พยท๐›ฝยท๐›ผยท๐œŽ(x) = x

We apply the inverses on both sides (inverse of ๐œŽโปยน = ๐œŽ). Then ๐œ‹ยท๐›พยท๐›ฝยท๐›ผยท๐œŽ(x) = ๐›พยท๐›ฝยท๐›ผยท๐œŽ(x) โ†’ ๐œ‹(y) = y

Follows that ๐œ‹ has a fixed point, but this is not possible by construction.

โˆŽ

Not having fixed points is a weakness as it allows discarding some possibilities from the keyspace during cryptanalysis.

Cryptanalysis

This is a known plaintext attack.

A world which is assumed to be present in the plaintext is called a crib.

There is a trick to speed-up checking if the position of a crib is at least possible: we can discard the possibilities with fixed points.

For example, let’s say that the message is assumed to contain the crib “wettervorhersage”, which is the German word for “weather forecast”.

W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
                        ^impossible

  W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
              ^impossible

    W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
                    ^impossible

      W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
      ^ impossible

One possible encryption:

        W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z

The more are the crib’s letters the more is the probability to find the correct offset.

Initial configuration recovery

The initial configuration is taken as the one where the crib begins:

config n.    : 1 2 3 4 5 6 . . .
plain (crib) : W E T T E R V O R H E R S A G E
cipher       : R W I V T Y R E S X B F O G K U

We create a graph with a node for each alphabet letter.

Two nodes are connected by an arch if there is a correspondence in the ciphertext/plaintext. The arch identifier is the configuration number.

W โ†1โ†’ R, E โ†2โ†’ W, T โ†3โ†’ I, T โ†4โ†’ V, E โ†5โ†’ T, ...

The correspondence is symmetric because the encryption is an involution. That is, if W in state 1 is mapped to R then R in state 1 would be mapped to W.

The important thing of this graph are the cycles.

For example, let’s assume to have found the following cycle (not from the previous example):

T โ†3โ†’ N โ†7โ†’ S โ†2โ†’ T

This means that if we connect three enigma machines with the configurations equal to the cycle and with the output of each machine connected to the input of the next one (the last output is connected to the first input), then we constructed a circuit.

Since the cipher is an involution, the input/output distinction is blurred and can be ignored.

For example, we know that if the rotors of one machine are in config ๐œโ‚ƒ then the machine will map T to N and N to T.

If we connect a lamp to the circuit the lamp should turn on.

(๐œโ‚– are the rotors in configuration k)

 +--(T)---[๐œŽยท๐œโ‚ƒยท๐œŽ]---+
 |                   |
 |   +---------------+
 |   |
 |  (N)---[๐œŽยท๐œโ‚‡ยท๐œŽ]--+
๐Ÿ’ก                   |
 |   +---------------+
 |   |
 |  (S)---[๐œŽยท๐œโ‚‚ยท๐œŽ]---+
 |                   |
 +-------------------+

Note that the plugboard configuration ๐œŽ is constant for the whole circuit and can be removed. Given the cycle:

๐œŽยท๐œแตขยท๐œŽ(x) = y  and  ๐œŽยท๐œโฑผยท๐œŽ(y) = x

โ†’ ๐œŽยท๐œโฑผยท๐œŽยท๐œŽโ‹…๐œแตขยท๐œŽ(x) = x
โ†’ ๐œŽยท๐œโฑผยท๐œแตขยท๐œŽ(x) = x
โ†’ ๐œโฑผยท๐œแตขยท๐œŽ(x) = ๐œŽ(x)
โ†’ ๐œโฑผยท๐œแตข(z) = z

Whatever ๐œŽ(x) = z is, we still have a circuit which is dependent only on the rotors configuration and not the actual input or the ๐œŽ function.

Once we detected a cycle we try to find the initial configuration ๐œโ‚ such that ๐œโ‚ƒ, ๐œโ‚‡, ๐œโ‚‚ results in a circuit. We try this for all the possible initial configurations (which are 26ยณ = 17,576).

We can exclude all the configurations that don’t close the circuit.

The more circuits we have to test, the fewer spurious configurations are possible.

Plugboard configuration recovery

Once that we recovered the rotors’ configuration we already recovered all the letters that are not altered by the plugboard (๐œŽ(x) = x), which are 14 out of 26 characters.

By applying the “partial” decryption function to the ciphertext, we end up recovering ~50% of the plaintext. This helps to deduce the missing characters and thus finally recover the full plugboard configuration.

References