# Breaking Enigma

Created:
๏ฌ
2018-11-07

Updated:
๏ฌ
2023-11-18

During World War II, the Enigma machine presented a formidable challenge to the Allied forces as they struggled to decipher encrypted German messages.

However, a team of brilliant codebreakers, including *Alan Turing*, at Britain’s
Bletchley Park successfully cracked Enigma’s codes, turning the tide of the war.

In this post I’m going to describe the core components of the machine and what are the keypoints that allowed the cryptanalysts to break one of the most famous codes of the history.

## Construction

Main components:

**Keyboard**: allows the operator to input the plaintext by pressing the corresponding keys.**Plugboard**: a series of interchangeable cables that connect pairs of letters.**Rotors**: a set of rotating cipher wheels which are wired together. Each rotor had a unique configuration of wiring inside which cause the input signal to be scrambled before reaching the next rotor.**Reflector**: a specialized rotor that reflected the signal back through the rotors.**Lampboard**: display for the encrypted output. When a key is pressed on the keyboard, the corresponding encrypted letter illuminates on the lampboard.

The Enigma machine had different variations and models, each with a specific configuration. These variations primarily involved the number of rotors and the number of cables used in the plugboard.

In these notes I’ll describe one of the first models, where the configuration allowed to choose 3 out of 5 possible rotors and a plugboard with 6 cables.

Operation Steps:

`x`

: keyboard input given by the operator`๐`

: plugboard constant substitution`๐ผ`

: first rotor substitution`๐ฝ`

: second rotor substitution`๐พ`

: third rotor substitution`๐`

: reflector`๐พโปยน`

: third rotor inverse substitution`๐ฝโปยน`

: second rotor inverse substitution`๐ผโปยน`

: first rotor inverse substitution`๐`

: another plugboard constant substitution- output sent to the lampboard

As a formula:

```
๐(x) = ๐ยท๐ผโปยนยท๐ฝโปยนยท๐พโปยนยท๐ยท๐พยท๐ฝยท๐ผยท๐(x) = ๐ยท๐ยท๐(x)
```

With `๐`

a permutation function that changes after each input character due to
the rotors. In particular:

- 1st rotor advances by one place after each character
- 2nd rotor advances by one place after 26 characters
- 3rd rotor advances by one place after 26ยฒ characters

## Keyspace Size

The operator is allowed to choose:

- Three out of five possible ordered rotors:
`5ยท4ยท3 = 60`

- Rotors initial configuration:
`26ยณ = 17567`

- Plugboard cabling with 6 cables:
`100,391,791,500 โ 10ยนยน`

Total keyspace size is thus `|K| โ 105,869,167,644,240,000 โ 10ยนโท`

.

### Plugboard configurations

We count the ways we can choose 6 different couples from a set of 26 elements:

```
Couple1 = binom(26,2)
Couple2 = binom(24,2)
...
Couple6 = binom(16,2)
TotOrd = Couple1 ยท Couple2 ยท ... ยท Couple6 = 72,282,089,880,000
```

Alternatively, we can first count the ways we can choose 12 elements:

```
N = binom(26, 12)
```

Then the ways to choose different couples from the set of 12 elements:

```
Couple1 = binom(12,2)
Couple2 = binom(10,2)
...
Couple6 = binom(2,2)
TotOrd = N ยท Couple1 ยท Couple2 ยท ... ยท Couple6 = 72,282,089,880,000
```

Finally, since we don’t care about the order of the 6 couples, divide by `6!`

.

```
Tot = TotOrd / 6! = 100,391,791,500
```

## Components Properties

### Reflector

The reflector function `๐`

- is an
**involution**(`๐ยท๐(x) = x`

); - doesn’t have
**fixed points**(`โ x, ๐(x) โ x`

).

### Plugboard

The plugboard function `๐`

- is an
**involution**; - generally
`๐(x) = x`

, except for 12 “unstakered” letters where`๐(x) โ x`

.

### Rotors

The product of rotors function and their inverses is the identity function:

```
๐ผโปยนยท๐ฝโปยนยท๐พโปยนยท๐พยท๐ฝยท๐ผ(x) = x
```

Since the rotors’ configuration changes after each input we identify a rotor `๐พ`

configuration at step `k`

with the notation `๐พโ`

and the configuration of the
whole machine at step `k`

as `๐โ`

```
๐โ(x) = ๐ยท๐ผโโปยนยท๐ฝโโปยนยท๐พโโปยนยท๐ยท๐พโยท๐ฝโยท๐ผโยท๐(x) = ๐ยท๐โยท๐(x)
```

### General properties

Given the properties of the single components:

`๐(x)`

is an involution (`๐ยท๐(x) = x`

)`๐(x)`

doesn’t have fixed points (`โ x, ๐(x) โ x`

)

*Proof*.

Involution property can be trivially proven by expanding `๐ยท๐(x)`

.

Property about fixed points property can be proven by contradiction.

Assume that `๐`

has a fixed point `x`

.
Then `๐ยท๐ผโปยนยท๐ฝโปยนยท๐พโปยนยท๐ยท๐พยท๐ฝยท๐ผยท๐(x) = x`

We apply the inverses on both sides (inverse of `๐โปยน = ๐`

).
Then `๐ยท๐พยท๐ฝยท๐ผยท๐(x) = ๐พยท๐ฝยท๐ผยท๐(x)`

โ `๐(y) = y`

Follows that `๐`

has a fixed point, but this is not possible by construction.

โ

Not having fixed points is a weakness as it allows discarding some possibilities from the keyspace during cryptanalysis.

## Cryptanalysis

This is a **known plaintext** attack.

A world which is assumed to be present in the plaintext is called a **crib**.

There is a trick to speed-up checking if the position of a crib is at least possible: we can discard the possibilities with fixed points.

For example, let’s say that the message is assumed to contain the crib “wettervorhersage”, which is the German word for “weather forecast”.

```
W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
^impossible
W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
^impossible
W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
^impossible
W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
^ impossible
```

One possible encryption:

```
W E T T E R V O R H E R S A G E
Q F Z W R W I V T Y R E S X B F O G K U H Q B A I S E Z
```

The more are the crib’s letters the more is the probability to find the correct offset.

### Initial configuration recovery

The initial configuration is taken as the one where the crib begins:

```
config n. : 1 2 3 4 5 6 . . .
plain (crib) : W E T T E R V O R H E R S A G E
cipher : R W I V T Y R E S X B F O G K U
```

We create a graph with a node for each alphabet letter.

Two nodes are connected by an arch if there is a correspondence in the ciphertext/plaintext. The arch identifier is the configuration number.

```
W โ1โ R, E โ2โ W, T โ3โ I, T โ4โ V, E โ5โ T, ...
```

The correspondence is symmetric because the encryption is an involution. That is, if W in state 1 is mapped to R then R in state 1 would be mapped to W.

The important thing of this graph are the **cycles**.

For example, let’s assume to have found the following cycle (not from the previous example):

```
T โ3โ N โ7โ S โ2โ T
```

This means that if we connect three enigma machines with the configurations equal to the cycle and with the output of each machine connected to the input of the next one (the last output is connected to the first input), then we constructed a circuit.

Since the cipher is an involution, the input/output distinction is blurred and can be ignored.

For example, we know that if the rotors of one machine are in config `๐โ`

then
the machine will map `T`

to `N`

and `N`

to `T`

.

If we connect a lamp to the circuit the lamp should turn on.

(`๐โ`

are the rotors in configuration `k`

)

```
+--(T)---[๐ยท๐โยท๐]---+
| |
| +---------------+
| |
| (N)---[๐ยท๐โยท๐]--+
๐ก |
| +---------------+
| |
| (S)---[๐ยท๐โยท๐]---+
| |
+-------------------+
```

Note that the plugboard configuration `๐`

is constant for the whole circuit and
can be removed. Given the cycle:

```
๐ยท๐แตขยท๐(x) = y and ๐ยท๐โฑผยท๐(y) = x
โ ๐ยท๐โฑผยท๐ยท๐โ
๐แตขยท๐(x) = x
โ ๐ยท๐โฑผยท๐แตขยท๐(x) = x
โ ๐โฑผยท๐แตขยท๐(x) = ๐(x)
โ ๐โฑผยท๐แตข(z) = z
```

Whatever `๐(x) = z`

is, we still have a circuit which is dependent only on the
rotors configuration and not the actual input or the `๐`

function.

Once we detected a cycle we try to find the initial configuration `๐โ`

such that
`๐โ`

, `๐โ`

, `๐โ`

results in a circuit. We try this for all the possible initial
configurations (which are `26ยณ = 17,576`

).

We can exclude all the configurations that don’t close the circuit.

The more circuits we have to test, the fewer spurious configurations are possible.

### Plugboard configuration recovery

Once that we recovered the rotors’ configuration we already recovered all the
letters that are not altered by the plugboard (`๐(x) = x`

), which are 14 out of
26 characters.

By applying the “partial” decryption function to the ciphertext, we end up recovering ~50% of the plaintext. This helps to deduce the missing characters and thus finally recover the full plugboard configuration.