Permutations and Letter Chains

Section 4.4 Permutations and Letter Chains

With the development of the Enigma machine and other cryptographic machines, we begin to see more mathematicians being recruited into cryptology. Some of the earliest work on attacking the Enigma machine occurred at the Polish Cipher Bureau (Biuro Szyfrów, pronunciation from Wikipedia). They recruited mathematicians to work on cryptology and attack the Enigma machine. The three most famous ones are Marian Rejewski (pronunciation from Wikipedia), Jerzy Różycki (pronunciation from Wikipedia), and Henryk Zygalski (pronunciation from Wikipedia).

Figure 4.4.1. A Polish stamp honoring Rejewski, Różycki, Zygalski

The Polish mathematicians were initially aided in their work by materials that French officers received from a German defector who provided documents that described how the Enigma machine worked (minus the actual wiring of the rotors) and a code book of settings for the Enigma machine for a certain period of time.

Recall that the Germans used extensive code books which specified the initial settings of the machine. At this time, these included which rotors were used and in which order they were placed into the machine, the ring setting for each rotor and the indicator setting for each rotor, and the plugboard settings. (Later the indicator setting would be transmitted differently.)

It is not cryptographically secure to send many messages using the exact same key. To avoid sending all the messages for a single day using the same key, each Enigma operator would additionally select a three-letter message key. Initially, this three-letter key would be repeated to form a six-letter message (e.g. CKWCKW) which would be encrypted with the day settings and transmitted. Then the initial settings of the rotors would be changed to the message settings (e.g., CKW) and the message would be transmitted. The message key was repeated to allow the receiver of the message to check for garbles that may have occurred in transmission. However, this “repeated encryption” of the message key allowed Rejewski to use theorems about permutations to solve for the wiring of the rotors and find the daily settings.

Investigation Time!

Time for you to explore the idea of Rejewski's work in a simple case in Investigation: Letter Chains.

From Investigation: Letter Chains you should have gotten the idea of how to use letter chains to get information about the daily settings with a small alphabet. To see how this generalizes to the full alphabet we return to the paper enigma. It is rather slow to encrypt lots of messages with the paper enigma so we will use a simulator in Sage Computation 4.4.2.

Sage Computation 4.4.2. Paper Enigma Simulator.

We will construct a set of double encrypted message keys and show how the ciphertext from these encrypted message keys could be used to recover information about the Enigma machine. We select twenty six sets of three letters, repeat them, and then encrypt them with the paper enigma or paper enigma simulator with the settings of rotor I, II, III (in that order from left to right) and the initial rotor positions of MCK. The sets are selected so that every letter in the alphabet occurs once in each spot.

Table 4.4.3. Encryption of Repeated Message Keys

PT	CT	PT	CT	PT	CT
ANTANT	TMLHSM	BOXBOX	VVSLBS	CAPCAP	WRFDIL
DRYDRY	HAQCFE	ELKELK	QEBPZV	FIGFIG	KFDWAH
GTIGTI	NXJIUC	HBOHBO	DSAAOA	IJNIJN	OUCGGZ
JKLJKL	MGTKCP	KGBKGB	FKKJJF	LMHLMH	ZNMBDG
MSUMSU	JBVQNR	NEWNEW	GLRUHQ	ODDODD	IQGSMJ
PFFPFF	UIPERB	QQQQQQ	EDYMWW	RCARCA	YZOTKO
SUMSUM	XJHOTT	THETHE	APZREY	UPSUPS	PHXNVX
VVVVVV	BOUYPK	WWJWWJ	CYIFQD	XYZXYZ	SWEZXN
YZRYZR	RCWVLU	ZXCZXC	LTNXYI

We now construct letters chains between the first and the fourth letter of the ciphertext. We use the first and the fourth letters because they were encrypted with the same letter in the underlying plaintext. We'll see why that's important later. The first ciphertext is TMLHSM. The first letter is T and the fourth letter is H, so we start our letter chain with (TH). Then we find a ciphertext that starts with H. The ciphertext corresponding to DRYDRY is HAQCFE. We now extend our letter chain with the first and fourth letters here. That is, the letter chain is now (THC). We next use ciphertext CYIFQD to continue the letter chain to (THCF). Repeating this process we obtain (THCFJQPNISZBY). Note that cipher text YZOTKO links back to the starting letter of our chain so we close the chain. We haven't used the entire alphabet yet, so we start with a letter of the alphabet not in the first letter chain, such as A. We use the ciphertext APZREY to start our new letter chain (AR). Continuing through the same process we obtain (ARVLXOGUEMKWD). Thus, from the first and fourth letter chains for this setting we get two chains of length 13, namely

(ARVLXOGUEMKWD)(BYTHCFJQPNISZ).

Similarly, we can find the letter chains for second and fifth letters to get

(MSOPEZKJTYQ) (AF) (BNDWXUGCLHV) (IR).

The letter chains for third and sixth letters are

(A) (BVRQENIDHTP) (CZYWUKFLMGJ) (O) (S) (X).

We can use Table 4.4.3 to construct permutations given by this initial setting I:M,II:C,III:K. We will call \(P_1:PT \to CT\) the permutation that describes how each of the first letters in the plaintext messages are enciphered. For example, the plaintext letter A is enciphered as T, so \(P_1(A)=T\text{.}\) It is helpful to remember that the Enigma machine is constructed so that encrypting and decrypting is the same. Thus, if \(P_1(A)=T\) then we also know \(P_1(T)=A\text{.}\)

Table 4.4.4. The Permutation Table for \(P_1\)

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z

T
V
W
H
Q
K
N
D
O
M
F
Z
J
G
I
U
E
Y
X
A
P
B
C
S
R
L

We will also construct the permutation \(P_4\) which describes how each of the fourth letters in the plaintext are enciphered.

Table 4.4.5. The Permutation Table for \(P_4\)

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z

H
L
D
C
P
W
I
A
G
K
J
B
Q
U
S
E
M
T
O
R
N
Y
F
Z
V
X

The Polish mathematicians did not have this information, but we will see how they were able to discover information about these permutations based solely on the encrypted message keys and corresponding letter chains. To do this we next create the composition of permutations \(P_4\) and \(P_1\text{.}\) We first apply the permutation \(P_1\) to each letter and then apply the permutation \(P_4\) to the output letter from \(P_1\text{.}\) For example,

\begin{equation*} P_{4}(P_1)(A)=P_4(P_1(A))=P_4(T)=R \;\; \text{ and} \end{equation*}

\begin{equation*} P_{4}(P_1)(B)=P_4(P_1(B))=P_4(V)=Y\text{.} \end{equation*}

Note that

\begin{equation*} P_{4}(P_1)(R)=P_4(P_1(R))=P_4(Y)=V \end{equation*}

so that the reciprocal property no longer holds. This gives

Table 4.4.6. The Permutation Table for \(P_{4}(P_1)\)

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z

R
Y
F
A
M
J
U
C
S
Q
W
X
K
I
G
N
P
V
Z
H
E
L
D
O
T
B

Note that typically we write composition of functions right to left \(P_{4}(P_1)=P_4\circ P_1=P_4 P_1\text{.}\) But this is a notational choice and Rejewski wrote his composition of functions in the opposite order, from left to right. So if you are interested in reading any of Rejewski's historical documents you'll note that all his permutations are written in a different order.

We next write the composition permutation \(P_{4}(P_1)\) in cycle notation. \(P_{4}(P_1)=\)(ARVLXOGUEMKWD)(BYTHCFJQPNISZ) and we should notice one of the important results that Rejewski used to crack the Enigma machine. We see that \(P_{4}(P_1)\) is given by the letter chains from the ciphertext messages!

Why is this true? It has to do with the \(P_1\) and \(P_4\) acting on the same plaintext letter and the fact that the encryption and decryption process are the same. Let \(PT_1=PT_4\) be the first letter of the plaintext which is the same as the fourth letter of the plaintext, \(CT_1\) be the first letter of the ciphertext, and \(CT_4\) be the fourth letter of the ciphertext. Then

\begin{equation*} P_1(PT_1)=CT_1 \text{ and } P_1(CT_1)=PT_1\\ P_4(PT_4)=CT_4 \text{ and } P_4(CT_4)=PT_4. \end{equation*}

Thus,

\begin{equation*} P_4(P_1)(CT_1)=P_4(P_1(CT_1))=P_4(PT_1)=P_4(PT_4)=CT_4\text{.} \end{equation*}

Thus we see that the composition permutation \(P_{4}(P_1)\) gives the letter chain where \(CT_1\) goes to \(CT_4\text{.}\)

The more impressive thing is that Rejewski discovered he could work backwards. That is, he could use the letter chains to find the permutations \(P_1\) and \(P_4\text{.}\) (Almost -- there's still some guess and check work left to do.)

\(P_1= \qquad\)
A
R
V
L
X
O
G
U
E
M
K
W
D
\(\qquad\)
(=first letter chain)

T
Y
B
Z
S
I
N
P
Q
J
F
C
H
\(\qquad\)
(=second letter chain in reverse order)

The magical thing is that this will always work except for finding where to start writing the cycle on the bottom. There are only 13 possibilities to check though and evidently Rejewski was aided by Enigma operators not making entirely random choices for the three letters message keys. Often they would just use the same letter and pick JJJ for their three letter message key.

Rejewski's theorems on permutations are summarized in the theorem below.

Theorem 4.4.7.

Rejewski's Idea: The cycle structure of the composition \(P_{4}(P_1)\) will always be pairs of cycles that can be matched up as above to produce \(P_4\) and \(P_1\text{.}\)

This theorem is actually true for any composition of permutations where each permutation is a product of transpositions (swaps of two letters).

Rejewski repeated the same process for 2nd and 5th ciphertext letter chains and 3rd and 6th ciphertext letter chains. He used the information for \(P_1, P_2, P_3, P_4, P_5, P_6\) together with the fact that \(P_{i+1}\) should be related to \(P_i\) by a rotation of the right rotor only. This process is based on the likelihood that the middle and left rotor will not move in the encryption of just six letters for most setting. (This process fails if the middle or left rotor moves during these first six letters.) This allowed him to cancel out the action of the left and middle rotor and focus on just the right rotor. The information from the German spy gave him the settings for machine; in particular the settings for the plugboards were known so he could also cancel those out. All of that information was combined to deduce the wirings for the rightmost rotor. He then had to wait for another day when a different rotor was in the rightmost spot to solve for the wirings of that rotor.

Once the wirings were known, the Polish Cipher Bureau could read German Enigma messages since they had the captured information about daily settings. However, once the daily setting information from the German spy ran out, they could also use the letter chains from the ciphertext of the repeated message keys to determine the initial settings. They built a machine, called a cyclometer, to construct the letter chains for all \(105,456=6 \times 26^3\) initial settings of the machine. (At that time, there were only three rotors to choose from, each with 26 possible starting positions.) Essentially the cyclometer connected simulations of two Enigma machines scrambling units, where the second one is stepped forward three positions from the first one to recreate the first to fourth letter chain in the ciphertext. When the current was active, it would light up all of the letters in a cycle or complimentary cycle for a letter chain at a given setting. The important point is that although the plugboards will change the specific letters in the letter chain cycles, they don't change the lengths of cycles. Thus, the Polish mathematics created a catalog of the lengths of all the cycles for each setting. Creating the catalog took a long time, but once it was ready daily keys could be obtained within 10-20 minutes.

While these letter chains and associated permutation theory were put to good use by Rejewski, Różycki, and Zygalski for breaking Enigma messages, as German protocol for encryption techniques changed, they developed many other techniques as well. A grill and clock method were useful early on for determining wheel settings and orders. When the protocol changed and repeated encryptions were no longer sent on daily indicator settings but on individually chosen operator settings, the catalog method no longer worked. An adaption of the catalog method which focused on which settings would allow letter chains of just a single letter was developed. Two different techniques were developed to do this. Rejewski built an electro-mechanical machine called the “bomba kryptologiczna” (cryptologic bomb) which simulated six Enigma machines (essentially three cyclometers). The bomba depended on relatively few letters being steckered to function. Zygalski created a method of perforated sheets to search for these settings. This technique was not dependent on the plugboard settings and was eventually used quite effectively. Construction of appropriate sets of perforated sheets took a long time. As the political situation became more dire, this important work of the Polish cryptographers was shared with the British and French and was very important to the work of the Bletchley Park codebreakers which we will discuss in the next section.

We conclude this section with a brief historical timeline about the Polish work.

1929. Twenty students including Rejewski, Różycki, Zygalski recruited for a cryptography class at the University of Poznan.
1929-1930. Rejewski, Różycki, Zygalski start working at the Polish Cipher Bureau.
1932. Hans-Thilo Schmidt (code name Asche) sells instructions for the German army Enigma machine to the French who share the information with the Polish. These materials include daily settings for September 1932 and October 1932.
1932. Rejewski used the repeated encryption of message keys and the information from the German defector to find the wirings of two rotors. The wheel orderings were constant for September and constant (but different) for October. He then used other methods to deduce the wiring of the last rotor and the reflector.
1933. Różycki and Zygalski joined Rejewski to work on breaking Enigma messages.
1933-35. Used grill method and clock method to determine orders and settings of wheels.
1936. The number of cables used in the plugboard changed from 6 to vary anywhere from 5-8. Cyclometer built to construct catalog of cycle lengths of letter chains.
November 1937. The reflector was changed in the Enigma machine from Reflector A to Reflector B and the catalog had to be reconstructed.
September 1938. Changed the indicator protocol. No longer sent the repeated encryptions of message keys using the daily setting. Each operator chose an indicator setting which was sent in the clear as part of the message preamble. The Enigma machines were reset to this indicator setting on the rotors and then the message setting was encrypted with the repeated encryption. Catalogs no longer worked.
December 1938. Two more rotors were added to the set of rotors. Previous methods were used to determine the wiring (on a channel that was still using the old indicator system.) Zygalski sheets and Rejewski's “bomba kryptologiczna” (cryptologic bomb) were developed. Political situation was deteriorating and invasion seemed imminent.
July 1939: The Polish work is shared with the British and French at a meeting near Warsaw. (Pyry)
September 1939. Germany invades Poland. Britain and France declare war on Germany.
October 1939. Polish mathematicians escape to France. They continue work on Enigma from Vichy France and North Africa.
1940. Complete sets of perforated sheets are constructed at Bletchley Park and a set is given to Rejewski, Różycki, Zygalski in France. These are used to break many Enigma messages (both in France and at Bletchley Park.)
January 1942. Rozycki dies traveling between North Africa and France. (The Lamoriciere is sunk.)
July 1943. Rejewski and Zygalski escape to Britain and work at a Polish Signals Battalion in Boxmoor near London.

Appendix 4.4.A Rejewski's Theorems

For completeness, we include more precise statements of Rejewski's theorems.

Theorem 4.4.8.

If two permutations of the same degree consist only of disjoint transpositions, then their product contains an even number of disjoint cycles of the same length.

Theorem 4.4.9.

If in any permutation of even degree there appears an even number of disjoint cycles of the same length, then the permutation can be regarded as a product of two permutations each of which consists only of disjoint transpositions.

Theorem 4.4.10.

Letters entering into one and the same transposition of permutation X or Y, enter always into two different cycles of the permutation XY.

Theorem 4.4.11.

If two letters found in two different cycles of the same length of the permutation XY belong to the same transposition, then the letters adjacent to them (one to the right, the other to the left) also belong to the same transposition.

Theorem 4.4.12.

If \(H(i) = j\text{;}\) i.e., \(H =( \cdots i j \cdots )\text{;}\) then \(T^{-1}HT = (\cdots T(i)T(j) \cdots)\text{.}\) Notice that this implies that \(H = (\cdots i j \cdots )\) and \(T^{-1}HT = (\cdots T(i)T(j) \cdots)\) have the same disjoint cycle decomposition.

References References

[1]

Chris Christensen, “Polish Mathematicians Finding Patterns in Enigma Messages”, Mathematics Magazine, Vol 80, No 4, October 2007.

[2]

Marian Rejewski, “An Application of the Theory of Permutations in Breaking the Enigma Cipher”, Applicaciones Mathematicae, 16, number 4, (1980), 543–559.

[3]

Gordon Welchman, “From Polish Bomba to British Bombe: The Birth of Ultra”, Intelligence and National Security, Vol 1, No. 1. (1986), 71-110.