In The Imitation Game: Part 1 we looked at the early work of Alan Turing about universal computing machines, the limits of what computers can do, and whether computers could ever successfully imitate human brains. In this post we look at Turing’s work doing codebreaking at Bletchley Park during the Second World War, and the similarities and differences with codebreaking today. The material in this post is based on a talk given by Dr Tom Leinster from the School of Mathematics at the University of Edinburgh, as part of an event called The Maths of the Imitation Game at the Filmhouse cinema.
Bletchley Park, otherwise known as Station X, was a mansion in Milton Keynes that was used during World War 2 as the base for the Government Code and and Cypher School (GC&CS). (Earlier this year it was reopened as a museum, and I’m told it’s very good!) One of its main purposes was to decipher German messages, which were being encrypted using a device called the Enigma machine. It was thought that Enigma was unbreakable, and so it might have been were it not for Turing’s brilliance and subtle mistakes made by the German operators.
In a simple substitution cipher, each letter of the alphabet is replaced by another letter of the alphabet. For example, if we had S->F, H->U, E->R and P->Y, the word SHEEP would be encoded as FURRY. Such a code is easily breakable using frequency analysis: if E is the most common letter in the English language and R is the most common letter in the message, it’s likely that E has been encoded as R. The Enigma machine is much cleverer than that.
In a standard Enigma machine there are 3 rotors. Each rotor has each of the 26 letters of the alphabet inscribed around it, and each is set to an arbitrary position at the beginning of the day. When the operator types a letter on the keyboard, the signal from that key is sent through wiring to the first rotor, which encodes it as a new letter. It is then sent through wiring to the second rotor, which changes it again, and finally it is sent to the third rotor, which changes it again. The signal then goes around some fixed wiring (called a reflector) and then it returns through each of the rotors, finally lighting up a new letter on the lampboard which the operator writes down. Here’s the clever bit: after each new letter is entered, the rotors turn. Therefore, if you had typed S the first time and got an F, you could type S again and get a Z. The same letter is encoded differently every time it is pressed. The message itself is part of the encoding.
There are 3 different rotors, which the operator could choose out of a possible 5. This gives 5x4x3=60 options already for the initial setup. Then, each rotor can start out in any one of 26 different positions. This gives 26x26x26=17,576 options. So far, this is only about a million combinations. It sounds like a lot, but this is at a level where, with a little ingenuity, you could simply brute force all the possibilities. To make Enigma unbreakable a final layer of encryption was added: the plugboard. This used cables to pair up letters. For example, if E was paired with Q, then if E was typed then the machine would interpret this as a Q before transmitting it through the rotors. With this additional scrambling, the number of combinations of an Engima machine was over 159 million million million. There was no way they could ever hope to check all possible settings in a reasonable amount of time. And, to make matters worse, the Germans changed their settings each day, meaning that there was only 24 hours to figure out each code before everything changed.
What Turing did to crack Enigma was to build a machine capable of doing logical calculations that would eliminate a vast number of the possible settings. This was called the bombe and built on earlier work by Polish cryptanalysists. Basically it could try different rotor settings in turn, and look for logical contradictions that would show the settings to be impossible. (For an analogy, think of doing a sudoku puzzle where you might postulate that a 6 goes in a box, but that would result in two 3s in another row which can’t happen, so therefore it can’t be a 6.) Such contradictions might include:
- Deciding that a letter was encoded as itself. This was impossible due to the way the signals through Enigma were sent round the reflector.
- Having an asymmetry in the plugboard. If the B is connected to the N, then the N must also be connected to the B.
- Having a letter in the plugboard connected to its neighbour. Operators were told not to do this.
- Having plugboard settings that were used the previous day. Operators had to change all the settings every day.
There were many more contradictions like these. There were also sloppy practices among the operators. They would often not set the rotors to a truly random initial position, but would use their names (e.g. “BOB”) or would simply turn the rotors a few places from the day before. There were also common phrases in messages, e.g. the word “ein” appeared in 90% of messages, and “Heil Hitler” also appeared often. Wikipedia has a fascinating description of the methods used to crack Enigma, as well as the methods the Germans used to make Enigma even harder to crack.
Nowadays, encryption of information is done using mathematical algorithms rather than mechanical machines. The most commonly used is called RSA, and relies on the difficulty of factorising large numbers into primes. Just as in Turing’s time, codebreakers rely on having a large amount of information to help them look for patterns, and on computers to do the decryption. The big difference between then and now, according to Tom Leinster, is that during the war the government was spying on the Nazis; today they are spying on us.
Earlier this year it was revealed that the UK’s Government Communications Headquarters (GCHQ) and the USA’s National Security Agency (NSA) had been systematically monitoring all of our emails, phonecalls, texts, web browsing and bank transactions. Their goal was to “collect all of the signals, all of the time”, regardless of whether or not anybody had done anything suspicious. And, just as in the film, the codebreakers often had complete autonomy over the information they collected, with even the highest in command being unaware of what they were doing.
Just as in the film, this codebreaking was only possible because of errors in the encryption of the data. The information leaked by Snowdon showed that the NSA had inserted a secret back door into the world’s most widely used cryptosystem, allowing it to break the encryption.
Alan Turing was a homosexual at a time when homosexuality was illegal, and his conviction and subsequent chemical castration were what led to his suicide in 1954. Today it is legal in the UK for two men to have sex, and even get married. This change in our law has come about because of campaigning and activism, but it is always dangerous to be an activist, speaking out for something that is considered against the law. It is easily argued that such campaigning is even more dangerous today, with the government carrying out mass surveillance of everyone in the population.
I shall end this post with the question asked by Tom Leinster in a piece he wrote for New Scientist in April: is it ethical for mathematicians to work for government intelligence agencies like GCHQ?
I look forward to hearing your comments!
Read The Imitation Game: Part 3 about Turing’s work in biology and pattern-formation.