A5/1 TMTO attack description

TL;DR You can go through the A5/1 keyspace and save some “distinguished points”. When you want to recover the key, you reconstruct the keyspace from the nearest distinguished point.
This document contains theoretical description of the attack. There is another one that describes the actual implementation.

Cipher basics

Symmetric cipher takes plaintext and key and creates ciphertext. The other party of the communication takes the ciphertext and the very same key and uses the cipher to decrypt it to plaintext.

Some symmetric ciphers are block ciphers. They operate on blocks of fixed size. They take block of plaintext (usually 64-128 bits) and create block of ciphertext (of the same size). Examples of block ciphers are AES, Blowfish, DES.

          plaintext
              |
              v
key ---->  cipher
              |
              v
          ciphertext

Other case of symmetric ciphers are stream ciphers. The key is used to initialize a pseudorandom number generator (PRNG) and the PRNG generates a sequence called keystream. The keystream is then used as one-time-pad, it is XORed with plaintext. The other party of communication initializes its PRNG with the same key, generates the very same keystream and XORs it with ciphertext. The plaintext appears. Examples of stream ciphers are RC-4 and A5/1

                   plaintext
                      XOR
key ---> PRNG ---> keystream
                       |
                       v
                   ciphertext

(note: block ciphers can be easily modified to stream ciphers)

Stream ciphers are popular for telecommunications. Flipping one bit in block cipher ciphertext destroys the whole block, whereas flipping one bit in stream cipher destroys only this one bit. Hence communication is possible even with weak signal when occasional problems in reception occur.

It is very important not to use one key multiple times, as the one-time-pad property would be destroyed. In GSM, the session key (called Kc, 64-bit) is concatenated with frame number (22-bit) and the resulting 86 bits are used for A5/1 PRNG initialization.

The 22-bit frame number rolls over every 3.75 hours. This is longer than the interval of standard rekeying. Additionally, the attacker will have to match the exact frame which is difficult in network with lots of latency. The attack on this is not practical.

A5/1 basics

Have a look at the A5/1 cipher (source: Wikipedia)

It consists of 3 shift registers of 64 bits total and several XOR gates. When you are clocking the registers, they spit out pseudo-random bits. These are used as keystream.

The cipher has a “secret state” (or “internal state”) defined by values in the shift registers. Once you recover that secret state, you can load it to your cipher and generate the keystream from that point on (and decrypt the rest of the current burst - A5/1 is reinitialized on each burst in GSM). Moreover, you can backclock the cipher by running the shift registers widdershins and recover previous keystream and even the key itself. The solutions for counter-clockwise running are sometimes ambiguous (e.g., sometimes you cannot determine the input to a XOR gate given the result), however, you usually get a few dozens of key candidates, which you can just try. Backclocking is implemented in find_kc utility from kraken-utils.

Recovering cipher secret state is what we want to do.

Naive approach

Notice that the internal state is only 64 bits. It is possible to run naive bruteforce with modern hardware. It will take about 100 years with our A5/1 implementation on an AMD HD 7970 GPU and is infinitely parallelizable. It is impractical, as on standard GSM cell you have several communications each minute.

Precomputed table

You can compute the table mapping internal state to keystream.

Internal state  	Keystream
0x0000000000000000	0xabcdef12345678
0x0000000000000001	0x54632221afed03
0x0000000000000002	0x456dcd562b980e
...
0xffffffffffffffff	0x002accd4dc51df

Once you observe a certain keystream in the air, you can just look up in the table the corresponding internal state. The table can be sorted by the keystream resulting in very efficient (logarithmic) lookups.

However, such table is big. Naively it is 2^64 states * 8 bytes per state = 128 Exabytes.

The effective keyspace is only about 61 bits (because of the solution ambiguity, several states in the cipher collapse to one resulting state), though it is not practical to store Exabytes if you are not a NSA.

Keystream extraction

Oh by the way how do you get that keystream when you know only ciphertext? Well ciphertext = keystream XOR plaintext and hence keystream = ciphertext XOR plaintext. You can try to guess the plaintext and it seems to be rather easy. In GSM it is not possible for a BTS to stop transmission as the phones will lose synchronization. Then even when it has nothing to say, it transmits frames full of padding. So let's XOR the padding (plaintext) with the captured ciphertext to obtain keystream.

One countermeasure to this attack for the operators is to randomize the padding (and other deterministic data, like list of neighboring cells). This faces some technical difficulties in the protocol which are out of scope of this description. Additionally, plaintext recovery heuristics can really improve this attack. For example napalmex from brmlab's GSM stack contains an efficient implementation of this resulting in almost 100% chance of key recovery on insecure networks (like O2 or Vodafone) and 40-50% success rate even on secured networks like T-Mobile.

See Efficient plaintext guessing HOWTO

Table optimization

Let's try to compress the naive lookup table by creating chains. We take a starting secret state, use it as an A5/1 input, and we will not save the result, but use it as another A5/1 input. And then again until we get into a distinguished point (DP). DP can be defined for example by having 12 least significant bits zero. This gives the average chain length of 2^12 = 4096 and the compression ratio 1:4096 (it is worse, read on).

Once you get a keystream to crack, you use it as an input to this chain generator and run it until you reach a distinguished point. Then you look up in your table corresponding starting point and generate the chain again. The chainlink immediately preceding your keystream is the secret value you are interested in.

We have compressed the table chainlength times at a price of worsening the lookup time in a sorted table from log(N) to log(N)+chainlength. This is called time-memory tradeoff.

Merges

Unfortunately, given the cipher ambiguity again, sometimes several chains merge into one. You have then several data multiple times or you don't have some “branches” of the keyspace at all. This is an especially important problem if you lengthen the chain to achieve higher compression ratio.

Merges can be reduced by splitting long chains using colors (this is why these tables are called rainbow tables). We have to change our chain generator by introducing a reduction function. So far it was

Now it is

The reduction function in the easiest form is XORing by a predefined constant. Different colors are assigned different constants. Hence, parts of chains are XORed by different constants. If a merge appears in distinct colors, it is broken by different constants in the next round.

Table coverage for GSM

Rainbow tables can't cover 100% of the keyspace as you will never go through all states when generating (random) chains. But we don't care – in GSM, every frame consists of 4 bursts, 114 bits each. The sample we are trying to crack is 64bits long and hence you have 51 samples per burst or 204 samples per frame. To get a 50% chance to crack a given single frame, you need to cover less than 0.5% of the keyspace! And you usually get more than one frame from a GSM communication.

Thus, we have reduced the size of the naive table:

3TB drive costs $150 as of August 2014.

Please go read the original Karsten Nohl's paper and detailed analysis for a more accurate estimate.

Links: