You probably use encryption, in one form or another, every day. You might not know that you are, but you are. And my guess is that you don’t give it a second thought. Do you have a subscription-based cable or satellite TV service? Guess what, some of that content will be encrypted. Do you connect to websites using https://? That’s more encryption. Ever created a .zip file with a password? You got it, that uses encryption.
I could go on and list dozens of other examples of everyday encryption, but I won’t. As for Android, it also supports encryption, not only for the web with https:// but also for your files and data. Android 6.0 Marshmallow used full disk encryption, while Android 7.0 Nougat has added the option for per-file encryption. The idea is that if your phone should fall into the hands of unfriendlies, then your private data is secure.
So what is encryption? It is the process of taking plain data, including text, and converting it into an unreadable (by humans or computers) form. The encryption process is based on a key, the analogy here being a lock which needs a key, and only people with the key can unlock (decrypt) the data and put it back into its original form. This means that anyone who gets hold of your encrypted data can’t read it unless they have the key.
As the Tom Jericho character in the excellent film Enigma put it, “It turns plain-text messages into gobbledygook. At the other end is another machine, which translates the message back to the original text.” Encryption and decryption!
It all started with Caesar
The art of secret writing, what we would call encryption, has been around for at least 2,500 years. However, the most famous example from antiquity is the substitution cipher used by Julius Caesar to send messages to Cicero. A substitution cipher works like this: you start with the alphabet on one line and then add a second line with the alphabet shifted along a bit:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
If you want to encrypt the word “HELLO” then you take the first letter, H, and look at the letter below it, which gives you E. Then the E gives B and so on. The encrypted form of HELLO is EBIIL. To decrypt it you look up E on the bottom row and see the H above it, then the B on the bottom to get the E above it and so on. Complete the process to get HELLO.
In this case the “key” is 3, because the alphabet has been shifted three to the right (you can also shift to the left instead). If you change the key to, say, 5, then you get this:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
Now the encrypted version of HELLO would be CZGGJ. Very different to EBIIL. In this case the key is 5. Magic!
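The two shifted alphabets above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `caesar_encrypt` and `caesar_decrypt` are made up for illustration, and it follows the convention used in the tables, where a key of 3 replaces each letter with the one three places earlier in the alphabet (so H becomes E).

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_encrypt(plaintext: str, key: int) -> str:
    """Replace each letter with the one `key` places earlier, wrapping at A."""
    result = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) - key) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)

def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Undo caesar_encrypt by shifting the same distance the other way."""
    return caesar_encrypt(ciphertext, -key)

print(caesar_encrypt("HELLO", 3))  # EBIIL
print(caesar_encrypt("HELLO", 5))  # CZGGJ
print(caesar_decrypt("EBIIL", 3))  # HELLO
```

Decryption is just encryption with the key negated, which is why the same table can be read top-to-bottom or bottom-to-top.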
However, there are some major problems with this form of encryption. First of all, there are only 26 keys. You might have heard people talking about 128-bit keys or 256-bit keys; well, this is a 5-bit key (i.e. 26 in binary is 11010). So it wouldn’t take too long to try all 26 variations and see which one starts to produce understandable text.
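To see how little work trying every key really is, here is a rough sketch of a brute-force attack. The helper names `shift_forward` and `brute_force` are hypothetical, and the sketch assumes the same convention as the tables above (encryption moves each letter `key` places earlier, so decryption moves it `key` places later).

```python
import string

ALPHABET = string.ascii_uppercase

def shift_forward(ciphertext: str, key: int) -> str:
    """Move each letter `key` places later in the alphabet, wrapping at Z."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in ciphertext
    )

def brute_force(ciphertext: str) -> dict:
    """Return the candidate plaintext for every one of the 26 possible keys."""
    return {key: shift_forward(ciphertext, key) for key in range(26)}

candidates = brute_force("EBIIL")
for key, text in candidates.items():
    print(key, text)  # a human (or a dictionary check) spots the readable line
```

Only one of the 26 lines printed is an English word, and it appears next to key 3.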
Secondly, English (and other languages) has certain characteristics. For example, E is the most popular letter in English, so if you had a good chunk of text you could see which letter appears the most frequently and then guess that it is E. Shift the bottom alphabet to match E with the most common character and you have probably cracked the code. Also there are only a few letters that can double up in English, like OO, LL, SS, EE and so on. Whenever you see a double like the II or GG (from the examples above) then you should try matching those on the alphabets first.
The combination of the small key and the fact that the same letter always encrypts to the same corresponding letter on the cipher alphabet means that this is very weak encryption. And today with computers doing the hard work, this is beyond weak!
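The "find the most common letter and call it E" trick can be sketched in Python as well. Everything here is illustrative: `shift` and `guess_key` are invented names, the sample sentence was chosen because E clearly dominates it, and on short or unusual texts the guess can easily be wrong.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def shift(text: str, offset: int) -> str:
    """Move each letter `offset` places along the alphabet (negative = earlier)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + offset) % 26] if ch in ALPHABET else ch
        for ch in text
    )

def guess_key(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for E and derive the key."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index("E") - ALPHABET.index(most_common_letter)) % 26

plaintext = "MEET ME NEAR THE SEVEN TREES WHERE WE LEFT THE KEYS"
ciphertext = shift(plaintext, -3)   # encrypt with key 3, as in the first table
key = guess_key(ciphertext)
print(key)                          # 3
print(shift(ciphertext, key))       # the original sentence
```

No keys were tried at all here: a single letter count was enough to recover the shift.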
More alphabets and unbreakable encryption
The weaknesses of the Caesar substitution cipher can be slightly alleviated by using more than one shifted alphabet. The example below can be expanded to 26 shifted alphabets of which several are used at once, but not all of them.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
So if we set the key to WVY that means we use the alphabet starting with W first, then the one starting with V and finally the one starting with Y. This is then repeated to encode the entire message. So HELLO would become DZJHJ. Notice that now the double L in HELLO isn’t encoded as the same character; it is now J and then H. Also, the first J in the encrypted text is the code for L while the second one is the code for O. So J now doesn’t always represent the same plain text letter.
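Here is a small sketch of the multi-alphabet scheme just described, with the key letters cycling through the message. The names `poly_encrypt` and `poly_decrypt` are hypothetical; each key letter selects the row that starts with that letter, which works out to adding the key letter's position to the plaintext letter's position, modulo 26.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def poly_encrypt(plaintext: str, key: str) -> str:
    """Encrypt using one shifted alphabet per key letter, repeating the key."""
    key_letters = cycle(key)
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            row_start = ALPHABET.index(next(key_letters))  # row beginning with the key letter
            out.append(ALPHABET[(ALPHABET.index(ch) + row_start) % 26])
        else:
            out.append(ch)  # non-letters pass through and do not consume a key letter
    return "".join(out)

def poly_decrypt(ciphertext: str, key: str) -> str:
    """Reverse poly_encrypt by subtracting each key letter's position."""
    key_letters = cycle(key)
    out = []
    for ch in ciphertext:
        if ch in ALPHABET:
            row_start = ALPHABET.index(next(key_letters))
            out.append(ALPHABET[(ALPHABET.index(ch) - row_start) % 26])
        else:
            out.append(ch)
    return "".join(out)

print(poly_encrypt("HELLO", "WVY"))  # DZJHJ
print(poly_decrypt("DZJHJ", "WVY"))  # HELLO
```

The two Ls come out as J and H because they line up with different key letters (Y and then W).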
A version of this idea, with 26 alphabets, is the basis of the Vigenère cipher which was published in the 16th century by Blaise de Vigenère. A similar idea was also described by Giovan Battista Bellaso in 1553. The Vigenère cipher remained unbroken for 300 years until it was cracked by Charles Babbage and then by Friedrich Kasiski. The secret to breaking the Vigenère cipher is understanding that ultimately the same words can be encoded using the same letters because the same alphabets are used again and again. So the word “AND” might be encoded differently the first few times it appears, but ultimately it will be encoded using the same letters again. Repetition is generally the downfall of a cipher.
Repetition is the weakness in the Caesar cipher, the Vigenère and all the variants, but there is one way to use an alphabet cipher to create an unbreakable secret code without repetitions: it is called the one-time pad. The idea is that rather than using a shifted alphabet, a random sequence of letters is used. This sequence must be truly random and must be the same length as the message.
I S T H I S U N B R E A K A B L E
P S O V Y V U B M W S P A H Q T D
Rather than doing a straight substitution, this time we use addition, with a twist. Each letter of the alphabet is assigned a number: A is 0, B is 1, C is 2 and so on. I is the 9th letter of the alphabet, which means it has a value of 8. P (the letter below it on our one-time pad) is 15. 8 + 15 = 23, which means X. The second letter of our message is S, which has the value 18. It just so happens that S is also the letter on our one-time pad (which isn’t an issue at all). 18 + 18 = 36. Now here is the twist: there is no 36th letter of the alphabet. So we perform what is called a modulus operation. What that basically means is that we divide the result by 26 (the number of letters in the alphabet) and use the remainder. 36 / 26 = 1 remainder 10. The letter with the value of 10 is K. If you continue doing this the final encrypted message is:
X K H C G N O O N N W P K H R E H
The reason this code is unbreakable is that you only ever use the key (the random string) once. This means that anyone trying to decode the message has no reference point and there is no repetition. The next message to be sent will use a completely different random key and so on.
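The add-then-take-the-remainder step is easy to check in code. The following is a sketch only: `otp_encrypt`, `otp_decrypt` and `generate_pad` are invented names, and `generate_pad` uses Python's `secrets` module as a stand-in for a truly random source, which is an assumption on my part (it is a cryptographically strong generator, not a physical source of randomness).

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def generate_pad(length: int) -> str:
    """Make a pad of `length` letters; it must never be reused for another message."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def otp_encrypt(message: str, pad: str) -> str:
    """Add each message letter to its pad letter, modulo 26 (A=0, B=1, ...)."""
    if len(pad) != len(message):
        raise ValueError("the pad must be exactly as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(m) + ALPHABET.index(p)) % 26]
        for m, p in zip(message, pad)
    )

def otp_decrypt(ciphertext: str, pad: str) -> str:
    """Subtract each pad letter from its ciphertext letter, modulo 26."""
    if len(pad) != len(ciphertext):
        raise ValueError("the pad must be exactly as long as the ciphertext")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(p)) % 26]
        for c, p in zip(ciphertext, pad)
    )

message = "ISTHISUNBREAKABLE"
pad = "PSOVYVUBMWSPAHQTD"            # the pad from the example above
print(otp_encrypt(message, pad))     # XKHCGNOONNWPKHREH
print(otp_decrypt("XKHCGNOONNWPKHREH", pad))  # ISTHISUNBREAKABLE
```

In real use each message would get its own pad from something like `generate_pad(len(message))`, and both parties would need a copy of it.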
The biggest problem with one-time pads is getting the keys to the other party so that they can decrypt the messa