Ciphers: Early Techniques
We have all heard the word ‘encrypted’ used many times on television or in movies. It is usually implies some form of security or an obstacle that needs to be overcome by some button pushing by a man in a lab coat. Shows such as CSI and Law and Order use it, almost indiscriminately and often inaccurately. So what exactly is encryption and what does it mean to say that a message is encrypted? In this post, I shall describe the motivation behind cryptography, a number of early encryption schemes and the idea behind modern cryptography. In the second post in this series, I will describe more modern algorithms as they are used today in computers to protect your identity, privacy and security on the Internet.
The primary motive behind cryptography is privacy, be it at the institutional or the individual level. In many circumstances, we desire to transmit privileged information that is only intended to be received and understood by one recipient. Depending on the privilege and secrecy of the message, we may not want to disclose to other parties a number of facts about the message such as what the contents of the message are, when the message was transmitted, to how many destinations it was transmitted to, who the source was or even, whether the transmission occurred or not. I shall not describe all the details involved in ensuring the above properties but we will explore a number of common scenarios.
Before we start discussing cryptography, we need to understand some of the vocabulary needed to discuss the subject. A scenario that requires cryptography involves a number of agents who operate either independently or co-operatively. Let us only consider simple scenarios that have agents in three groups – the good guys, the bad guys and the swiss. The goal is generally for one of the good guys (let’s call her Alice) to send a message to another good guy (let’s call him Bob) without interference. They are allowed to communicate on an agreed upon communication medium called a channel. The bad guys can interfere with the communication in a number of ways. They can be passive and attempt to collect information which may be useful later, or they may be active and try to block the communication or even try to forge messages between Alice and Bob. Let’s call the passive attacker Eve and the active attacker Malory. To make the communicated message private and secure, Alice encrypts her message (also called plain text) into the coded message. Bob reverses this process by decrypting the coded text back into plain text.
The goal of cryptography is the understand the techniques by which information can be transmitted in such a manner as to maximize the chances that the intended receivers are able to reproduce the message with relative ease while making it unfeasibly hard for adversaries to discover, recover, attack or modify the message or the communication channel. Cryptography is common place in the armed forces. One needs to transmit orders to troops and information about enemy troops and their movements without the enemy being privy to that information. How can one ensure the security of such a message? One way is to ensure that the enemy never gets his hand on the message. This is known as securing the communication channel. It works by preventing the enemy from ever being able to see the message and hence being able the understand what it was about. In olden days, when messages were propagated through parchment or word of mouth, securing the channel involved a strong-willed, trustworthy messenger and, possibly, an armed escort. In the modern day, it involves secure hardware and an understanding of cryptography.
If a trustworthy messenger cannot be found, or his safety cannot be assured, it may be better to communicate in a manner that will make it difficult for an eavesdropper to understand or decipher the message. An early technique in this category was the use of ciphers (hence the word de-cipher). The simplest of ciphers was the displacement-cipher or the rotation-cipher. To use this cypher, all the generals who ever want to communicate with each other meet in private and pick one number between 1 and 25. When one general needs to send a message to another, she takes the message (written in plain English) and shifts each letter forward by the number they had previously agreed to. As an example, if they all picked the number 5, a would get shifted to f, b to g and c to h and so on. Hence, the word cat would be written as hfy. Should the message be found by the enemy, it would look like utter gibberish. The general at the other end would know the secret number and would shift the letters back to understand the message. This method is known as a Caesar cipher – it was first used in ancient Rome. Unfortunately, this primitive method is not a very good one all. A resourceful enemy could easily extract the message by simply trying to shift the letters in said message by every number between 1 and 25. One of these shifted message will be legible in English. Because there are only 25 possibilities the enemy must look through, the message is not very secure it all. It is only mildly inconvenient to the adversary. This form of “try all solutions/guess and check” attack is known as a brute force attack. This method is extremely susceptible to it.
In a slightly more advanced system, instead of shifting every letter in a message by a single offset one could use a different offset for each letter. In other words, each general agrees on a codebook that is kept secret amongst them. The codebook maps each letter of the alphabet to another letter. Care is taken to make sure never to repeat a letter. When Gen. Alice wishes to send a message to Gen. Bob she takes her message, in plain English, and converts each letter into its target code letter. This message can then be deciphered by Gen. Bob by taking characters from the coded message and translating it backwards through the codebook. This sort of cipher is known as the substitution cipher. It makes it harder for the baddies to guess the answer. Instead of simply translating the message 26 times, the enemy has to consider mapping each letter to any other letter. If they attempted to brute force a solution it would take them over 10^26 attempts on average. Even at one attempt a second, the time taken would be orders of magnitude longer than the current age of the entire universe. Instead, the baddies came up with a slightly more ingenious way to break this cipher. The English language does not use all letters with equal frequency. The letter e is used far more often than q, x or z. Hence, one can look at the coded message and try to discern which letter is ‘e’ and then which one is ‘a’ and so on and so forth based on the frequency with which letters appear in the coded text. This type of attack is called a distribution attack, because it uses knowledge about the general characteristics of a message and rate at which different characters are present in it to attack a cipher.
In Part II – Modern Ciphers….