Ciphers: Hiding in Plain Sight
My last post talked about cryptography, its motivation and its primitive techniques. In this post, we shall learn about modern security techniques and how they keep us secure. We will touch upon the mathematical basis behind modern cryptography, talk about symmetric key cryptography and talk about a technique for exchanging secrets out in the open known as the Diffie-Hellman key exchange. To start off the discussion, I would like you to consider the following situation. Alice is a diplomat in a foreign country and has some sensitive information that she would like to communicate to Bob, her superior. The only means of communication available to Alice is a phone line that she and Bob knows is constantly being monitored and eves dropped on. How could Alice and Bob communicate a message in a secure manner starting from this state? We saw in the last post how, with some pre-decided shared secret (such as a codebook when using a substitution cipher), one could attempt to obscure a message and make it hard (but not too hard) to guess. More curiously, starting with no secret state between the two parties, is it possible to establish a secret codebook? In other words, sharing no secrets beforehand, is it possible to begin sharing a secret?
Ciphers: Early Techniques
We have all heard the word ‘encrypted’ used many times on television or in movies. It is usually implies some form of security or an obstacle that needs to be overcome by some button pushing by a man in a lab coat. Shows such as CSI and Law and Order use it, almost indiscriminately and often inaccurately. So what exactly is encryption and what does it mean to say that a message is encrypted? In this post, I shall describe the motivation behind cryptography, a number of early encryption schemes and the idea behind modern cryptography. In the second post in this series, I will describe more modern algorithms as they are used today in computers to protect your identity, privacy and security on the Internet.
Data Compression and Crowded Pigeons
Have you ever used a data compression program? WinZip, WinRAR, bzip2 and gzip are all common data compression software. If you have ever emailed large pictures, sent or received text files or pdfs, you may have run across these pieces of software. In general, they take a set of input files, and output a compressed file, which when uncompressed, produce the original files again. The quite amazing thing that these software applications do is to produce a compressed file whose size is less than the sum of the sizes of the input files. This is great! But have you wondered, what would happen if you tried to compress a compressed file? Would it get smaller? Could you not, therefore, repeatedly compress the same file over and over till it became insignificantly small? What is to stop this? Another question you may ask is, given a compressor, can you compress all files to yield a smaller file? Why do some files compress more than others?
2 Goats 1 Prize
Consider this scenario:
You have almost won at a game show! Your host, Monty, gives you one final task. He lifts the curtains to reveal three closed doors, labeled A, B, and C. He informs you that behind the doors, in some order, there are two goats and a grand prize. The doors are identical and there is no way to tell what is behind them. He asks you to pick a door, which you do, picking one at random (let’s say A). Monty turns around and, in a fit of generosity, tells you that he will help you out by opening one of the unselected doors that contains a goat (let’s say B). Having taken one of the doors out of the running, he now gives you a choice. “Would you like to switch your selection to C or stay with A?”, he asks.
As you stand there, what should you do to have the best chance of winning the grand prize? Is there a systematic way to find out?
Confirmatory Tests and False Positives
Consider this. You feel ill, are running a fever, and feel dizzy. You go to a nearby hospital and the attending physician suspects that you are suffering from a rare illness called Beetleguese fever. He orders a test to check his hypothesis which comes back positive for Beetleguese fever. Having seen this test result, you would expect the physician to start treating you immediately. Instead, he orders another test to confirm the diagnosis. Why? Isn’t that just a waste of time and money? Maybe the test was not accurate enough. You ask the physician how accurate the test is. He tells you that it is 99.999% accurate. “That’s great!”, you exclaim. The physician retorts, “It’s not good enough. We need to confirm it independently”. Why would he say that? How much more accurate can you get?
If Beetleguese fever is a very rare disease, then the physician is right. Let us assume that you live in a country with a population of 10 million people. At any given time, let us assume that 100 individuals in the population have Beetleguese fever (the physician mentioned that it was very rare). If a random person from this population was chosen and tested for Beetleguese fever using our 99.999% accurate test, and the test result came back as positive for the disease, the probability that the subject actually has Beetleguese fever is only around 50%! If you find this fact hard to believe, read on and I will show you exactly how we arrive at this number and why this is the case.