# Confirmatory Tests and False Positives

Consider this. You feel ill, are running a fever, and feel dizzy. You go to a nearby hospital, and the attending physician suspects that you are suffering from a rare illness called Beetleguese fever. He orders a test to check his hypothesis, and it comes back positive for Beetleguese fever^{[1]}. Having seen this test result, you would expect the physician to start treating you immediately. Instead, he orders another test to confirm the diagnosis. Why? Isn’t that just a waste of time and money? Maybe the test was not accurate enough. You ask the physician how accurate the test is. He tells you that it is 99.999% accurate. “That’s great!” you exclaim. The physician retorts, “It’s not good enough. We need to confirm it independently.” Why would he say that? How much more accurate can you get?

If Beetleguese fever is a very rare disease, then the physician is right. Let us assume that you live in a country with a population of 10 million people. At any given time, let us assume that 100 individuals in the population have Beetleguese fever (the physician mentioned that it was very rare). If a random person from this population was chosen and tested for Beetleguese fever using our 99.999% accurate test, and the test result came back as positive for the disease, the probability that the subject actually has Beetleguese fever is only around 50%! If you find this fact hard to believe, read on and I will show you exactly how we arrive at this number and why this is the case.

How would one go about analyzing a problem like this? The first step is to formulate the question precisely. Next, specify all of the assumptions involved. With these in place, work your way logically, in a ruthless, step-by-step manner, towards the answer to your question. Let us formulate the question. **Given that a person randomly chosen from our population has tested positive for Beetleguese fever, what is the probability that he actually has the fever?** Our assumptions are:

- The population has 10 million individuals.
- 100 individuals in this population have Beetleguese fever.
- Our test is 99.999% accurate.

I need to clarify the last bullet. When I say ‘accurate’, I mean that if you have the dreaded fever, the test will come back positive 99.999% of the time. I also mean that if you do not have the fever, the test will come back negative 99.999% of the time. This means that there is a small (0.001%) chance that you will show up positive even if you are not sick. This is called a false positive.

Now, let us ask ourselves – what does the question mean? If I simply asked you “What is the probability that a random person in this population is sick with Beetleguese fever?”, you would say that it is the ratio of the number of sick people to the total number of people in the population. This works out to 100 divided by 10 million, which is 0.00001 (or 0.001%). Similarly, **the probability that a person has the disease, given that he has tested positive, is the ratio of the number of sick people who would have tested positive to the total number of people who would have tested positive**.

How many people in total would have tested positive (if we tested everyone)? We can break this question into two parts – how many sick people would have tested positive and how many healthy people would have tested positive. Now, our test is 99.999% accurate in detecting sick people. Hence, basically all 100 sick people would end up testing positive. Yay! Our test works! How many healthy people would have tested positive? As mentioned earlier, the test incorrectly returns a positive result on healthy subjects 0.001% of the time. Given 9,999,900 people in the population who do not have Beetleguese fever, the test would incorrectly flag about 100 people (0.001% of 9,999,900 is 99.999, which is roughly 100). Hence, the total number of people who would be flagged (if everyone were tested) is about 200. Therefore, the ratio of the number of sick people who have tested positive to the total number of people who have tested positive is 100/200, which is 0.5 (or 50%). With this, we can now answer our question. **Given that a person randomly chosen from our population tests positive for Beetleguese fever, the probability that he actually has it is 50%.**
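The counting argument above can be checked with a few lines of Python. This is only a minimal sketch under the stated assumptions (10 million people, 100 of them sick, a test that is right 99.999% of the time for both sick and healthy people). The function name `probability_sick_given_positive` is made up for illustration, and exact fractions are used so that rounding does not blur the result.

```python
from fractions import Fraction


def probability_sick_given_positive(population, sick, accuracy):
    """Share of positive results that belong to people who are actually sick.

    Assumes the test is correct with probability `accuracy` for sick and
    healthy people alike, and that everyone in the population is tested.
    """
    healthy = population - sick
    true_positives = sick * accuracy             # sick people correctly flagged
    false_positives = healthy * (1 - accuracy)   # healthy people wrongly flagged
    return true_positives / (true_positives + false_positives)


accuracy = Fraction(99_999, 100_000)  # 99.999%
result = probability_sick_given_positive(10_000_000, 100, accuracy)

print(float(100 * accuracy))              # expected sick positives: 99.999
print(float(9_999_900 * (1 - accuracy)))  # expected healthy positives: 99.999
print(float(result))                      # 0.5
```

With these particular numbers the expected counts of true and false positives are both 99.999, which is why the ratio comes out to exactly one half.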

This result is interesting and non-intuitive. In the real world, tests are rarely 99.999% accurate. Many common tests (such as HIV tests, drug tests, and pregnancy tests) are 99.9% accurate at best. You can quickly see how a single positive result cannot always be trusted. This is the reason why physicians do confirmatory tests. It can be shown (using similar probabilistic arguments) that if a person tests positive on two independent tests, there is a negligible chance that it was due to false positives: with the numbers above, the probability that the person is actually sick rises from 50% to 99.999%. And this is why it is not a waste of time to confirm a diagnosis before treatment. After all, you don’t want to get treated for the wrong disease due to a 50-50 chance now, do you?
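To see what a confirmatory test buys you, here is a rough sketch in Python that repeats the same ratio argument twice. It assumes the second test is independent of the first and just as accurate (99.999% for both sick and healthy people), which may not hold for real tests. The helper name `update` is hypothetical. The probability of being sick after the first positive result becomes the starting point for the second.

```python
from fractions import Fraction


def update(prior, accuracy):
    """Probability of being sick after one positive result.

    `prior` is the probability of being sick before the test. The test is
    assumed to be correct with probability `accuracy` for sick and healthy
    people alike.
    """
    true_positive = prior * accuracy
    false_positive = (1 - prior) * (1 - accuracy)
    return true_positive / (true_positive + false_positive)


prior = Fraction(100, 10_000_000)      # 100 sick people in 10 million
accuracy = Fraction(99_999, 100_000)   # 99.999%

after_one = update(prior, accuracy)      # after the first positive test
after_two = update(after_one, accuracy)  # after an independent second positive

print(float(after_one))  # 0.5
print(float(after_two))  # 0.99999
```

Under these assumptions, the chance that two independent positive results are both wrong is about 0.001%, compared with 50% after a single positive result.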

### Further Reading

If you would like to learn the details on how one can compute probabilities like this in general, read up:

^{[1]} When a test returns positive, the test is telling you that you have the disease. When it comes back negative, it is telling you that you do not have the disease.