Rip Stauffer

The Importance of Understanding Conditional Probability

It helps to build a table

Published: Wednesday, June 6, 2018 - 12:03

A lot of people in my classes struggle with conditional probability. Don’t feel alone, though. A lot of people get this (and simple probability, for that matter) wrong. If you read Innumeracy by John Allen Paulos (Hill and Wang, 1989), or The Power of Logical Thinking by Marilyn vos Savant (St. Martin’s Griffin, 1997), you’ll see examples of how misunderstanding or misusing conditional probability has put innocent people in prison and ruined many careers. It’s one of the reasons I’m passionate about statistics, but it’s hard for me, too, because it’s not easy to work out in your head. I always have to build a table.

The best thing to do is to be completely process-driven; identify what’s given, then follow the process and the formulas religiously. After a while, you can start to see it intuitively, but it does take a while.

In my MBA stats class, one problem that always stumped the students was a conditional probability problem:

“Pregnancy tests, like almost all health tests, do not yield results that are 100-percent accurate. In clinical trials of a blood test for pregnancy, the results shown in the accompanying table were obtained for the Abbot blood test (based on data from ‘Specificity and Detection Limit of Ten Pregnancy Tests’ by Tiitinen and Stenman, Scandinavian Journal of Clinical Laboratory Investigation, 53, Supplement 216). Other tests are more reliable than the test with results given [in figure 1].

|                         | Positive result | Negative result |
|-------------------------|-----------------|-----------------|
| Subject is pregnant     | 80              | 5               |
| Subject is not pregnant | 3               | 11              |

Figure 1

“1. Based on the results in the table, what is the probability of a woman being pregnant if the test indicates a negative result?”

“2. Based on the results in the table, what is the probability of a false positive; that is, what is the probability of getting a positive result if the woman is not actually pregnant?”

Everyone would just try to look at it as though there were no conditions... they would say, 5/80 for question 1, and 3/80 for question 2.

The first question, though, is asking, “What is the chance of being pregnant, given a negative result?” There were 16 negative results, and of those, five were pregnant. So the answer is 5/16, or 31.25 percent. For the second question, it’s, “What is the probability of a positive, given that the woman is not pregnant?” In this case, there are 14 nonpregnant women, and three of those got a positive result. So that’s 3/14, or about 21.43 percent.
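Here is a minimal sketch of that arithmetic in Python, using the counts from figure 1. The helper name `probability` and the dictionary layout are just illustrative choices; the point is that the "given" decides which total goes in the denominator.

```python
from fractions import Fraction

# Counts from figure 1, keyed by (true condition, test result).
counts = {
    ("pregnant", "positive"): 80,
    ("pregnant", "negative"): 5,
    ("not pregnant", "positive"): 3,
    ("not pregnant", "negative"): 11,
}

def probability(counts, condition, result, given):
    """Probability of one cell, conditioned on its row or column total.

    given="result"    -> denominator is everyone with that test result
    given="condition" -> denominator is everyone with that true condition
    """
    if given == "result":
        denominator = sum(n for (c, r), n in counts.items() if r == result)
    elif given == "condition":
        denominator = sum(n for (c, r), n in counts.items() if c == condition)
    else:
        raise ValueError("given must be 'result' or 'condition'")
    return Fraction(counts[(condition, result)], denominator)

# Question 1: pregnant, given a negative result.
q1 = probability(counts, "pregnant", "negative", given="result")
# Question 2: positive result, given not pregnant.
q2 = probability(counts, "not pregnant", "positive", given="condition")

print(f"Question 1: {q1} = {float(q1):.2%}")
print(f"Question 2: {q2} = {float(q2):.2%}")
```

Swapping the `given` argument on the same cell gives a different answer, which is exactly the trap the students fell into.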

These numbers, and this idea, are really important. Some statisticians make their living explaining these concepts to juries. People get fired or arrested because of false positives on urinalysis and other tests, because there is a general impression that they are far more reliable than they actually are.

It’s all about what you are given, and how you define things. Let’s look at a different example. In the military, people are given random drug screenings. The test is “certified 99-percent accurate.” I was always told that this means that if you do drugs, and you’re tested, it will catch you 99 percent of the time.

We think, “logically,” that this means there is only a 1-percent false negative rate... that someone who does drugs gets away with it only 1 percent of the time. Worse, we assume that if the “false negative rate” is only 1 percent, the false positive rate must also be 1 percent... it’s just common sense, right?

But “common sense” isn’t... it’s neither common nor truly sensical. Look at it this way... suppose we test 100,000 service members. Suppose further that 0.1 percent, or one in a thousand, of those service members actually do drugs. We might get the table shown in figure 2.

 

|               | Do Drugs | Don’t Do Drugs |
|---------------|----------|----------------|
| Test Positive | 99       | 999            |
| Test Negative | 1        | 98,901         |

Figure 2
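The counts in figure 2 can be rebuilt from the assumptions in the text. This is only a sketch: the function name `screening_table` and its parameter names are hypothetical, and the 1-percent positive rate among non-users is the "common sense" assumption described above, not a published property of any real test.

```python
def screening_table(n_tested, prevalence, catch_rate, false_alarm_rate):
    """Expected counts for a screening test, rounded to whole people.

    prevalence       -- assumed share of people tested who do drugs
    catch_rate       -- assumed share of users who test positive
    false_alarm_rate -- assumed share of non-users who test positive
    """
    users = round(n_tested * prevalence)
    non_users = n_tested - users
    users_positive = round(users * catch_rate)
    non_users_positive = round(non_users * false_alarm_rate)
    return {
        ("do_drugs", "positive"): users_positive,
        ("do_drugs", "negative"): users - users_positive,
        ("dont_do_drugs", "positive"): non_users_positive,
        ("dont_do_drugs", "negative"): non_users - non_users_positive,
    }

# 100,000 tested, 0.1 percent users, "99-percent accurate" in both directions.
table = screening_table(100_000, 0.001, 0.99, 0.01)
for cell, count in table.items():
    print(cell, f"{count:,}")
```

Changing `prevalence` while leaving the other two arguments alone is a quick way to see how strongly the table depends on how rare drug use is among the people tested.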

Tables like this are informative, but they don’t tell the whole story. You can see from this that the company is technically correct... at least in this case, of 100 people who did drugs, 99 were caught and one was not. But a false positive rate and a false negative rate are made up of more. To get to the whole story, it’s also good to do the marginals, or row and column totals as shown in figure 3.

 

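|               | Do Drugs | Don’t Do Drugs | Totals |
|---------------|----------|----------------|--------|
| Test Positive | 99       | 999            | 1,098  |
| Test Negative | 1        | 98,901         | 98,902 |
| Totals        | 100      | 99,900         |        |

Figure 3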

Numbers like these, the numbers of people tested, are very important. They help us figure out our givens. The false negative rate, as I’m using the term here, is not the fraction of drug users who tested negative. It’s the fraction of all the people who tested negative who actually did drugs. In this case, the false negative rate is much better than advertised... it’s 1/98,902, or about 0.00001. Only about one in 100,000 people who test negative actually do drugs.
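A short sketch of the marginals, using the counts from figure 3, shows how the same single cell gives two very different rates depending on which total it is divided by. The variable names are illustrative only.

```python
# Counts from figure 3, keyed by test result and then by actual behavior.
table = {
    "test_positive": {"do_drugs": 99, "dont_do_drugs": 999},
    "test_negative": {"do_drugs": 1, "dont_do_drugs": 98_901},
}

# Row totals: how many people got each test result.
row_totals = {result: sum(cells.values()) for result, cells in table.items()}

# Column totals: how many people are in each behavior group.
column_totals = {
    group: sum(table[result][group] for result in table)
    for group in ("do_drugs", "dont_do_drugs")
}

missed = table["test_negative"]["do_drugs"]

# Given "does drugs": share of users the test misses (the advertised 1 percent).
missed_given_user = missed / column_totals["do_drugs"]

# Given "tested negative": share of negative results that belong to users.
user_given_negative = missed / row_totals["test_negative"]

print("Row totals:", row_totals)
print("Column totals:", column_totals)
print(f"Users who test negative: {missed_given_user:.2%}")
print(f"Negatives who are users: 1 in {row_totals['test_negative'] // missed:,}")
```

The first figure conditions on the column, the second on the row. Standard references often call the column version the false negative rate and the row version the false omission rate; the article's point is that the two are not interchangeable.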

The consequences, though, are on the false positive side... this is where people get turned away for employment or get fired. In the case of the military, a lot of people end up in a lot of trouble with the random urinalysis program. While we want to be cautious, and we don’t want a lot of druggies flying or controlling aircraft or tanks or other deadly weapons, we should also be concerned that we might be ruining careers unnecessarily. If we look at the table, the “common sense” interpretation of the false positive rate would be 999/100,000, or 0.999 percent, very close to the 1 percent that we assumed initially. But, as astounding as it may seem, considering the number of people who are convicted each year because of this assumption, this is entirely incorrect!

The actual false positive rate consists of the number of people incorrectly identified as drug users, or the number of nondrug users out of the total number of positives. In this case, that’s 999 out of 1,098, or 90.98 percent! In other words, your chance of actually being a drug user, given a positive result on this “99-percent accurate” test, is only 9.02 percent!
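The same 9.02 percent falls out of Bayes’ theorem without building the table. As a sketch, write D for “does drugs” and + for “tests positive,” and take the scenario’s assumptions as given: P(D) = 0.001, P(+ | D) = 0.99, and P(+ | not D) = 0.01. That last value is the assumed rate of positives among non-users, not a documented property of any real test.

```latex
\begin{align*}
P(D \mid +)
  &= \frac{P(+ \mid D)\,P(D)}
          {P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \\
  &= \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999} \\
  &= \frac{0.00099}{0.01098} \approx 0.0902
\end{align*}
```

Multiplying the numerator and denominator by 100,000 gives 99/1,098, the same fraction read off the Test Positive row of the table.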

Yes, it’s tricky. No, it’s not intuitive. But it’s important. It touches lives. Juries, lab technicians, doctors and nurses, lawyers, employers, employees, and patients who don’t understand this put either themselves or others in peril every day.

About The Author

Rip Stauffer

Rip Stauffer uses his extensive experience in total quality and Six Sigma to educate and counsel at all career levels with specific experience in government, manufacturing, medical devices, financial services, and healthcare organizations. A senior consultant at MSI, and CEO of Woodside Quality LLC, Stauffer is an ASQ senior member and statistics division member, a certified quality engineer, a manager of quality and organizational excellence, and a Six Sigma Black Belt and Master Black Belt. He also is an adjunct faculty member at Walden University, teaching graduate and undergraduate business statistics courses and international business courses.

Comments

This is one of the best short articles I have ever read on this topic. Good examples and clear writing. I will use it in my classes. Thanks. RCL

PS  Note that probability is misspelled in the title.  It is printed as 'probablity.'

Thanks

Welll... we did say that probablity... probabababilty... probability was a problem