Davis Balestracci

Health Care

The Wisdom of David Kerridge—Part 2

Statistics in the real world aren't quite as tidy as those in a textbook.

Published: Thursday, July 9, 2009 - 03:00

Click here to read part 1 of this series.

Analytic statistical methods stand in strong contrast to what is normally taught in most statistics textbooks, which frame the problem as one of “accepting” or “rejecting” hypotheses. In the real world of quality improvement, we must look for repeatability over many different populations. Walter Shewhart added the new concept of statistical control, which defines repeatability over time when sampling from a process, rather than from a population.
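Shewhart's idea can be sketched in a few lines of code: sample a process over time and ask whether its variation is stable, rather than describing a fixed population. The following is a minimal individuals-chart (XmR) sketch; the weekly counts are made up for illustration.

```python
# A minimal sketch of statistical control: limits computed from the
# process's own moving ranges, applied to data collected over time.
# The data values below are hypothetical.

def xmr_limits(values):
    """Individuals-chart (XmR) center line and control limits."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(values) / len(values)
    # 2.66 is the standard XmR constant (3 / d2 for subgroups of size 2)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

def special_causes(values):
    """Indices of points outside the limits (possible special causes)."""
    lcl, _, ucl = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

weekly_counts = [12, 14, 11, 13, 12, 15, 13, 12, 30, 14]  # hypothetical
print(special_causes(weekly_counts))  # → [8]: the spike stands out
```

The point is that the limits describe the process's own behavior over time; a point outside them is a signal that the process has changed, not a statement about some enumerable population.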

For example, the effectiveness of a drug may depend on the age of the patient, on previous treatment, or on the stage of the disease. Ideally we want one treatment that works well in all foreseeable circumstances, but we may not be able to get it. Once we recognize that the aim of the study is to predict, we can see which range of possibilities is most important. We design studies not only to cover a wide range of circumstances, but also to make the “inference gap” as small as possible.

By the inference gap we mean the gap between the circumstances under which the observations were collected and the circumstances in which the treatment will be used. This gap has to be bridged by assumptions about the importance of the differences, based in this case on theoretical medical knowledge.

Suppose that we compare two antibiotics in the treatment of an infection. We conclude that one did better in our tests. How does that help us? Well-planned and designed experiments are rarely possible in emergencies, so the gap may be quite large.

Suppose, however, that we want to use an antibiotic in Africa, and all our testing on the antibiotic was done in one hospital in New York in 2003. It's quite possible that the best antibiotic in New York is not the best in a refugee camp in Zaire. The strains of bacteria may be different, and the problems of transport and storage certainly are. If the antibiotic is freshly made and stored in efficient refrigerators, it may be excellent. It may not work at all if transported to a camp with poor storage facilities.

And even if the same antibiotic works in both places, how long will it go on working? This will depend on how carefully it is used, and how quickly resistant strains of bacteria build up.

And then there are the sampling issues

Scenario 1: We often use random sampling in analytic studies, but it is not the same as that used in an enumerative study. For example, we may take a group of patients who attend a particular clinic and suffer from the same chronic condition; we then choose at random, or in some complicated way involving random numbers, who is to get which treatment. However, the resulting sample is not necessarily a random sample of the patients who will be treated in the future at that same clinic. Still less are they a random sample of the patients who will be treated in any other clinic.
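The distinction in Scenario 1 can be made concrete: we randomize which treatment each enrolled patient receives, but the enrolled patients are simply whoever attends this clinic, not a random sample of future patients anywhere. The patient IDs and seed below are hypothetical.

```python
import random

# Random *assignment* within an enrolled group, sketched. Note that
# nothing here makes `enrolled` a random sample of future patients --
# it is just today's clinic attendees. IDs are hypothetical.

def assign_treatments(patients, treatments=("A", "B"), seed=0):
    """Randomly split enrolled patients into two treatment arms."""
    rng = random.Random(seed)
    shuffled = patients[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {p: treatments[0] for p in shuffled[:half]} | \
           {p: treatments[1] for p in shuffled[half:]}

enrolled = [f"patient_{i}" for i in range(8)]  # today's attendees
arms = assign_treatments(enrolled)
print(sorted(arms.values()))  # balanced arms: four A's, four B's
```

The randomness here justifies comparing the two arms to each other; it says nothing about how well either arm represents the patients who will walk into this clinic, or any other clinic, next year.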

In fact, the patients who will be treated in the future will depend on choices that we and others have not yet made. Those choices will depend on the results of the study we are currently doing and on studies by other people that may be carried out in the future.

Scenario 2: Suppose that we want to know which of two antibiotics is better in treating typhoid. We cannot take a random sample of all the people who will be treated in the future; there is no readily available “bead box” of people waiting to be sampled, because we don't know who will get typhoid in the future. We have no choice but to use the mathematics of random sampling, but this is a different kind of problem: sampling from an imaginary population. The famous statistician R.A. Fisher used the words “a hypothetical infinite population.”

The practical difference, as Fisher saw it, is that we must not rely on what happens in any one experiment; we must repeat the experiment under as many different circumstances as we can. If the results under different circumstances are consistent, believe them. If they disagree, think again.
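Fisher's advice, as described above, can be sketched as a simple consistency check: compute the treatment effect separately under several different circumstances and ask whether the results point the same way. The strata and recovery scores below are entirely hypothetical.

```python
# Fisher's "repeat under many circumstances" sketched: estimate the
# treatment effect within each stratum and check for consistency of
# direction. All numbers below are made up for illustration.

def effect(treated, control):
    """Difference in mean outcome between arms within one stratum."""
    return sum(treated) / len(treated) - sum(control) / len(control)

# Hypothetical recovery scores by age stratum: (treated, control)
strata = {
    "under_40": ([7.1, 6.8, 7.4], [6.0, 5.9, 6.2]),
    "40_to_65": ([6.5, 6.9, 6.7], [6.1, 5.8, 6.0]),
    "over_65":  ([6.2, 6.0, 6.4], [5.7, 5.5, 5.9]),
}

effects = {name: effect(t, c) for name, (t, c) in strata.items()}
consistent = (all(e > 0 for e in effects.values())
              or all(e < 0 for e in effects.values()))
print(consistent)  # → True: the effect points the same way everywhere
```

If the strata disagreed, the honest conclusion would be Fisher's: think again, rather than average the disagreement away.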

So with an analytic study, there are two distinct sources of uncertainty:

  • Uncertainty due to sampling, just as in an enumerative study. This can be expressed numerically by standard statistical theory.
  • Uncertainty due to the fact that we are predicting what will happen at some time in the future and to some group that is different from our original sample. This uncertainty is unknown and unknowable. We rarely know how the results we produce will be used, and so all we can do is to warn the potential user of the range of uncertainties which will affect different actions.
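The first bullet is the only one a formula can reach. A rough sketch, using a normal-approximation interval for a mean on made-up data, shows what "expressed numerically by standard statistical theory" buys you, and what it doesn't.

```python
import math

# Sampling uncertainty can be quantified (here, an approximate 95%
# interval for a mean via the normal approximation). The second source
# of uncertainty -- whether future patients resemble these -- has no
# such formula. The outcome data are hypothetical.

def mean_ci(sample, z=1.96):
    """Approximate 95% confidence interval for the sample mean."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean - half_width, mean + half_width

outcomes = [5.2, 4.8, 5.5, 5.1, 4.9, 5.3, 5.0, 5.2]  # hypothetical
low, high = mean_ci(outcomes)
print(round(low, 2), round(high, 2))
# No term in this formula widens the interval for the gap between
# these patients and the different ones the prediction will serve.
```

The interval above is honest only about the sampling component; the second, unmeasured component is exactly what the interval silently omits.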


The latter uncertainty, especially in management circumstances, will usually be an order of magnitude greater than the uncertainty due to sampling.

People want tidy solutions and feel uncomfortable with the “unknown and unknowable.” Of course, we would rather be certain if we can, but it is very dangerous to pretend to be more certain than we are. The result, in most statistics courses, has been a theory in which the unmeasured uncertainty has just been ignored.

Aim and method—five examples

In looking at a potential improvement opportunity, “What is your aim?” is always the first question. Here are examples of different aims, calling for different methods.

Aim 1: Describe accurately the state of things at one point in time and place.

Method 1: Define precisely the population to be studied, and use very exact random sampling.

Aim 2: Discover problems and possibilities, to form a new theory. 

Method 2: Look for interesting groups, where new ideas will be obvious (using a common cause strategy to expose hidden, aggregated special causes). These may be focus groups rather than random samples. The accuracy and rigor required in the first case are wasted here. This assumes that the possibilities discovered will be tested by other means before making any prediction.

Aim 3: Predict the future, to test a general theory.

Method 3: Study extreme and atypical samples (special causes) with great rigor and accuracy.

Aim 4: Predict the future, to help management.

Method 4: Get samples as close as possible to the foreseeable range of circumstances in which the prediction will be used in practice.

Aim 5: Change the future, to make it more predictable.

Method 5: Use SPC to remove special causes, and experiment using the plan-do-study-act (PDSA) cycle to reduce common cause variation.


The first case is enumerative; all the rest are analytic. How many statistics textbooks make these obviously necessary distinctions?

As a dear statistician friend of mine said as we talked about the futility of teaching degrees of freedom (DOF): “I wish people were asking better questions about the problem they’re trying to understand or solve, the quality of the data they’re collecting and crunching, and what on earth they’re actually going to do with the results and their conclusions. In a well-meaning attempt not to turn away any statistical questions, my own painful attempts to explain DOF have only served to distract the people who are asking from what they really should be thinking about.… People think it’s important, but in the big scheme of things, there are far more important issues in data collection and interpretation.… I’d rather people understood that the quality of their data is far more important than the quantity of it.”

Can we please stop the legalized torture and waste that is passing for alleged statistical training?


About The Author


Davis Balestracci

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.