Tom Pyzdek  |  05/03/2009

Statistics: The Good, the Bad, and the Ugly

There are lots of ways to look at numbers.

The quality and process improvement professions tend to rely heavily on statistical information. The very science of quality control can be said to have begun with Walter A. Shewhart’s development of the control chart and discovery of the concepts of special cause and common cause variation. But few would argue with the statement that there is a downside, and a dark side, to statistics. I hereby present a few examples of good, bad, and ugly statistical usage.

First the “good.” Statistical texts describe the advantages of using statistical methods at great length; I’ve contributed a page or two myself. Statistics can force us to look at data and facts, rather than relying on opinions and letting strong personalities force their beliefs on others. Statistical thinking uses data to separate variation from special causes and variation from common causes, thereby aiding decision making and learning. Statistics help us test our beliefs and learn from experience. Statistics are the bridge between raw data and knowledge and understanding. They provide the means by which we can test our theoretical models of reality and learn from them. When statistical analysis fails to confirm our initial hypothesis, we are forced to reevaluate the hypothesis, which leads to improved understanding. Even if statistical analysis confirms our beliefs, we gain insight through the rigor it provides.

Statistical analysis is especially useful when we are faced with complex situations that challenge human understanding. Even two-dimensional problems can become too complex to understand completely without statistical tools. Response surface analysis tells us about optimum process settings and explains what will happen if the settings are not precisely controlled. Problems of higher dimension are often beyond us unless we break out our statistical tool kit to study the situation. For example, statistical analysis helps the call center manager understand how contact resolution, waiting time, agent professionalism, and the phone menu combine to create customers who return to buy more, or customers who abandon you and tell their friends to stay away.

Let’s now move on to the statistical “bad.” Statistics are numbers. They are an abstraction of reality and, because of this, they are not the reality itself. People often forget this and begin to think of customers and employees as revenue or cost sources, complaints, or something other than complete human beings. In this sense, anything that uses numbers represents a potential barrier to understanding the reality being studied.

Statistics are worse than raw numbers. They reduce numbers into an even smaller quantity of numbers. A mean is a single number that may represent thousands of individual measurements. Any such summary inevitably loses some of the information contained in the original measurements.
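To make the point concrete, here is a minimal Python sketch. The two sets of measurements are invented purely for illustration; they share the same mean, yet they describe very different processes, and the mean alone cannot tell them apart.

```python
from statistics import mean, pstdev

# Hypothetical measurements, invented for illustration only.
steady = [9, 10, 10, 10, 11]
erratic = [2, 6, 10, 14, 18]


def summarize(values):
    """Return the mean and the population standard deviation."""
    return mean(values), pstdev(values)


steady_mean, steady_sd = summarize(steady)
erratic_mean, erratic_sd = summarize(erratic)

# Both means are identical; only the spread reveals the difference.
print(f"steady:  mean={steady_mean}, sd={steady_sd:.2f}")
print(f"erratic: mean={erratic_mean}, sd={erratic_sd:.2f}")
```

Adding the standard deviation recovers some of what the mean discards, but it is still a summary; only the raw values, or a plot of them, show everything.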

Statistics trap us into analysis paralysis. No amount of statistical analysis can ever produce certainty. Some people have real problems dealing with this. Instead of accepting the uncertainty and acting anyway, they gather more data or apply more analysis to the data they already have.

Statistics are confusing. Even technically adept scientists and engineers sometimes have difficulty. For example, try explaining to a layperson what it means for a range chart to show statistical control. “The process variability is consistent” doesn’t immediately make sense to most people. More complicated methods require intensive study and a good deal of hands-on experience to understand.
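For readers who want to see what sits behind that sentence, the following is a rough Python sketch of a range chart calculation, not a full implementation. It assumes subgroups of five measurements and uses the standard control chart constants for that subgroup size (D3 = 0, D4 = 2.114). The measurements are invented, and only the simplest rule is checked: whether any subgroup range falls outside the control limits. A real chart would use many more subgroups, typically 20 or more.

```python
# Standard range chart constants for subgroups of size 5.
D3 = 0.0
D4 = 2.114

# Hypothetical subgroups of five measurements each (invented data).
subgroups = [
    [101, 98, 100, 102, 99],
    [100, 103, 97, 101, 100],
    [99, 100, 102, 98, 101],
    [102, 99, 100, 101, 97],
]

# The range of each subgroup is its largest value minus its smallest.
ranges = [max(group) - min(group) for group in subgroups]

# The center line is the average range; the limits scale it by D3 and D4.
r_bar = sum(ranges) / len(ranges)
lcl = D3 * r_bar
ucl = D4 * r_bar

# Simplest test only: every subgroup range lies within the limits.
in_control = all(lcl <= r <= ucl for r in ranges)

print(f"ranges={ranges}, R-bar={r_bar}, LCL={lcl}, UCL={ucl:.2f}")
print("variability consistent" if in_control else "special cause suspected")
```

In this made-up case every range falls inside the limits, which is all that “the process variability is consistent” means: the spread within each subgroup looks like the spread within every other subgroup.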

Statistics are often used without graphics. The first three rules of data analysis may be plot the data, plot the data, and plot the data, but take a look at nearly any scientific paper, and you’re likely to find the results of statistical analysis presented in tables and words rather than in charts. Presenting the results of statistical analysis in numbers rather than in pictures is bad practice.

Finally we come to the “ugly.” “There are three kinds of lies: lies, damned lies, and statistics.” The statement, attributed to Benjamin Disraeli, refers to the persuasive power of numbers, the use of statistics to bolster weak arguments, and the tendency of people to disparage statistics that do not support their positions. The deliberate misuse of statistics is an ugly fact of life. One can pick up a newspaper on any given day and find ugly statistical abuse. It goes beyond attempting to deceive others. People have a tendency to overlook or ignore statistics that contradict their own beliefs. Statistics are often used by data bullies to pummel their opposition.

Send me examples of good, bad, and ugly statistical usage. I’d really like to hear about them.

About The Author

Tom Pyzdek

Thomas Pyzdek’s career in business process improvement spans more than 50 years. He is the author of more than 50 copyrighted works, including The Six Sigma Handbook (McGraw-Hill, 2003). Through the Pyzdek Institute, he provides online certification and training in Six Sigma and Lean.