Donald J. Wheeler

Six Sigma

The Leptokurtophobia Pandemic

What are the symptoms?

Published: Monday, December 4, 2023 - 12:03

Fourteen years ago, I published “Do You Have Leptokurtophobia?” Based on the reaction to that column, the message was needed. In this column, I would like to explain the symptoms of leptokurtophobia and the cure for this pandemic affliction.

Leptokurtosis is a Greek word that literally means “thin mound.” It was used to describe those probability models that, near the mean, have a more pointed (or peaked) probability density function than that of a normal distribution. Due to the mathematics involved, a leptokurtic probability model has more area near the mean than a normal model, and tails that are more elongated than those of a normal distribution.

The first part of this characterization means that leptokurtic models will always have more than 90% within 1.65 standard deviations of the mean and less than 10% outside this interval. The second part means that while leptokurtic models will have less area in the outer tails than a normal model, a few parts per thousand (but never more than two dozen parts per thousand) may be found more than three standard deviations away from the mean.
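As a quick numerical check of these two statements, the sketch below compares a normal model with one particular leptokurtic model, the Laplace (double exponential) distribution, which has a kurtosis of 6. The choice of the Laplace model and the function names are assumptions made for illustration; a single example cannot establish the general claim, but it shows what the numbers look like.

```python
import math

def normal_outside(k):
    """Fraction of a normal model lying more than k standard deviations from the mean."""
    return math.erfc(k / math.sqrt(2))

def laplace_outside(k):
    """Fraction of a Laplace model lying more than k standard deviations from the mean.

    For a Laplace model with scale b the standard deviation is b*sqrt(2),
    and P(|X - mean| > t) = exp(-t / b), so t = k*b*sqrt(2) gives exp(-k*sqrt(2)).
    """
    return math.exp(-k * math.sqrt(2))

for k in (1.65, 3.0):
    print(f"k = {k}: normal {normal_outside(k):.4f}, Laplace {laplace_outside(k):.4f}")
```

For this model, slightly less than 10% falls outside 1.65 standard deviations, and roughly 14 parts per thousand fall beyond three standard deviations, compared with about 3 parts per thousand for the normal model.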

Leptokurtophobia is the fear of leptokurtosis. This fear can be traced back to the surge in SPC training during the 1980s. Before this surge, only two universities in the United States were teaching SPC, and only a handful of instructors had any experience with it. As a result of the surge, many of the SPC instructors of the 1980s were neophytes, and many things that were taught at that time can only be classified as superstitious nonsense. One of these erroneous ideas was that you must have “normally distributed data” before you can put your data on a process behavior chart (also known as a control chart). Over the years, this simple but incorrect idea has grown and mutated into a prohibition of doing any statistical analysis without first testing the data for normality or defining a reference probability model for the data.

Therefore, you may have leptokurtophobia if you have an irrational fear of using nonnormal data in your analysis. Symptoms include asking if your data are normally distributed; transforming your data to make them more “mound-shaped”; or fitting a probability model to your data as the first step in your analysis.

This phobia was originally held in check by the complexity of the remedies, such as performing a nonlinear transformation or computing a lack-of-fit statistic. However, due to the availability of software that will perform these complex operations, today we find leptokurtophobia to be truly pandemic, with outbreaks occurring around the world. People are fitting probability models and transforming data with a few keystrokes, and as a result they are unknowingly suffering undesirable side effects. Insidiously, while these side effects have few symptoms, they tend to completely undermine your analysis and predictions.

Let’s begin with the problem of fitting a probability model to your data. Figure 1 shows a histogram of the number of major hurricanes per year in the North Atlantic for 1940 through 2007. These 68 counts have an average of 2.59. Using this value as the mean value for a Poisson distribution, some lack-of-fit tests will fail to find any detectable lack of fit. Therefore, we might well conclude that a Poisson probability model with a mean of 2.59 is a reasonable model to use. From this we might then characterize the likelihood of various numbers of major hurricanes in a given year. Specifically, the probability of getting seven or more major hurricanes in a single year is found to be 0.017. Thus, in 68 years we should expect to find about one year with seven or more major hurricanes.
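The arithmetic behind this prediction can be sketched in a few lines. The function names below are hypothetical; the only inputs are the mean of 2.59 and the 68 years given above.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson model with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def poisson_tail(k_min, mean):
    """Probability of k_min or more events: 1 minus P(0) through P(k_min - 1)."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(k_min))

MEAN_PER_YEAR = 2.59   # average of the 68 annual counts
YEARS = 68             # 1940 through 2007

p_seven_or_more = poisson_tail(7, MEAN_PER_YEAR)
expected_years = YEARS * p_seven_or_more

print(f"P(7 or more in a year) = {p_seven_or_more:.3f}")
print(f"Expected such years in {YEARS} years = {expected_years:.2f}")
```

The tail probability rounds to the 0.017 quoted above, and multiplying it by 68 gives a little more than one year.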

Figure 1: North Atlantic major hurricanes

However, NOAA researchers think that these data represent two different weather patterns. They call the change between these patterns the “multidecadal tropical oscillation.” They break this time period of 1940 to 2007 into four segments. In the time period used here, the era of lower activity includes 1940 to 1947 and 1970 to 1994. The era of higher activity includes 1948 to 1969 and 1995 to 2007. The histograms for these two eras are seen in Figure 2.

Figure 2: North Atlantic major hurricanes

During the era of low activity, the average number of major hurricanes per year was 1.58. During the era of high activity, this average more than doubled to 3.54 per year. So, which years would you say are characterized by the average of 2.59 major hurricanes per year? Clearly, this average does not apply to the era of low activity; neither does it characterize the era of high activity. Although your model based on Figure 1 predicts one year with seven or more major hurricanes, the data show three years with seven or eight major hurricanes.
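To see how the pooled average hides the two eras, the sketch below rebuilds it from the era averages and the era lengths implied by the date ranges given earlier. It then assumes, purely for illustration, a separate Poisson model for each era; that per-era model and the function names are assumptions, not something the NOAA data or this column establishes.

```python
import math

def poisson_tail(k_min, mean):
    """P(X >= k_min) for a Poisson model with the given mean."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(k_min))
    return 1.0 - below

# Era lengths counted from the inclusive year ranges in the text.
low_years = (1947 - 1940 + 1) + (1994 - 1970 + 1)    # 1940-1947 and 1970-1994
high_years = (1969 - 1948 + 1) + (2007 - 1995 + 1)   # 1948-1969 and 1995-2007
low_mean, high_mean = 1.58, 3.54                     # era averages from the text

# The pooled average is just the year-weighted blend of the two era averages.
pooled_mean = (low_years * low_mean + high_years * high_mean) / (low_years + high_years)

# Expected number of years with seven or more major hurricanes.
expected_pooled = (low_years + high_years) * poisson_tail(7, pooled_mean)
expected_split = (low_years * poisson_tail(7, low_mean)
                  + high_years * poisson_tail(7, high_mean))

print(f"Era lengths: {low_years} low-activity years, {high_years} high-activity years")
print(f"Pooled mean: {pooled_mean:.2f}")
print(f"Expected years with 7+ (single pooled model): {expected_pooled:.2f}")
print(f"Expected years with 7+ (one model per era):   {expected_split:.2f}")
```

Under these assumptions the single pooled model expects roughly half as many extreme years as the two-era view does, which is the direction of the discrepancy seen in the data.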

Whenever you fit a model to your data, you are making a strong assumption that your data are homogeneous. If they are not homogeneous, all of your statistics, all of your models, and all of your predictions are going to be wrong.

Well, if fitting a probability model isn’t the answer, what about transforming the data?

When you transform the data, you are reshaping it to fit your preconceived notions. This is always a dangerous thing to do. Figure 3 shows the histogram of 141 hot metal transit times. These values are the times (to the nearest 5 minutes) between the call alerting the steel furnace that a load of hot metal is on the way and the actual arrival time of that load at the steel furnace ladle house. The average delivery time is 60 minutes. The standard deviation is 30 minutes. The skewness is 1.70, and the kurtosis is 6.0. (Anything above 3.0 is leptokurtic.)

Figure 3: Hot metal transit times

Some software packages, and more than a few data analysts, would suggest a logarithmic transformation for these data. Taking the natural logarithm of each of these transit times results in the histogram in Figure 4. There, the horizontal scales show both the original and the transformed values. The logarithmic transformation has spaced out the values on the left and has crowded the values on the right together so that the overall shape of the histogram is much more “mound shaped” than before. But is this an improvement?

Now the “distance” from 20 minutes to 25 minutes is about the same size as the “distance” from 140 minutes to 180 minutes. How are you going to explain to the steel furnace superintendent that a 5-minute delay on the left is equivalent to a 40-minute delay on the right? While the original histogram clearly showed two, and possibly three, humps, the transformed histogram blurs this important feature of the data.
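The compression is easy to verify. This short sketch compares the two intervals before and after taking natural logarithms; the variable names are illustrative only.

```python
import math

# Two delays on the original scale, in minutes.
left_start, left_end = 20, 25
right_start, right_end = 140, 180

# Widths on the original scale.
left_minutes = left_end - left_start
right_minutes = right_end - right_start

# Widths after the natural-log transformation.
left_log = math.log(left_end) - math.log(left_start)
right_log = math.log(right_end) - math.log(right_start)

print(f"Original scale: {left_minutes} min vs. {right_minutes} min")
print(f"Log scale:      {left_log:.3f} vs. {right_log:.3f}")
```

On the original scale the second interval is eight times as wide as the first. On the log scale the two widths differ by only a few hundredths.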

Figure 4: Logarithms of the hot metal transit times

By itself, this distortion of the data by nonlinear transformations should be sufficient to make you want to avoid transformations. But wait—there’s more.

One of the major reasons for analyzing data is to detect signals buried within those data. And when we go looking for signals, the premier technique will be the process behavior chart. Figure 5 shows the X chart for the original hot metal transit times. Eleven of the 141 transit times are above the upper natural process limit, confirming the impression given by the histogram that these data come from a mixture of at least two different processes. Even after the steel furnace gets the phone call, they still do not have any idea about when the hot metal will arrive in the ladle house. It could be as soon as 20 minutes, or it might be three hours.

Figure 5: X chart for the hot metal transit times

However, if we use a nonlinear transform on the data prior to placing them on a process behavior chart, we end up with the X chart shown in Figure 6. There we find no points outside the limits!

Figure 6: X chart for the logarithms of the hot metal transit times

Clearly, the logarithmic transformation has obliterated the signals. Of what use is a transformation that changes the message contained within the data? The transformation of the data to achieve statistical properties is simply a complex way of distorting both the data and the truth.

The results shown here are typical of what happens with nonlinear transformations of the original data. These transformations hide the signals contained within the data simply because they are based on computations that presume there are no signals within the data.

So, what should be the first question of data analysis? Should you try to accommodate the shape of the histogram by fitting a probability model? Should you seek to reshape the histogram by using some nonlinear transformation? Or should you check the data for evidence of a lack of homogeneity? Since a lack of homogeneity will undermine the fitting of a probability model, and since it will invalidate the rationale for the transformation of the data, it is imperative to begin by checking for possible nonhomogeneity.

So, how can we determine when a data set is homogeneous? That is what the process behavior chart was created to do! This is why it’s essential to begin any analysis by organizing your data in a logical manner and placing them on a process behavior chart. If you do not have the requisite homogeneity, anything else you might do will be flawed.
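For readers who have not computed one by hand, here is a minimal sketch of the limits for an X chart built from individual values and their moving ranges. The natural process limits use the customary scaling factor of 2.66 times the average moving range. The sixteen values are hypothetical, made up only to show the mechanics; they are not the hot metal transit times.

```python
def xmr_limits(values):
    """Return (lower limit, central line, upper limit) for an X chart of individual values."""
    central_line = sum(values) / len(values)
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    average_mr = sum(moving_ranges) / len(moving_ranges)
    half_width = 2.66 * average_mr
    return central_line - half_width, central_line, central_line + half_width

def points_outside(values, lower, upper):
    """Positions (0-based) and values of points beyond the natural process limits."""
    return [(i, x) for i, x in enumerate(values) if x < lower or x > upper]

# Hypothetical values in time order, for illustration only.
hypothetical_times = [50, 55, 45, 60, 50, 55, 45, 50, 60, 55, 50, 45, 150, 55, 50, 45]

lower, center, upper = xmr_limits(hypothetical_times)
signals = points_outside(hypothetical_times, lower, upper)

print(f"Limits: {lower:.1f} to {upper:.1f}, central line {center:.1f}")
print(f"Points outside the limits: {signals}")
```

No reference probability model and no normality test appears anywhere in the computation. A point outside the limits is taken as evidence that the data are not homogeneous.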

When you fit a probability model to your data, you are making a strong assumption that the data are homogeneous. If they are not homogeneous, then your model, your analysis, and your predictions will all be wrong. When you transform the data to achieve statistical properties, you deceive both yourself and everyone else who isn’t sophisticated enough to catch you in your deception. When you check your data for normality prior to placing them on a process behavior chart, you are practicing statistical voodoo.

Whenever teachers lack understanding, superstitious nonsense is inevitable. Until you learn to separate myth from fact, you will be fair game for those who were taught the nonsense. And you may unknowingly end up being a victim of the leptokurtophobia pandemic.


About The Author

Donald J. Wheeler

Dr. Wheeler is a fellow of both the American Statistical Association and the American Society for Quality who has taught more than 1,000 seminars in 17 countries on six continents. He welcomes your questions; you can contact him at djwheeler@spcpress.com.



Normal distribution, its absence, and cause for thought

Another wonderful paper by Don, thank you.  

Control charts adhering to normality is something I did not give much thought to, coming from the school that I do not want to transform my data. 

Historically I can see if a process changes based on the eight (or seven) locations that are either below or above the centerline, and that "I have a decision to make".  The pattern rule ("does it look nonrandom?") is also quite helpful; occasionally, I'm not fortunate enough to get eight points from a single process; instead, I may receive two points in every ten batches.  This low number of batches is still enough to skew the histogram and produce control limits that are not that effective, especially when lots of data is reviewed in one go. 

It comes down to looking closely at the data and, importantly, using the rational subgrouping strategies that Don has previously written about extensively.  This paper has gently reminded me that in life normality is not that constant and that if normal distributions are not present and my control limits do not look that effective, I should perhaps look at the data a bit more carefully, instead of blindly following software-generated control limits. Cheers Ian

Whenever teachers lack understanding......

"Whenever teachers lack understanding, superstitious nonsense is inevitable."

Truer words have never been said! Just yesterday, my daughter sent me her three-year-old son's progress report from preschool. She is saddened by the fact the label applied to her son is the same label that was pinned on her years ago! 

"He is quiet and we work with him to promote verbal interaction."

The ratings are Satisfactory, Progressing, or Needs Improvement and there are 52 categories kids are judged on. The big question is can we fix the child's quietness or is it "normal." Superstition believes we can fix it. Dr. Deming believed in abolishing grades and gold stars in school. He noted, "Judging people does not help them." As a professor, Dr. Deming graded himself. Where was he failing? How could he improve his teaching? 

Sorry if a little off-topic. 

Thank you for another great article!


Not a phobia

Excellent as always.  However, while the ignorance of Dr Shewhart's brilliance today is greater than ever, I feel it is not a "phobia".  The ignorance of the fact that Process Behavior Charts work on any data, without data torture, is promulgated by perpetrators of the Six Sigma Scam.  Poppycock is promoted for profit. 

For example, how could ASQ toss away $35,000 per gullible Six Sigma victim by making Quality as simple as it should be?  Dr Wheeler has proven that a simple XmR chart, with a single rule (points outside limits), is all you need for any sort of variable or count data.  It is too simple to be profitable for con men.


We jump to solutions before we attempt to understand. The first true statement is that, given no modification to the process, with all variables remaining "as is," a delivery is likely to take between 25 minutes and 2 hours. If that range is unacceptable, the next step is NOT to modify the data, it is to detail the process.

There are obviously process variables that are not being controlled adequately. Performing a quick outlier test (Q1 - 1.5*IQR, Q3 + 1.5*IQR) shows that any instance less than 2.5 minutes or greater than 102.5 minutes can be considered an outlier. There is a cluster of data around the 102.5-minute outlier, indicating that it is a separate, smaller cluster from the data on the left of the chart. Defining differences in process between the outlying cluster and the main cluster can begin to narrow the window further. It isn’t necessary that the data all be homogeneous, simply that it be collected consistently. Only once the PROCESS is homogeneous will the data fall in line.