Barbara A. Cleary

Statistics

Driverless Cars, Yes. Driverless Analysis... ?

Accurate predictions demand more than a chart

Published: Tuesday, August 15, 2017 - 11:03

If you get off the highway and take an alternate route when traffic slows to one lane, you are making a prediction. Likewise, if you decide to invite someone to dinner, that too is a prediction. The scientific method? Predictive in nature. Every time you make a decision, you are making a prediction of an outcome, and choosing one over another based on this prediction.

Prediction skills become second nature because of this daily application. These predictions may not be based on data or evidence, but involve some subjective guess about a preferred outcome. In the case of choosing a traffic route or a dinner date, it’s clear that not much data are involved. The decision involves subjective interpretations, intuitive hunches, and guesses about potential outcomes.

Will data analysis really enhance prediction accuracy? Not necessarily. There are no guarantees without some understanding of data, of variation, and of process performance.

Sometimes predictions fail, as they did colossally before the financial crisis of 2008, in spite of available data that might have suggested the economic downturn that ensued. Looking back, in fact, it appears that there was substantial data to hint at the possibility of this collapse, but it was apparently ignored in favor of more positive possibilities. Statistician and master of predictions Nate Silver suggests that something else is at work: “We focus on those signals that tell a story about the world as we would like it to be, not how it really is.”

While data analysis using control charts offers a path to understanding a process, it does not guarantee accuracy in predicting how that process will continue to behave. This statement may seem like heresy, but certain conditions do militate against an accurate understanding of process performance. Three mistakes commonly made without a complete understanding of data, variation, and process performance are misattribution of causes, insufficient data, and too little process knowledge.

1. Misattribution of causes: Assigning causes without getting the full picture

Noticing points out of control on a chart offers the opportunity to evaluate contributing causes to these points. Annotating charts to reflect this evaluation is a common practice. “New product released” may explain sales figures that are clearly outside control limits, and may stimulate a need in the long run to recalculate these limits, if the pattern persists. Points below control limits may be annotated as well: “Machine repair,” or “Power failure” can explain downtime that created an out-of-control situation.

But without deeper analysis of a shift, the ability to predict future stability of the process is diminished; the shift in data generated by the process may have resulted from a variety of intermittent causes. For example, a downward trend might be attributed to an economic downturn. Coincidentally, a quality issue may have started to affect new customers. When computing new limits, the reason for doing so is important and might be worth noting on the chart—so that a few months later, when the economy starts to boom but your sales do not, you can go back and examine your assumptions.
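As a concrete sketch of the limits such a chart relies on, here is a minimal individuals (XmR) calculation in Python. The 2.66 scaling constant is the standard XmR convention; the sample values below are invented for illustration, not taken from the article.

```python
def xmr_limits(values):
    """Centerline and natural process limits from the average moving range."""
    n = len(values)
    mean = sum(values) / n
    moving_ranges = [abs(values[i] - values[i - 1]) for i in range(1, n)]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of size 2
    spread = 2.66 * mr_bar
    return mean, mean - spread, mean + spread

def out_of_control(values):
    """Indexes of points falling outside the natural process limits."""
    mean, lcl, ucl = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]
```

A point flagged by `out_of_control` is exactly the kind of signal the article says deserves an annotation and a deeper look at contributing causes, not just a mechanical recalculation of limits.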

What else had been happening at that moment that may have affected the process? It may be easy to apply the rules that help to simply identify instability in the process—patterns of data points, for example. But ensuring stability in the future demands thoughtful attention to the forces that have contributed to changes.

2. Too little data: Jumping to conclusions without sufficient support

Although we may know that one can’t predict next week’s weight based only on this week’s, we tend to do this anyway. Panicking at the sight of several data points out of control invites tampering with a process without sufficient evidence for change. This kind of data-point mentality not only falsely bases predictions on only a few data points, but can also inspire actions that are unproductive and even harmful.

The same is true in process management; tampering with a process because a few data points seem to suggest an unfavorable outcome is commonplace. We’ve all adjusted the thermostat because a single observation suggested the room was colder than usual, rather than collecting data over time, identifying trends, and acting on those trends to keep temperatures comfortable throughout the day. A knee-jerk response to perceived cold may simply provoke another adjustment that leaves the room too warm. The common wisdom is that more than a handful of data points—some say at least 25—must be recorded before an analysis can be said to be accurate.
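That "wait for enough data" discipline can be made explicit in code. This is a minimal sketch assuming the 25-point rule of thumb the text mentions; the threshold is a convention, not a law, and the limit formula is the standard XmR calculation.

```python
MIN_POINTS = 25  # rule-of-thumb minimum before trusting computed limits

def limits_if_ready(values, min_points=MIN_POINTS):
    """Return (mean, lcl, ucl) only once enough points exist; else None."""
    if len(values) < min_points:
        return None  # keep collecting data instead of tampering
    mean = sum(values) / len(values)
    moving_ranges = [abs(values[i] - values[i - 1]) for i in range(1, len(values))]
    spread = 2.66 * sum(moving_ranges) / len(moving_ranges)
    return mean, mean - spread, mean + spread
```

Returning `None` instead of premature limits is the software equivalent of leaving the thermostat alone until a trend is actually visible.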

What has become known as Big Data can identify patterns of behavior—from how many steps per day you take to what clothing styles you prefer—because these predictions are based on mountains of data collected from fitness devices or web shopping sites. After all, buying a red car and a red dress and planting red carnations in the same week does not suggest that your future purchases will all be red. If, on the other hand, the entire city of San Diego has made the same kind of purchases, there may be something there.

3. Not enough knowledge: If you don’t understand your process, you won’t understand your data

However, just having a lot of data is not the Holy Grail. In one example, two competing machine-learning models were looking at the same pile of data. A small tweak in initial settings generated accurate predictions from one model and specious ones from the other. This points to another possible mistake: not having the right knowledge about the algorithms and the data, which in turn leads to a wrong decision or application. With control charts, this might equate to calculating limits from data that are not in chronological order. The resulting limits and chart will not reflect the reality of any process that produces results over time.
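The chronological-order point can be demonstrated in a few lines. XmR limits are built from moving ranges between consecutive points, so sorting the very same values shrinks the computed spread; the data below are invented for illustration.

```python
def mr_spread(values):
    """Three-sigma width from the average moving range (2.66 * MR-bar)."""
    moving_ranges = [abs(values[i] - values[i - 1]) for i in range(1, len(values))]
    return 2.66 * sum(moving_ranges) / len(moving_ranges)

in_time_order = [1, 5, 2, 6, 3, 7, 4, 8]  # process values as they occurred
shuffled = sorted(in_time_order)          # same values, chronology destroyed
# mr_spread(shuffled) is far smaller than mr_spread(in_time_order), so
# limits computed from the sorted data would be artificially tight and
# would flag points that are actually in control.
```

Same mean, same values, different limits: the chart only describes a real process when the points carry their time order with them.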

Knowledge of control charts depends on understanding and interpreting the meaning in their patterns. Without knowing that a pattern of most data points falling near the mean is a signal to investigate further, one might conclude that such a pattern is a positive one. If nearly all points hug the centerline, the pattern is far from healthy, as an understanding of common-cause variation will indicate. Basic statistical tools are essential to predicting process outcomes, and understanding variation lies at the heart of that process.
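The "hugging the centerline" signal can be checked mechanically, in the spirit of the Western Electric zone rules. The 15-point run length below is a common convention, assumed here rather than taken from the article.

```python
def hugging_centerline(values, mean, sigma, run=15):
    """True if `run` consecutive points fall within one sigma of the mean.

    Counterintuitively, this is a stratification signal worth
    investigating, not a sign of a healthy process.
    """
    streak = 0
    for v in values:
        streak = streak + 1 if abs(v - mean) < sigma else 0
        if streak >= run:
            return True
    return False
```

A naive reader sees such a run as "great, low variation"; a reader who understands common-cause variation sees it as evidence the chart or the sampling is wrong.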

While we may soon find ourselves in driverless cars, happily trusting the built-in technology of the machine to get us where we want to go, it’s important to remember that control charts are never driverless. To make accurate predictions about a process, it will take an understanding of the process, the data, and the statistical tools that support analysis and ensure accurate predictions.

About The Author

Barbara A. Cleary

Barbara A. Cleary, Ph.D., is a teacher at The Miami Valley School, an independent school in Dayton, Ohio, and has served on the board of education in Centerville, Ohio, for eight years—three years as president. She is corporate vice president of PQ Systems Inc., an international firm specializing in theory, process, and quality management. She holds a master’s degree and a doctorate in English from the University of Nebraska. Cleary is author and co-author of five books on inspiring classroom learning in elementary schools using quality tools and techniques (e.g., cause and effect, continuous improvement, fishbone diagram, histogram, Pareto chart, root cause analysis, variation), and on how to think through problems and use data effectively. She is a published poet and a writer of many articles in professional journals and magazines including CalLab, English Journal, Quality Progress, and Quality Digest.

Comments

Great points!

You make some great points...I always tell managers that the control chart only shows them what is likely to happen--if nothing changes. "Unless something unusual happens, your process measure will probably continue to run between [lower limit] and [upper limit], and closer to [centerline] than farther away."