Davis Balestracci

Quality Insider

Time to Do a Root Cause Analysis...

...On the obsession with root cause analyses

Published: Tuesday, November 18, 2014 - 14:55

In my recent travels speaking at conferences and consulting, I have noticed that root cause analysis (RCA) seems to have taken on a life of its own and is now a well-established subindustry in any organization, regardless of its chosen approach to improvement.

There are many things that “shouldn’t” happen. Why not consider such incidents as undesirable variation and get back to basics? One of Deming’s principles was that there are two kinds of variation—common cause and special cause—and that treating one as the other makes things worse.

And the human tendency is to treat virtually all variation as special cause—of which RCA is another example.

Has anyone considered whether things that “shouldn’t” happen might be common cause—as in, one’s organization is “perfectly designed” to have them occur? What might be the effect of multiple RCAs in such cases?

‘Because we worked so hard!’

I was at a conference where one of the poster sessions proudly declared that they had reduced infections in a pediatric unit. It used the following display:

We’ve all seen these. A bar whose scale is the left axis, with a superimposed line whose scale is the right axis (also sometimes presented as side-by-side bars). How come they concluded that they were successful? Because they did 149 individual RCAs and worked very hard to implement everything they had learned. “Obviously,” all this hard work had to be successful, so, as an afterthought, it was just a matter of choosing how to display the data.

I thought of the display this way: The bar was a “count,” and the line graph was the “area of opportunity” for the count. Shouldn’t the correct display be the resulting “rate,” obtained by dividing the former by the latter? I took my best shot at estimating the denominators, and came up with the following run chart:


Looks like common cause to me. Despite all this hard work, nothing had changed—except, perhaps, the addition of a lot of well-intended complexity. And they remain “perfectly designed” to have these infections keep occurring at the same rate.
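
For readers who want to try the count-over-opportunity calculation on their own data, it can be sketched in a few lines of Python. The numbers below are invented for illustration and are not the poster's data; the function name rates_per_1000 and the use of patient-days as the area of opportunity are likewise assumptions.

```python
from statistics import median


def rates_per_1000(counts, exposures):
    """Divide each count by its area of opportunity, scaled per 1,000 units."""
    if len(counts) != len(exposures):
        raise ValueError("counts and exposures must have the same length")
    return [1000 * c / e for c, e in zip(counts, exposures)]


# Made-up numbers for illustration only (assumption, not the poster's data).
infections = [6, 4, 7, 5, 3, 6, 5, 4]                           # the "bars"
patient_days = [1200, 1000, 1400, 1250, 750, 1500, 1000, 800]   # the "line"

rates = rates_per_1000(infections, patient_days)
center = median(rates)  # a run chart is drawn around the median

for period, rate in enumerate(rates, start=1):
    print(f"period {period}: {rate:.2f} per 1,000 patient-days")
print(f"median: {center:.2f}")
```

Plotting these rates in time order, with the median as a reference line, gives the run chart; the raw counts and the denominators never need to share an axis.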

What should ‘root cause’ really mean?

Quality Progress once published an astute letter by Meredith Brown with a very insightful consideration of RCA. I’ve extracted the key points and will let her do the talking:

“To be of value, root cause analysis should focus primarily on identifying solutions to system design flaws, thereby preventing accidents and failures. It shouldn’t focus on identifying causes, root or otherwise. All too often, however, the causes are based on biases and a lack of critical thinking. In addition, human performance and error factors are frequently overlooked or treated superficially in causal analysis.

“Questions such as, ‘How could he not have seen that was going to happen?’ or ‘How could they have been so irresponsible and unprofessional?’ are retrospective. The only reason they can be answered is because the outcome is already known. They are the result of hindsight bias, which includes a tendency to oversimplify the complexity and ignore the uncertainties of the circumstances people faced when the problem occurred.

“When failure occurs, reactions tend to focus on the people proximal to the accident—those closest in time and space to the mishap. This is also called focusing on the sharp end. By doing so, you miss underlying contributors to the event.

“More than likely, you’ll be able to describe in detail what could have been done to prevent the mishap. But this is a counterfactual representation of the past. Instead of focusing on what actually happened and seeking to understand why it made sense for people to do what they did, all you’re doing is proving that another course was open to them.

“All of these common approaches to causal analysis allow you to judge the actions of those involved, to point out what they should have done, and to condemn them for what they failed to do to prevent the mishap. It’s only a small leap to judge not only the actions, but also the very character of those involved. This is an example of something called the fundamental attribution error.

“A related tendency is the illusion of common sense—the sense that everyone (or at least everyone who is reasonable, intelligent, and informed) shares your understanding and perspective. When you approach event analysis with the belief that what is common sense to you is the right way to see the world, you may fail to understand why people’s actions made sense to them at the time. You will likely be judgmental in your approach and will not uncover important and actionable causes of the event.

“When it comes to causal analysis, organizations tend to be satisfied with a simplified, linear and proximal set of causes. However, studies of accidents in complex socio-technical systems show that they are begging to be looked at more broadly and deeply for causes.

“The human tendency is an expectation of quickly getting to the bottom of things, getting the forms filled out, fixing the problem identified, and getting back to work. Invariably, this leads to solutions aimed at the people involved on the front line—retrain them, supervise them better, or just fire them—rather than at the conditions that led to the event.”

‘Could this have been a common cause?’

Some food for thought: What if you plotted a run chart of the number of occurrences of all events (weekly, monthly, or quarterly) that resulted in a root cause analysis? If that plot is stable (and I’m willing to bet that most of them will be), what if you aggregated all of your root cause analysis results from these and did a root cause analysis of your root cause analyses (common cause strategy)?
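
One rough way to check whether such a plot is stable is to look for a long run of consecutive points on one side of the median. The sketch below assumes a rule of thumb that eight or more points in a row signal a special cause; the exact threshold varies between references, and the function names and monthly counts are hypothetical.

```python
from statistics import median


def longest_run_about_median(values):
    """Length of the longest run of consecutive points on one side of the median.

    Points exactly on the median are skipped: they neither extend nor break a run.
    """
    center = median(values)
    longest = current = 0
    previous_side = 0
    for v in values:
        if v == center:
            continue
        side = 1 if v > center else -1
        current = current + 1 if side == previous_side else 1
        previous_side = side
        longest = max(longest, current)
    return longest


def looks_stable(values, run_limit=8):
    """True if no run reaches run_limit (the limit of 8 is an assumed rule of thumb)."""
    return longest_run_about_median(values) < run_limit


# Invented monthly counts of events that triggered an RCA (illustration only).
monthly_rca_events = [3, 5, 2, 4, 6, 3, 4, 2, 5, 4, 3, 6]
# Invented series with an obvious shift halfway through, for contrast.
shifted = [2, 3, 2, 3, 2, 3, 2, 3, 6, 7, 6, 7, 6, 7, 6, 7]

print(looks_stable(monthly_rca_events))
print(looks_stable(shifted))
```

This is only one of several run-chart rules, so passing it is a hint of common cause rather than proof.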

Would “lack of empowerment,” “communication disconnect,” and/or “lack of information” possibly pop out? When someone did something “stupid,” ask the probing question, “Why did it make sense at the time for this person to make such a ‘stupid’ decision?” Was it a matter of competence, or culture? And if it could be competence, was it then a matter of training or even hiring?
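
Aggregating the findings could be as simple as tallying how often each contributing factor appears across all of the RCA reports. This is a minimal sketch with invented reports and a hypothetical helper, tally_findings; the factor labels are the examples mentioned above plus one made-up entry.

```python
from collections import Counter


def tally_findings(rca_reports):
    """Count how many reports mention each factor, most frequent first."""
    counts = Counter()
    for findings in rca_reports:
        # sorted(set(...)) counts a factor once per report, in a fixed order
        counts.update(sorted(set(findings)))
    return counts.most_common()


# Invented reports for illustration only; each inner list is one RCA's findings.
rca_reports = [
    ["communication disconnect", "lack of information"],
    ["lack of empowerment", "communication disconnect"],
    ["communication disconnect"],
    ["lack of information", "communication disconnect"],
    ["equipment fault"],
]

for factor, n in tally_findings(rca_reports):
    print(f"{n} of {len(rca_reports)} reports: {factor}")
```

A factor that shows up in most reports points at the system rather than at any single event.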

These are deeper system issues that could mean you have other potential, seemingly unrelated “shouldn’t happen” events lurking in other parts of your organization: tick... tick... tick....

And, once again, like a lot of what I teach, this goes against most conventional wisdom. I understand that the pressure is intense for results and accountability when these horrific events occur. But there could be some wisdom in considering the “Don’t just do something, stand there!” strategy to ask, “Was this truly a special cause or could it have been a common cause? Were we actually perfectly designed to have this occur?”


About The Author


Davis Balestracci

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.


Root Cause or System Design Flaws?

First, I think this is a matter of semantics, but since we use words to communicate, semantics matters. In my world (and in the intent of root cause analysis), a 'root cause' IS the system design flaw in a physics-based problem or a people-process-based problem. (I actually prefer the term causal mechanism or causal system, as it is rare that a single simple factor causes a Problem.) It is difficult to redesign a system, process, or product to prevent a Problem if we don't know exactly what is causing the Problem.

Secondly, I do agree that too often the teaching and use of RCA relies on opinion, fishbone diagrams, and multi-colored voting dots to arrive at the 'cause'. Effective root cause analysis relies on data, solid investigative practices, and objective evidence that a factor or set of factors actually causes the Problem. The use of opinion is fake RCA, or maybe we should call it Zombie RCA.

Effective Solution Development is a whole other topic...

Bias for Action

“The human tendency is an expectation of quickly getting to the bottom of things, getting the forms filled out, fixing the problem identified, and getting back to work.” Amen.

I am reminded that those who get ahead have a "bias for action". I once had a boss who was Director of Engineering for a large corporation. We shared many interests, but not the same method for getting results. He was fast, I was slow. We finally seemed to hit it off when I told him he was an empiricist and I was a rationalist. That is, in the course of a year, he and his staff would try 100 things. I would study 100 things and try the 10 I thought best. So the question between us was who would produce a better list of successful outcomes. I don't think we ever drew a conclusion, but our relationship improved considerably.

I had another boss whose typical action in response to a customer problem was to immediately convene a relevant group, go around the table listing probable causes, and then get the group to rank order the most likely causes.  Finally he would solicit courses of action to "solve" the most likely problem and again get the group to rank order the "best" actions.  At this point, usually less than an hour into the meeting, he would return to his office and call the customer to say we were on our way to a solution.  The customers seemed to love this speed, even if the results were off the mark for lack of serious thought and investigation.


Bill Pound, PhD

Sitting at a table listing causes

Your comment regarding your boss describes what I think is the biggest problem in root cause analysis. Empirical data (and looking at the failure) seems to be ignored in most RCA literature and discussions on the topic.

Would you mind if I quote your comment in conference talks and potentially in an article? Normally, I would just quote and use proper references, but I would rather not quote or paraphrase a comment without permission.

Matt Barsalou