Bryan Christiansen


Choosing the Right Technique for Failure Analysis

If all you have is a hammer...

Published: Wednesday, October 20, 2021 - 12:03

Learning from past failures is the best way to understand and prevent future equipment breakdowns. In practice, that learning process falls under the umbrella of failure analysis.

These days, there are plenty of failure analysis techniques to choose from, each with its own advantages, challenges, and use cases. Let’s look at what is available, what steps you need to take, and which techniques suit your situation.

What is failure analysis?

Failure analysis is the process of collecting and analyzing failure data, usually to identify the root cause of an asset malfunction or breakdown. This information can be used to improve machine and component design, adjust maintenance schedules, and improve maintenance processes. Ultimately, its goal is to improve asset reliability.

The failure analysis process is generally done after a failure has already occurred. It is an integral part of the root cause analysis (RCA) process. However, it can also be used to determine various factors that could cause a potential failure, so we can select and apply the right prevention methods.

Depending on its purpose, failure analysis can be performed by plant and maintenance engineers, reliability engineers, or failure analysis engineers.

Maintenance engineers conduct primary failure analysis based on their knowledge of plant operations. If the internal team lacks the required expertise, it is advisable to hire consultants who provide failure analysis services.

Last but not least, reliability engineers employ different failure analysis techniques to improve fault tolerance and ensure the robustness of their system.

Common use cases for failure analysis

The most common reasons to conduct failure analysis are discussed below.

Identifying the root failure causes

In many cases, machine failures are surface-level manifestations of deeper problems that were not addressed in time. Sometimes, a combination of different factors leads to an unexpected breakdown.

Because breakdowns are so expensive and disruptive, maintenance teams need to put a lot of effort into preventing them. Aside from routine maintenance, identifying and eliminating root failure causes is the best way to keep breakdowns at bay.

Preventing potential failures

A machine or system has many interconnected and interdependent components. These components do not have the same probability of causing a systemwide failure. Information and data on the system can be used to analyze the probabilities of potential failures.

Tests and simulations can be run to find the weakest links and improve them, be it through design tweaks or by changing operating and maintenance recommendations.
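
To make the weakest-link idea concrete, here is a minimal Python sketch. It assumes a series system (every component must work for the system to work) with independent failures. The component names and reliability values are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical per-component reliabilities (probability of surviving one operating period).
component_reliability = {
    "motor": 0.99,
    "gearbox": 0.95,
    "bearing": 0.90,
    "controller": 0.98,
}

def series_reliability(reliabilities):
    """Reliability of a series system: every component must survive."""
    return prod(reliabilities.values())

def weakest_link(reliabilities):
    """Name of the component with the lowest reliability."""
    return min(reliabilities, key=reliabilities.get)

baseline = series_reliability(component_reliability)
weakest = weakest_link(component_reliability)

# What-if: a design tweak raises the weakest component's reliability to 0.97.
improved_components = dict(component_reliability, **{weakest: 0.97})
improved = series_reliability(improved_components)

print(f"baseline={baseline:.3f}, weakest={weakest}, improved={improved:.3f}")
```

Real systems usually mix series and redundant (parallel) paths, so a sketch like this is only a starting point for deciding where design or maintenance changes pay off most.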

Improving product design

As mentioned, failure analysis can be done to improve equipment or component design. Engineers can employ different failure analysis techniques to identify potential issues in their designs.

On a more practical side, they can also conduct destructive testing to evaluate the characteristics of components and materials they plan to use in their final product.

The insights gained from these tests and analyses are used to improve product design and quality.

Ensuring compliance

Regulations and standards imposed by governments or industry bodies often require failure analysis. Failure analysis methods are used to ensure the product adheres to the required standards.

Liability assessment

Legal proceedings related to failures require that the cause of a failure be analyzed. The same is done as part of specific insurance claim settlements to ensure that the conditions in the contract are met. In such cases, failure analysis might be a legal requirement.

Naturally, the result of failure analysis can also be used as protection from litigation.

Steps for conducting failure analysis

Failure analysis techniques vary widely based on the specific use cases. That being said, steps for conducting failure analysis follow the same pattern.

Step 1: Define the problem

A well-defined problem statement is essential for any deep analysis. Failure analysis requires that engineers define the problem as clearly and concisely as possible. The problem statement should contain details about:
• The failure that occurred
• The data that need to be collected
• The failure analysis technique to be used
• The expectations for the failure analysis (i.e., goals)

Step 2: Collect failure data

All relevant data have to be collected. This includes both quantitative data and qualitative data.

Quantitative data refer to operational data, such as maintenance data and the age of the machine. They can be obtained:
• From maintenance records
• From a CMMS database or any other tool used to monitor asset health and performance through troubleshooting
• By performing a visual inspection (as a part of failure investigation)

Qualitative data cannot be easily quantified. Such data are obtained by interviewing stakeholders such as machine operators, maintenance technicians, and operations managers.

Step 3: Create a failure timeline

Root causes set off a chain reaction that ends in the surface-level failures we observe. The collected failure data can shed light on the sequence of events that took place. With enough information, the team performing the analysis can create a failure timeline. This serves as a visual and mental aid to the analysis process.

Ideally, the timeline will clarify the cause-and-effect relationships between the events.

Step 4: Select useful data and discard the rest

The timeline created in the previous step is also used to identify useful data. Quantitative and qualitative data collected in step two are mapped to the events in the timeline. The data that find a place in the timeline are useful for the final analysis.

The rest of the data can be discarded as not relevant to the events that caused the failure. This way, failure analysis teams won’t waste time and effort analyzing irrelevant information.
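
Steps 3 and 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: the events, the records, and the 30-minute matching window are all hypothetical, and in practice deciding which data belong to which event takes engineering judgment rather than a fixed time rule.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# Hypothetical timeline events reconstructed from interviews and logs.
events = [
    ("2021-09-01 14:20", "Pump seizes"),
    ("2021-09-01 13:05", "Operator reports unusual noise"),
    ("2021-09-01 09:30", "Lubrication task skipped"),
]

# Hypothetical raw records collected in step 2.
records = [
    ("2021-09-01 09:35", "Work order closed without lubrication"),
    ("2021-09-01 13:00", "Vibration reading above threshold"),
    ("2021-08-15 10:00", "Office air conditioner serviced"),
]

def build_timeline(raw_events):
    """Parse timestamps and sort events chronologically."""
    return sorted((datetime.strptime(ts, FMT), label) for ts, label in raw_events)

def map_records(timeline, raw_records, window=timedelta(minutes=30)):
    """Attach each record to the nearest event within the window.

    Returns (mapped, discarded): records per event, and records that fit no event.
    """
    mapped = {label: [] for _, label in timeline}
    discarded = []
    for ts, note in raw_records:
        when = datetime.strptime(ts, FMT)
        nearest_time, nearest_label = min(timeline, key=lambda e: abs(e[0] - when))
        if abs(nearest_time - when) <= window:
            mapped[nearest_label].append(note)
        else:
            discarded.append(note)
    return mapped, discarded

timeline = build_timeline(events)
mapped, discarded = map_records(timeline, records)

for when, label in timeline:
    print(when.strftime(FMT), label, mapped[label])
print("Discarded:", discarded)
```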

Step 5: Administer the chosen failure analysis technique

The next step is to apply the chosen failure analysis technique (we will discuss these in the next section). The method selected depends on the specific use case, the industry, and the experience of the failure analysis engineers conducting the analysis.

Step 6: Review results, test, and apply a solution

The results of the failure analysis are studied in detail. In most instances, the purpose of failure analysis is to implement remedies that can prevent future failures. The proposed solutions are tested, and the best one is used to improve the system or machine.

Common failure analysis techniques

Failure analysis is not an exact science. It is an open-minded exploration of the true cause behind failures, and it can be considered a craft.

Still, failure analysis cannot be done without any structure. Over the years, engineers developed quite a few techniques that can be used as a framework to analyze all kinds of failures.

The most popular failure analysis techniques are discussed below.

5 Whys

5 Whys is a simple methodology used to identify cause-and-effect relationships between events. It’s based on asking why the initial problem happened. The first answer then forms the basis for the next “why” question. We keep asking this until we get to something fundamental or completely outside of our control.
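
The chaining of questions and answers can be shown with a short Python sketch. The conveyor scenario and every answer in it are hypothetical; the point is only that each answer becomes the subject of the next "why."

```python
# Hypothetical 5 Whys chain for a conveyor stoppage.
problem = "The conveyor stopped"
answers = [
    "The motor overheated",
    "The cooling fan was clogged with dust",
    "The fan filter was never cleaned",
    "Filter cleaning is not on the maintenance checklist",
    "The checklist was not updated when the fan was installed",
]

def five_whys(problem, answers):
    """Pair each 'why' question with its answer; return the chain and the final answer."""
    chain = []
    subject = problem
    for answer in answers:
        chain.append((f"Why: {subject}?", answer))
        subject = answer  # the answer becomes the subject of the next question
    return chain, subject

chain, root_cause = five_whys(problem, answers)
for question, answer in chain:
    print(question, "->", answer)
print("Candidate root cause:", root_cause)
```

The number five is a guideline, not a rule. The chain stops whenever the answer is something fundamental or outside the team's control.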

[Figure: 5 Whys]

Recommended reading: “5 Whys: The Ultimate Root Cause Analysis Tool.”

Fishbone or Ishikawa diagram

The fishbone diagram (also known as the Ishikawa diagram) is a failure analysis technique visualized in the shape of a fish skeleton. The head represents the problem we are analyzing, while the bones represent potential causes.

[Figure: Fishbone diagram]

The whole diagram is predicated on the idea that multiple factors can lead to the failure/event/effect we are investigating. It is widely used for process improvement in the medical field, aerospace industry, and IT.
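
As a rough illustration, the structure of a fishbone diagram maps naturally onto a nested dictionary: one problem (the head) and several cause categories (the bones). The problem, category names, and causes below are hypothetical; teams choose whatever categories fit their process.

```python
# Hypothetical fishbone for a single problem; category names are illustrative.
fishbone = {
    "problem": "Hydraulic press leaks oil",
    "causes": {
        "Machine": ["Worn seal", "Cracked hose"],
        "Method": ["No torque spec for fittings"],
        "Material": ["Wrong oil viscosity"],
        "People": ["Operator not trained on inspections"],
    },
}

def render(diagram):
    """Return the diagram as an indented text outline: head first, then each bone."""
    lines = [f"Problem: {diagram['problem']}"]
    for category, causes in diagram["causes"].items():
        lines.append(f"  {category}")
        lines.extend(f"    - {cause}" for cause in causes)
    return "\n".join(lines)

outline = render(fishbone)
print(outline)
```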

Recommended reading: “How to Use the Fishbone Tool for Root Cause Analysis.”

Failure mode and effects analysis (FMEA)

FMEA is a preemptive failure analysis technique. It is used to predict potential failures with the help of past data and future projections. It examines the potential ways in which a machine can fail and the consequences of each identified failure.

In FMEA, each part of a system is scrutinized by an expert team. The technique serves as a framework for rigorous brainstorming sessions.

The technique is extensively used in reliability engineering, safety engineering, and quality control.
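
One common way to turn the output of those sessions into priorities is to score each failure mode and compute a risk priority number (RPN). The article does not prescribe a scoring scheme, so the 1-10 scales, the RPN formula (severity × occurrence × detection), and the worksheet rows below are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    component: str
    mode: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost certainly detected) to 10 (almost never detected)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection

# Hypothetical worksheet rows.
modes = [
    FailureMode("Belt", "Slippage", 4, 6, 2),
    FailureMode("Pump", "Seal leak", 6, 5, 4),
    FailureMode("Motor", "Winding burnout", 9, 2, 3),
]

# Highest RPN first: these failure modes get attention first.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.component:6} {m.mode:16} RPN={m.rpn}")
```

A ranking like this is only a conversation starter. A failure mode with a low RPN but a very high severity score may still deserve action on its own.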

Recommended reading: “FMEA & FMECA: How to Perform Failure Mode and Effects Analysis.”

Fault tree analysis (FTA)

Fault tree analysis makes use of Boolean logic relationships to identify the root cause of the failure. It tries to model how failure propagates through a system. This helps reliability engineers create well-defined systems with proper redundancies where component failures do not always cascade into systemwide failures.
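
The Boolean logic can be sketched with two small gate functions. This example assumes that basic events are statistically independent, and the cooling system, event names, and probabilities are hypothetical. It shows how adding a redundant pump lowers the probability of the top event.

```python
from math import prod

def p_and(*probs):
    """AND gate: the output event occurs only if all independent inputs occur."""
    return prod(probs)

def p_or(*probs):
    """OR gate: the output event occurs if at least one independent input occurs."""
    return 1 - prod(1 - p for p in probs)

# Hypothetical basic-event probabilities.
p_primary_pump = 0.05
p_backup_pump = 0.10
p_power_loss = 0.01

# Without redundancy: no coolant flow = power loss OR primary pump failure.
p_top_single = p_or(p_power_loss, p_primary_pump)

# With redundancy: no coolant flow = power loss OR (primary AND backup pump fail).
p_both_pumps = p_and(p_primary_pump, p_backup_pump)
p_top_redundant = p_or(p_power_loss, p_both_pumps)

print(f"single pump: {p_top_single:.4f}, with backup: {p_top_redundant:.4f}")
```

With the backup pump in place, power loss becomes the dominant contributor to the top event, which tells the team where the next improvement should go.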

[Figure: Fault tree analysis]

FTA is widely used in the aeronautical industry, power generation, and defense.

Recommended reading: “What Is Fault Tree Analysis and How to Perform It.”

Pareto charts

As a rule of thumb, in many systems about 80 percent of the results (or failures) stem from about 20 percent of the potential causes.

This is known as the Pareto principle (or the 80-20 rule). The skew between cause and effect is evident in many different distributions, from wealth distribution among people and countries to failure causes in a machine.

[Figure: Pareto chart]

Pareto charts are quantitative tools for identifying the root causes responsible for the greatest number of failures. They are widely used in scenarios where multiple root causes must be addressed but resources are scarce.
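
The numbers behind a Pareto chart are simple to compute: count failures per cause, sort in descending order, and track the cumulative share. The sketch below uses a hypothetical failure log and an assumed 80 percent cutoff to pick out the "vital few" causes.

```python
from collections import Counter

# Hypothetical failure log: one entry per recorded failure, labeled by root cause.
failure_log = (
    ["Lubrication"] * 42 + ["Misalignment"] * 28 + ["Operator error"] * 12
    + ["Electrical"] * 9 + ["Contamination"] * 6 + ["Other"] * 3
)

def pareto_table(log):
    """Return (cause, count, cumulative percent) rows, most frequent cause first."""
    counts = Counter(log).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for cause, count in counts:
        running += count
        rows.append((cause, count, 100 * running / total))
    return rows

def vital_few(rows, threshold=80.0):
    """Smallest leading set of causes whose cumulative share reaches the threshold."""
    selected = []
    for cause, _, cumulative in rows:
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected

rows = pareto_table(failure_log)
for cause, count, cumulative in rows:
    print(f"{cause:15} {count:3}  {cumulative:5.1f}%")
print("Address first:", vital_few(rows))
```

In this made-up log, three of six causes account for just over 80 percent of failures. Real data rarely split exactly 80-20, which is why the chart is worth drawing rather than assuming.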

Recommended reading: “How to Conduct Root Cause Analysis Using Pareto Charts.”

Barrier analysis

Barrier analysis is a root cause analysis methodology that determines the barriers to the safety of the target. Here, the target is defined as the component, machine, or system that is to be protected from failure.

The various pathways that could cause machine failure are identified. Elements in these pathways that act as barriers to safe operation are determined. They are altered to eliminate the problems in the system.

[Figure: Barrier analysis]

Barrier analysis identifies the impediments to successful operations. The barriers are circumvented or eliminated as a result. It is a root cause analysis technique widely used in the IT industry.

Recommended reading: “Sample of a Barrier Analysis for Root Cause Investigations.”

A quick comparison of failure analysis techniques

Below is a quick table that compares failure analysis techniques based on the time needed to train your internal team to use them, how long it takes to conduct each, as well as the main advantages and limitations of the respective failure analysis methods.

[Table image: comparison of failure analysis techniques]

Key takeaways

Failure analysis is a versatile tool that has many purposes. It can be used to investigate past failures, understand failure mechanisms, and predict the modes of future failures.

There is no “one size fits all” approach to conducting failure analysis. The choice of technique will depend on the goal of the analysis, available resources, access to relevant data, and what the failure analysis team knows and prefers to use.

First published Sept. 2, 2021, on the Limble CMMS blog.


About The Author

Bryan Christiansen

Bryan Christiansen is the founder and CEO of Limble CMMS. Limble is a modern, easy-to-use mobile CMMS software that takes the stress and chaos out of maintenance by helping managers organize, automate, and streamline their maintenance operations.


Failure Analysis: misuse of 5-Why Analysis

The 5-Why Analysis is to be applied against the direct or highest-ranked cause or, as Toyota says, the PoC. See the seven steps of Toyota Practical Problem Solving. The way you have done the 5-Why before the Ishikawa Type A C&E Diagram is incorrect. Toyota, Industrial Engineering, Work Simplification, and AIAG FMEA do not do as you say. 5-Why is in support of the Ishikawa Diagrams, and Dr. Ishikawa has three types of C&E Diagrams.