James J. Kline


Big Data: A Double-Edged Sword

Quality professional organizations need to adjust their body of knowledge to include an understanding of big data

Published: Tuesday, May 31, 2022 - 11:03

Big data is a relatively new phenomenon, and its use is increasing in many organizations. But, as with many new processes, it cuts both ways: It benefits organizations and their customers, and it has potential downsides. This piece looks at both with respect to the quality profession.

Big data benefits

Big data is the accumulation and analysis of huge amounts of information (data). This information is generally divided into two categories: structured and unstructured. Structured data includes electronically stored records, notebooks, spreadsheets, and similar sources. Unstructured data includes videos, pictures, tweets, and word processing documents.1

The digitization of data and the internet combined with algorithms has created the opportunity for companies and government to be more efficient by identifying and assessing relevant information in a timely manner. Such information allows the organizations to better serve their customers and identify areas that can be improved.

The general term used to describe the integration of big data, the internet, and computer programming with physical objects in the private sector is Industry 4.0. In local government it is known as smart cities. This integration can facilitate product quality improvement, identify defects as they move down the production line, assess supply chain problems, and quickly determine solutions that will prevent significant disruptions or quality issues. Three examples highlight this point.2

The first comes from BP’s Cherry Point Refinery in Blaine, Washington. To better understand how the refining process could be improved, plant managers installed wireless sensors throughout the plant. The sensors provided continuous real-time data on the production process. As a result, it was determined that some types of crude oil are more corrosive than others. This information enabled them to develop better maintenance methods to counteract the problem.

A second is Amazon’s book recommendations. Initially, the approach was to offer tiny variations on the last purchase. However, as more data became available and the algorithms became more sophisticated, the recommendations became based on what other book purchases were correlated with the purchase of that specific book. This provided the customer with a wider variety of choices, personalized the buying experience, and increased company sales.

The last is Walmart. It was one of the earliest companies to use big data to track inventory, create an automatic replenishment system, and link it to just-in-time product arrival. This supply chain management allowed Walmart to reduce prices and increase revenue while maintaining product availability and customer satisfaction.

These examples show how big data can be used to improve production processes, bolster customer service, and manage supply chains, all of which are beneficial to the organization and to the quality professionals charged with maintaining product quality and the quality management system.

The challenge is that as big data becomes integrated into business practices, the gap widens between the skills needed to analyze and maintain big data and the skills many quality professionals have. Having access to so much data makes small-group and precision sampling methodologies less relevant. The volume of information reduces the need for precise sampling because the information is available immediately. There is no need to wait for a sample to identify problems; adjustments can be made immediately.
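One way to see why large data volumes reduce the dependence on careful sampling is to compare the margin of error around an observed defect rate at two data volumes. The sketch below uses the standard normal approximation for a proportion; the defect rate and the two counts are hypothetical numbers chosen only for illustration.

```python
import math


def proportion_margin(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed proportion.

    Uses the normal approximation: z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical values, not taken from the article.
defect_rate = 0.02
sample_margin = proportion_margin(defect_rate, 200)      # a traditional sample
stream_margin = proportion_margin(defect_rate, 200_000)  # a full sensor stream

print(f"n = 200:     +/- {sample_margin:.4f}")
print(f"n = 200,000: +/- {stream_margin:.4f}")
```

Because the margin shrinks with the square root of the number of observations, a thousandfold increase in data narrows it by a factor of roughly 32. This says nothing about bias in how the data were collected, which volume alone does not fix.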

Big data challenge

As noted above, the ability to take advantage of big data has only come about recently with the development of computers, increased computing power, the internet, and huge server farms. However, the methods used to analyze structured data are well developed. In the Amazon book recommendations, for instance, correlations were used in the algorithm. Correlation determination is a standard statistical technique.
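As a rough illustration of how correlation can drive a recommendation, the sketch below computes the Pearson correlation between purchase columns and suggests the titles most positively correlated with a given book. The purchase data, the book names, and the `recommend` helper are all hypothetical; this is not a description of Amazon's actual algorithm.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one entry per customer, 1 = bought the title.
purchases = {
    "Book A": [1, 1, 1, 0, 0, 1],
    "Book B": [1, 1, 1, 0, 0, 0],
    "Book C": [0, 0, 1, 1, 1, 0],
    "Book D": [1, 1, 1, 1, 0, 1],
}


def recommend(title, purchases, top_n=2):
    """Return up to top_n other titles positively correlated with `title`."""
    scores = [
        (pearson(purchases[title], column), other)
        for other, column in purchases.items()
        if other != title
    ]
    scores.sort(reverse=True)  # highest correlation first
    return [other for score, other in scores[:top_n] if score > 0]


print(recommend("Book A", purchases))
```

With real purchase histories the same calculation would run over millions of customers and titles, which is where the storage and computing demands discussed in this section come in.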

The analysis of unstructured data, however, requires additional tools. Further, because of the size of the data available, methods and skills had to be developed to store, sort through, and analyze the data.

Some of these skills are consistent with those of statistically sophisticated quality professionals, such as black belts. But there are also significant differences.

To demonstrate this, I pulled from the internet a typical description of what a data analyst does and the fundamental skills the role requires.

Succinctly put, a data analyst collects and identifies relevant data, performs quality assurance on the data, and analyzes and presents the results. The hard skills commonly needed are:

• Data collection
• Statistical modeling
• Data mining
• Database management
• Report generation

Initially, entry-level positions required a bachelor’s degree in math, statistics, economics, finance, or computer science, or a background in the specific organization’s economic sector. Data analyst certification courses are now being offered by companies like Google as a path into the profession. Skills in SAS, Excel, Power BI, Tableau, Python, R, and SQL are also desired. For higher-level positions, a master’s or Ph.D. in computer science, mathematics, or statistics is desired.

The table below is a comparison of the skills associated with data analysts and Six Sigma black belts. (The black belt skill set comes from The Six Sigma Handbook.3)

Statistical Skills | Black Belt | Data Analyst
Decision tree | |
Analysis of variance | |
Cluster analysis | |
Control charts | |
Run charts | |
Matrix diagram | |
Activity network diagram | |
Process capability | |
Value stream mapping | |
Association rule mining | |
Artificial neural network | |
Social network analysis | |

Even though the listing of statistical skills is not extensive, similarities and differences are discernible. Black belts focus on process analysis and structured data. Data analysts focus on separating data sets and evaluating nonstandard unstructured data. As for statistical skills, there is overlap. This raises a question. As big data becomes increasingly important and integrated into the organizational processes, who is more important: a data analyst or a Six Sigma black belt?

This question becomes very important when one remembers that big data does not require precise small-scale product sampling. It relies on large amounts of concurrent information coming in from customers, suppliers, and the production line. Further, with the use of big data in the development of Industry 4.0, defective products can be identified on the production line and immediately removed by a robotic arm.
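To make the idea of immediate, on-line detection concrete, the sketch below checks every incoming measurement against control limits derived from a baseline run, rather than waiting for a periodic sample. It is a simplified stand-in for a control chart: the limits are the baseline mean plus or minus three sample standard deviations, and the measurements and function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Lower and upper limits: baseline mean +/- `sigmas` standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def flag_out_of_control(readings, limits):
    """Yield (index, value) for each reading that falls outside the limits."""
    lower, upper = limits
    for index, value in enumerate(readings):
        if value < lower or value > upper:
            yield index, value


# Hypothetical in-control baseline measurements (for example, a part dimension in mm).
baseline = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.01, 9.99, 10.00]
limits = control_limits(baseline)

# Hypothetical live readings arriving from a line sensor.
live = [10.01, 9.98, 10.25, 10.00, 9.70]
flagged = list(flag_out_of_control(live, limits))

for index, value in flagged:
    print(f"reading {index} = {value} is outside the control limits")
```

In an automated line, each flagged reading could trigger removal of the part. A production system would more likely use an established chart type, such as an individuals and moving-range chart, along with run rules.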

Except for the statistics specifically associated with process analysis, the data analyst has the statistical skills needed to do much of the analysis a black belt does. What's more, the data analyst is better at dealing with unstructured data. These skills may be more important, depending on the degree of Industry 4.0 implementation in an organization. Consequently, a full-time black belt may not be needed. If one is needed for a particular project, that expertise can be brought in.

If black belt skills, the most sophisticated of the quality profession, are not in demand as much as they once were, what are the implications for the rest of the quality profession?


The scenario above may seem scary, but forewarned is forearmed. The quality profession is being disrupted. Technological developments associated with Industry 4.0 and the use of big data are going to require the profession to adapt.

Going forward, every quality professional will have to work within the big data environment. Consequently, you should find out if and how big data is being used in your organization. Become familiar with big data terminology and the skills your organization is requiring with respect to big data analysis. Compare that skill set with your own; where gaps exist, move to fill those gaps.

Specifically, look at statistical skills and your level of knowledge with respect to SAS, Excel, Power BI and Tableau, Python, R, and SQL. The purpose is not to become a big data analyst (unless that is desired), but to be familiar with the processes and techniques to identify the ways that information can be applied to the improvement of product quality, customer satisfaction, supply chain management, and, generally, the quality management system.

More broadly, quality professional organizations such as ASQ and CQI will need to adjust the body of knowledge for every certification to include an understanding of big data, its terminology, applications, and tools. Failure to do so could result in quality professionals being at a disadvantage with respect to big data analysts.


1. Holmes, Dawn. Big Data: A Very Short Introduction. Oxford University Press, 2018.
2. Mayer-Schönberger, Viktor; Cukier, Kenneth. Big Data: A Revolution That Will Transform How We Live, Work and Think. Harper Business, 2014.
3. Pyzdek, Thomas. The Six Sigma Handbook. McGraw-Hill, 2003.


About The Author

James J. Kline

James J. Kline, Ph.D., CERM, is the author of numerous articles on quality in government and risk analysis. He is a senior member of the American Society for Quality. A manager of quality/organizational excellence and a Six Sigma green belt, he has consulted for the private sector and local governments. His book, Enterprise Risk Management in Government: Implementing ISO 31000:2018, is available on Amazon. He can be reached at jeffreyk12011@live.com.


Big Profits from Small Data

Big data is often used to figure out how to sell more stuff to smaller and smaller niche markets.

Six Sigma is a problem-solving process. It cuts unnecessary, preventable costs, which can boost any bottom line dramatically.

We know from experience that a sample of a population will contain the same patterns of defects, mistakes and errors as the total population.

So we don't need big data to solve big problems. We can get big profits from small data.

I've helped companies save millions of dollars using Excel and QI Macros Improvement Project Wizard.

The biggest spreadsheet I ever used was only 47,000 rows of data, not millions. It yielded $5 million in savings in just a few days.

To learn more about how to do this, download my free ebook at https://www.qimacros.com/pdf/Agile-Process-Innovation.pdf.