Peter J. Sherman

Six Sigma

Demystifying Design of Experiments

And related taboos

Published: Wednesday, April 2, 2008 - 21:00

Few process improvement topics generate more questions, or are less well understood, than design of experiments (DOE). This is regrettable, as DOE is probably one of the most important activities—after the charter definition—that a manager or Black Belt will perform during the Six Sigma process.
The purpose of this paper is to demystify DOE by framing the topic in terms managers can readily understand. I will describe DOE in business terms, show you how to design and conduct a basic DOE, and discuss how to interpret its results. To reinforce the concepts, I will walk you through an actual DOE I recently conducted with AT&T. Finally, I'll present some key lessons learned about DOE.

DOE overview
What is DOE? Why is it used? What are its benefits? Where does it fit within the Six Sigma process? DOE is much more than a set of experiments, as the name may imply. The DOE activity generally occurs during the analyze and improve phases of the define-measure-analyze-improve-implement-control (DMAIIC) process, a rational decision-making process for improving existing processes. DOE also occurs during the analyze and improve phases of the define-measure-analyze-design-verify (DMADV) process, a rational decision-making process for building new processes. (See figure 1.)

Figure 1

In business terms, DOE helps identify and validate those factors that have the most significant effect on a business process, from which the manager or Black Belt will base their conclusions and form their recommendations to senior management. DOE provides a window into the future of what factors are truly significant and have the most effect on a process. Now that we’ve defined DOE in business terms, the technical definition can make more sense. DOE is a methodology of varying a number of input factors simultaneously in a carefully planned manner, such that their individual and combined effects on the output can be identified.
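The definition above—varying several input factors simultaneously in a planned manner—can be sketched in a few lines. Assuming two hypothetical two-level factors (the factor names below are illustrative, not from the AT&T project), the full set of planned runs for a 2x2 full factorial design is simply the Cartesian product of the factor levels:

```python
from itertools import product

# Hypothetical two-level factors; the names are illustrative only.
factors = {
    "software_version": ["old", "new"],
    "region": ["metro", "rural"],
}

# Every combination of levels becomes one planned run (2 x 2 = 4 runs),
# so individual and combined (interaction) effects can both be estimated.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for i, run in enumerate(runs, 1):
    print(i, run)
```

Because every level of each factor is crossed with every level of the others, a design like this separates the effect of each factor from the effects of their combinations, which one-factor-at-a-time testing cannot do.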

What are the business benefits and advantages of DOE? Managers and Black Belts are responsible for making important decisions every day in their jobs and have limited data, time, and resources. Whether they’re presenting to senior executives of a Fortune 500 company such as AT&T or to a business owner of a smaller organization, they need a means to clearly and succinctly communicate their findings, conclusions, and recommendations in an unbiased, fact-based manner. A properly designed and conducted DOE provides that fact-based approach for them to make informed decisions with confidence—given limited data, time, and resources.

As a senior Black Belt on our project at AT&T, I knew that for us to secure funding for our process improvement, we had to make our case to senior management in a compelling manner where the results would speak for themselves. A DOE was our approach. To summarize, the benefits of DOE include:

  • It allows the manager or Black Belt to efficiently identify the critical factor(s) affecting a process. In fact, DOE permits multiple factors to be evaluated simultaneously.

  • DOE is economical and cost effective for the reasons described above.

  • DOE can be designed to have minimal disruption to normal business operations.

Designing and conducting a DOE
DOE is a process, and like any process it has a prescribed set of steps to design and conduct it. I have synthesized the key steps below.

Step 1
Set your objectives: Are you trying to reduce cycle time for constructing a new office building? Perhaps you're seeking to increase the occupancy rate for a hotel. Maybe you need to reduce the average handling time of your help desk. The key is to be specific in terms of how much improvement. At AT&T our goal was to reduce the number of nonbilled inside wire residential dispatches; technicians weren't properly billing customers for repair work involving wiring and jacks in the home. Specifically, the objective was to reduce nonbilled dispatches by 10 percent, or approximately 6,300 per year. This translates into $600,000 of incremental revenue per year.

Step 2
Identify the major sources of variation: This is typically performed through brainstorming activities using cause-and-effect diagrams, or through more statistically oriented tools such as Pareto charts and analysis of variance (ANOVA) results. Think in terms of internal sources (i.e., within the business) and external sources (i.e., outside the business, including regulatory issues, unions, price of fuel, weather, credit constraints, etc.). The key is to select the most important factors within your control. For example, it makes no sense to try to plan for daily weather patterns if you're in the construction business. In my project at AT&T, the team used a fishbone diagram to brainstorm the major sources of variation behind improperly billed inside wire repair jobs. (See figure 2.) We concluded that the TechAccess software used by technicians to perform billing was a key source of variation. It provided too much latitude for technicians in their decision to bill or not bill a customer.

Figure 2

Step 3
Rank, prioritize, and select factors: A natural outcome of identifying major sources of variation or root causes is the generation of improvement ideas or factors that involve the sources of variation. Because it’s unwise and impractical to conduct your DOE using all the factors identified, you will need to rank, prioritize, and select them in an organized manner. Our team utilized a simple but effective evaluation matrix that incorporated three key criteria with different weightings:

  • Effectiveness: Ability to generate revenue (50 percent)

  • Time to implement (30 percent)

  • Cost (20 percent)

We assigned numerical scores from 1 (worst) to 10 (best) for each factor depending on how it corresponded to the three key criteria. In our project, we selected factor 2 (Modify TechAccess—Option B) which was the highest-scoring factor (6.7 weighted average). Factor 2 was more proactive in nature than the other factors and was believed to be more sustainable because of the built-in control mechanism. For example, if a technician selects a nonbill code for a legitimate reason (i.e., BellSouth’s fault or customer dissatisfaction), an e-mail is automatically generated to the supervisor requiring real-time approval. (See figure 3.) Other projects may require two or more factors to be tested.
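The evaluation-matrix arithmetic is a simple weighted average. A minimal sketch follows; the individual criterion scores shown are hypothetical (the article reports only the 6.7 weighted average for factor 2, not the underlying scores), chosen solely to illustrate how such an average could arise:

```python
# Criteria weights from the article: effectiveness 50%,
# time to implement 30%, cost 20%.
WEIGHTS = {"effectiveness": 0.5, "time_to_implement": 0.3, "cost": 0.2}

def weighted_score(scores):
    """Combine per-criterion scores (1 = worst, 10 = best) into one number."""
    return sum(WEIGHTS[criterion] * s for criterion, s in scores.items())

# Hypothetical scores for factor 2 (Modify TechAccess - Option B).
factor2 = {"effectiveness": 8.0, "time_to_implement": 6.0, "cost": 4.5}
print(round(weighted_score(factor2), 1))  # 6.7
```

Scoring every candidate factor this way makes the ranking in step 3 mechanical: compute each weighted average and select the highest-scoring factor(s) for the experiment.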




Figure 3: Evaluation matrix. Each factor was scored from 1 (lowest) to 10 (highest) against effectiveness in generating revenues (50 percent), time to implement (30 percent), and cost (20 percent), then combined into a weighted average.

  • Modify TechAccess - Option A: Change the TechAccess script to clearly identify customers NOT subscribing to a maintenance plan (i.e., pop-up menu, verbal announcement). While the pop-up display and verbal announcement will inform the technician of a nonmaintenance-plan customer, the technician can still decide to nonbill.

  • Modify TechAccess - Option B: Change the TechAccess flow to invoke an override screen if a 1211, 1212, or 1213 nonbill disposition code is used and the customer does NOT have a maintenance plan. If the technician chooses to override the code, an e-mail notification is sent in real time to the supervisor requiring approval. The technician will be required to call the supervisor before proceeding with a nonbill disposition code.

  • Mandatory training: Provide mandatory training on current billing procedures for network managers and technicians. Past experience with refresher training has indicated it is effective in the short term. However, if conducted on a regular basis, training could be a cost-effective option.

  • Reinforce compliance policies: Issue memorandums to all field technicians and supervisors to perform routine audits of inside-wire dispatches. Past experience with compliance enforcement has indicated it is effective in the short term but quickly deteriorates over time (i.e., the Hawthorne effect). This is a reactive approach that is time-consuming. Also, technicians have learned to "game" the system.

  • Regular audits by accounting: Accounting systematically audits inside-wire dispatch billing codes on a regular basis. Auditing is time-consuming and not proactive in nature.

Step 4
Develop the hypothesis test: In this step you're simply stating your beliefs—in statistical terms—of how you believe the new process or factor(s) will affect the old or existing process. Are you trying to prove that the new process will reduce a key measure relative to the old process (e.g., construction time of the office building, or average handle time at the help desk)? Are you trying to prove the new process will increase it (e.g., occupancy rates at the hotel)? Or maybe you simply want to determine whether the new process has any effect at all.

In our project, we were interested in seeing if our new modified billing software would result in fewer nonbilled inside wire repair jobs than the old software. To fully understand the concept of hypothesis testing I need to introduce some technical terms:

Null hypothesis (H0): Denotes the mathematical relationship between the old and the new, assuming that the old process is true.

For our project, the null hypothesis was as follows:

(Old) Georgia population % Nonbilled dispatches ≤ (New) Georgia sample % Nonbilled dispatches.

Alternative hypothesis (H1): Denotes the mathematical relationship between old and the new, assuming that the old process is false.

For our project the alternative hypothesis was as follows: (Old) Georgia population % Nonbilled dispatches > (New) Georgia sample % Nonbilled dispatches.

There are only three possible forms to express these hypothesis tests:
H0: Old = New                        H0: Old ≤ New                        H0: Old ≥ New
H1: Old ≠ New                        H1: Old > New                        H1: Old < New

Risk level (alpha, α): Quantifies, in percentage terms, the risk you are prepared to take of incorrectly rejecting the old process when it is true. Risk levels depend on the type of industry and the particular process you are dealing with. For example, in the pharmaceutical industry, where human lives are at stake, drug companies typically use an alpha of 1 percent. In my own industry of telecommunications, a 5-percent alpha is generally acceptable. The risk level determines the level of confidence (1 – α) in correctly accepting the old process when it is true.

For example, in our project we selected alpha to be equal to 5 percent. Therefore, the level of confidence equals 95 percent.

Critical value: The critical value is a numerical value determined by the particular alpha and can be looked up in statistical tables. It is compared with a test statistic calculated from the results of your DOE. How do you use these two values?

  • If the absolute value of the calculated value is greater than the critical value, reject the null. In other words, the new process is significant.

  • If the absolute value of the calculated value is less than the critical value, fail to reject the null. In other words, the new process is not significant.

Think of the critical value as a “hurdle bar” on a track field. The lower your risk level, the higher the critical value becomes.

In our project, the 5-percent alpha equates to a 1.645 critical value in a normal Z-table (assuming you’re testing to see if the new process is lower or greater than the old process). On the other hand, a 1-percent alpha equates to a 2.33 critical value. In summary, there are three steps to perform in hypothesis testing:

  • State the null and alternative hypothesis

  • Specify the risk level (alpha) and corresponding level of confidence

  • Determine the critical value based on alpha
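The last of these steps no longer requires a statistical table: Python's standard library includes an inverse normal CDF, so the critical value can be computed directly from alpha. A minimal sketch:

```python
from statistics import NormalDist

def one_tailed_critical_value(alpha):
    """Z critical value for a one-tailed test at risk level alpha."""
    return NormalDist().inv_cdf(1 - alpha)

# The article's examples: 5-percent alpha -> 1.645, 1-percent alpha -> 2.33.
print(round(one_tailed_critical_value(0.05), 3))  # 1.645
print(round(one_tailed_critical_value(0.01), 2))  # 2.33
```

Note how lowering the risk level raises the "hurdle bar": moving alpha from 5 percent to 1 percent pushes the critical value from 1.645 to 2.33, exactly as described above.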

Step 5
Select an experimental design: There are a multitude of experimental designs to choose from in DOE. Here, we will focus on one of the most common and useful designs that can be applied to most process improvement projects: a 1-factor matched comparison (a.k.a. pretest vs. posttest, or control vs. experimental group). It's particularly useful for analyzing whether a new procedure improves accuracy or time.
In our case, it was appropriate because we were trying to determine if our new modified billing software would result in fewer nonbilled inside wire repair jobs than the old software.

Step 6
Execute the design: Although there are nine states in the old BellSouth region (prior to being acquired by AT&T), we didn't have the luxury of time or resources to perform the DOE in all nine states. Instead, we focused on one state, Georgia, which had a rate of nonbilled inside wire repair jobs similar to that of the entire region. We kept the trial instructions simple and straightforward, allowing the technicians to continue with their day-to-day operations. This helped convince the supervisors and technicians to work with our team. The trial consisted of 24 randomly selected technicians in Georgia and ran for five weeks to accumulate sufficient data. (See figures 4 and 5.)

Figure 4

Figure 5
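Random selection of trial participants, as in the 24-technician Georgia trial, is straightforward to do reproducibly. A sketch, assuming a hypothetical pool of technician IDs (the IDs and pool size are illustrative, not from the project):

```python
import random

# Hypothetical pool of Georgia technician IDs.
technician_ids = [f"GA-{n:03d}" for n in range(1, 201)]

random.seed(42)  # fixed seed so the selection is reproducible and auditable
trial_group = random.sample(technician_ids, k=24)  # 24 distinct technicians

print(len(trial_group))  # 24
```

Drawing the sample without replacement (`random.sample`) and recording the seed lets the team show skeptical stakeholders exactly how the trial group was chosen.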

Step 7
Analyze and interpret the results: After the trial we compared the results of the new software process with the old process. The new process reflected a 12.73-percent rate of nonbilled inside wire repairs, compared with 19.77 percent under the old process. While this 35.6-percent relative reduction seemed significant, we needed to determine whether it was a statistically significant improvement at our 5-percent risk level (or 95-percent confidence level).

In our project, the calculated statistic was –1.855 and the critical value was 1.645. Using the hypothesis decision rules, we rejected the null hypothesis (the absolute value of –1.855 is greater than 1.645) and concluded that the new software process significantly reduced nonbilled inside wire repair jobs at a 95-percent confidence level. Figure 6 shows the complete step-by-step process we used to perform the hypothesis test and interpret the results.

Figure 6
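The comparison described above is a standard two-proportion z-test. A sketch follows; the dispatch counts are hypothetical (chosen only to match the article's 19.77-percent and 12.73-percent rates, since the actual sample sizes aren't reproduced here), so this example will not land exactly on the –1.855 the project obtained:

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic comparing a new proportion (x2/n2) against an old
    one (x1/n1), using the pooled proportion under the null hypothesis."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    return (p2 - p1) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Hypothetical counts approximating the article's rates:
# old process ~19.77% nonbilled, new process ~12.73% nonbilled.
z = two_proportion_z(x1=99, n1=501, x2=28, n2=220)

critical = 1.645  # one-tailed critical value at 5-percent alpha
print(abs(z) > critical)  # True here: reject the null hypothesis
```

A negative z with magnitude beyond the critical value is exactly the situation in the project: the new process's nonbill rate is significantly lower than the old one's at the 95-percent confidence level.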

Our team then set about estimating the cost of modifying the billing software and projecting the incremental revenues. The cost of the modification was $53,000, while the projected annual incremental revenue based on the DOE pilot was more than $2 million. We knew we had a compelling case to present to senior management. Armed with the data from our DOE pilot, we were able to state our conclusions and recommendation with confidence. The meeting lasted less than 30 minutes. We had our “go” decision and funding. The key lessons learned from that experience included:

  • Keep the DOE experiment as simple as possible

  • Obtain buy-in from all parties involved

  • Check that all planned runs are feasible

  • Record everything that happens

  • Use common sense


About The Author


Peter J. Sherman

Peter J. Sherman is a managing partner of Riverwood Associates, a lean Six Sigma certification training and consulting firm based in Atlanta. He brings more than 20 years of experience in designing and implementing process improvement programs. Sherman is a certified lean Six Sigma Master Black Belt, an ASQ-certified quality engineer, and an APICS-certified supply chain professional. He holds a master’s degree in engineering from MIT and an MBA from Georgia State University.