Davis Balestracci


The Good News—and Bad News—About DOE

Think of the principles in building a table

Published: Thursday, June 16, 2016 - 14:37

In my last column I explained how many situations have an inherent response surface, which is the “truth.” However, any experimental result represents this true response obscured by the process’s common-cause variation. Regardless of whether you are at a low state of knowledge (factorial) or a high state of knowledge, the same sound design principles apply.

The contour plot: a quadratic ‘French curve’

Response surface methodology’s objective is to model a situation by a power series truncated after the quadratic terms. In the case of three independent variables (x1, x2, x3), as in the tar scenario from my column, “90 Percent of DOE Is Half Planning,” in May 2016:

Y = B0 + B1x1 + B2x2 + B3x3 + B12x1x2 + B13x1x3 + B23x2x3 + B11x1² + B22x2² + B33x3²

Which designs give the best estimates of these B coefficients?

(Factorial designs can estimate all of these coefficients except those for the individual x1², x2², and x3² terms.)
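As an illustration of what “estimating the B coefficients” means in practice, here is a minimal sketch in Python with NumPy. The function names (`quadratic_model_matrix`, `fit_quadratic`) are hypothetical, not from any particular DOE package; the point is that the truncated power series above is linear in its ten coefficients, so ordinary least squares fits it directly.

```python
import numpy as np

def quadratic_model_matrix(X):
    """Build the 10-column model matrix for the truncated power series
    Y = B0 + B1x1 + B2x2 + B3x3 + B12x1x2 + B13x1x3 + B23x2x3
        + B11x1^2 + B22x2^2 + B33x3^2."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([
        np.ones(len(X)),             # intercept B0
        x1, x2, x3,                  # linear terms
        x1 * x2, x1 * x3, x2 * x3,   # two-factor interactions
        x1**2, x2**2, x3**2,         # pure quadratic terms
    ])

def fit_quadratic(X, y):
    """Least-squares estimates of all ten B coefficients."""
    M = quadratic_model_matrix(X)
    coef, *_ = np.linalg.lstsq(M, y, rcond=None)
    return coef
```

Whether all ten coefficients are actually estimable depends on the design: the squared-term columns must not be constant, which is exactly where two-level factorials fall short.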

Bottom line
A response surface design is like building a tabletop. The two essential questions you must answer are: How many legs are needed, and where should they be placed for best stability? I often see people running large numbers of experiments that are pretty much equivalent to either putting many legs along the diagonal of the table or bunching them all in the center. How stable is that? Intuitively, wouldn’t just four legs, one at each corner, do the job?

It’s also a common tendency to run an experiment based only on its immediate predecessor. That would be sort of like putting legs under the table one at a time and saying, “No, that didn’t work. Let’s put a leg over there.”

Especially in the case of optimization, trying to zero in intuitively with finer grids risks the very real danger of interpreting common-cause variation as an effect (as with the nitrite variable in the tar scenario). The counterintuitive DOE strategy is to use reasonably wide (but not too wide) variable ranges. For example, a 10° temperature range, as in the tar scenario (55°C to 65°C), usually works well.

One chemist I worked with wanted to generate a contour plot based on a Japanese patent where the stated temperature range was –20°C to 210°C. But she was at a low state of knowledge. What she really needed was some initial experimentation to quickly get to a practical operating space for possible manufacture, and then get the contour plot.

There are only so many things a quadratic approximation can model, and a 230°C-temperature range isn’t one of them. And then there are the other possible variables to consider: how many to test and what ranges?

Planned experimentation has higher “quality” because human variation has been reduced. Before one experiment has been run, all involved agree on:
• The design’s efficiency for their specific objectives
• The specific experiments to be run and in what order
• The analysis/interpretation method (That’s the good news.)

The bad news: the price one pays for efficiency

If your eyes begin to glaze over from the following math, skip to the “Bottom line.”

Suppose you have two weights (A and B) and want to determine their mass as accurately as possible. There’s a balance available, but you get only two chances to use it. What do you do?

When I first heard this many years ago, I thought, “Duh, weigh one, then the other.” Given that even the most sensitive balances have measurement error (σm), those two weighings yield:
A + σm
B + σm

Now, what if you:
1. Weigh both A and B together: (A + B) + σm (1)
2. Put A in one pan of the balance and B in the other: (A – B) + σm (2)

Add (1) and (2): 2A + √(σm² + σm²) = 2A + (√2)σm
(Remember: Variances add when you sum two numbers.)

Divide by 2: A + 0.5(√2)σm = A + 0.71σm

Subtract (2) from (1): 2B + √(σm² + σm²) = 2B + (√2)σm
(Variances also add even when you subtract two numbers.)

Divide by 2: B + 0.5(√2)σm = B + 0.71σm

Doing things this way reduces the measurement error by about 30 percent (0.71σm instead of σm), but you had to complete both weighings to get the answers. Neither piece of data is useful by itself; both were used for each calculation.

Suppose you wanted the same accuracy while weighing A and B separately. That requires two independent readings of each, then taking their respective averages. For example, for weight A:

(A + σm) + (A + σm) = 2A + √(σm² + σm²) = 2A + (√2)σm

Averaging: A + 0.5(√2)σm = A + 0.71σm

But this takes four total weighings when two sufficed using the original procedure. And note that calculating each weight’s average doesn’t use the other weight’s two observations—which is waste.
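The error reduction from the add-and-subtract scheme can be checked with a quick simulation. This is a minimal sketch, not part of the original column; the true weights, σm = 1, and the number of repetitions are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sigma_m = 1.0               # balance's measurement error (assumed)
A_true, B_true = 10.0, 7.0  # arbitrary true weights
n = 200_000                 # simulated repetitions of the scheme

# Weighing (1): A and B together in one pan.
m1 = (A_true + B_true) + rng.normal(0, sigma_m, n)
# Weighing (2): A in one pan, B in the other.
m2 = (A_true - B_true) + rng.normal(0, sigma_m, n)

A_est = (m1 + m2) / 2   # add (1) and (2), divide by 2
B_est = (m1 - m2) / 2   # subtract (2) from (1), divide by 2

# Each estimate's error should be about 0.71 * sigma_m,
# not sigma_m, even though each weight was only "weighed" twice.
print(A_est.std(), B_est.std())
```

Both printed standard deviations come out near 0.71, matching the √2/2 factor in the derivation above.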

Bottom line
Good designs are efficient in two ways:
• All the data are used in every calculation, which creates a built-in hidden replication that reduces variation optimally. In this case, two weighings did the work of four.
• The bad news: The design needs to be completed to make the data useful.

Keep this design in your back pocket

Suppose DOE is a new concept to the organization. All the cynics (i.e., most people) are watching you closely. You need a quick success.

Consider a situation similar to the tar scenario: This was a relatively mature process with a high state of knowledge, and over the years, temperature, copper sulfate concentration, and excess nitrite had evolved as the three most important variables used to control the process.

I would use the following design. It’s known as a three-variable Box-Behnken design. The following example is shown with its coded (i.e., geometric) form on the left and actual variable settings on the right (–1 represents the low setting of the variable and +1 represents the high setting). It’s structured as running all possible 2 x 2 factorial designs for the three possible independent variable pairings (x1 and x2; x1 and x3; x2 and x3) while holding the other variable at its midlevel.

These 15 experiments ideally should be run in random order. Note that the data point with every variable at its midpoint (i.e., the “center point”) is run three separate times. Inherent in the analysis is using these replications to test whether the chosen region is too large for a quadratic approximation to be appropriate (which happens more often than one might think); this is called a “lack-of-fit” test.
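The original table of coded and actual settings is not reproduced here, but the coded half follows directly from the description above: for each pair of variables, run the 2 × 2 factorial at ±1 with the third variable held at 0, then add three center points. A sketch (the function name is mine, not a standard library call):

```python
from itertools import product

def box_behnken_3():
    """Coded three-variable Box-Behnken design: for each variable pair,
    run the 2x2 factorial (-1/+1) with the remaining variable held at
    its midlevel (0), then append three replicated center points."""
    runs = []
    pairs = [(0, 1), (0, 2), (1, 2)]   # (x1,x2), (x1,x3), (x2,x3)
    for i, j in pairs:
        for a, b in product([-1, 1], repeat=2):
            point = [0, 0, 0]
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs += [(0, 0, 0)] * 3            # center point, run three times
    return runs

design = box_behnken_3()
print(len(design))   # 15 runs
```

Mapping the coded levels back to actual settings is a linear rescaling; for the tar scenario’s temperature, –1, 0, and +1 would correspond to 55°C, 60°C, and 65°C.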

(This is only one diagnostic of the generated equation. There are four. Any good regression course should teach them.)

The three-variable Box-Behnken design isn’t the only option. However, for my money, this should be a “bread-and-butter” design for all quality practitioners. Think about it: generating the response surface for three independent variables with the efficient use of only 15 experiments!

The numbers themselves won’t necessarily be exact, but this design will get you to approximately the right place. This is especially important for those of you in manufacturing who often have to stop a production line to experiment, then deal with the constant, impatient, “Don’t you have the answer yet?” You can get back into production and then fine-tune the process by other experimental methods.

Many of you have been taught basic factorial design and might initially be tempted to optimize the tar process using a basic 2 x 2 x 2 factorial experiment. This would present some problems. A factorial experiment can get only the linear and interaction terms of the quadratic approximation, not the individual quadratic terms. If you try to optimize using the factorial results, it can only lead you to one of the cube’s corners, which in this case would be incorrect.
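Why a two-level factorial can’t recover the pure quadratic terms is easy to see in coded units: when every setting is –1 or +1, every squared term is identically 1, so the x² columns of the model matrix are indistinguishable from the intercept column. A small illustration:

```python
from itertools import product

# Coded 2x2x2 factorial: every run sets each variable to -1 or +1.
factorial_runs = list(product([-1, 1], repeat=3))

# In a two-level design each squared term is identically 1, so the
# x1^2 (and x2^2, x3^2) column coincides with the intercept column:
# the pure quadratic coefficients cannot be estimated.
squared_col = [x1**2 for (x1, x2, x3) in factorial_runs]
print(squared_col)   # [1, 1, 1, 1, 1, 1, 1, 1]
```

That confounding is exactly what the Box-Behnken design’s midlevel (0) settings break: with a third level, x² takes on two distinct values and becomes estimable.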

If a quick answer is required, this particular Box-Behnken design is a “damn the torpedoes, full speed ahead” strategy where you commit to run—and finish—the full design, fit the full quadratic equation (with appropriate diagnostics), and plot the results. There’s no more efficient strategy for three variables.

By the way, two variables require 13 runs; four variables, 27 to 30; five variables, 32 (not a misprint). If you have two variables, you may as well study three; if you have four variables, you may as well study five.

In a 1986 CHEMTECH article, C. M. Hendrix warns that there are “16 ways to mess up an experiment,” and I’ve experienced all of them. For this design, the traps are:
• “Best experiments must be run first”... (“so we hopefully don’t have to run the rest of them”).
• “All experiments must yield good results and product we can sell, and the latest results must show that we’re getting better.” (And after you try this impossible task, they will say, “Told you this wouldn’t work.”)
• “We don’t want to get too many bad results.” (“People might laugh.”)

I can’t tell you how many times I haven’t heard back from a client after carefully designing a study. When I inquire, they say, “My boss didn’t like the idea of running [x] experiments. S/he made me cut out one variable, but thankfully, we had the answer after six runs.” Sigh....

So, where do factorial designs fit into the DOE picture? More about that next time.


About The Author


Davis Balestracci

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.


Steal away!

Thank you for your very kind feedback. I'm glad it will improve your teaching of DOE. We're all colleagues -- so by all means, steal the tabletop analogy!

Stay tuned: there are four more articles to follow in this series. When I said in my April Fools' Day article that DOE was maybe the only thing salvageable from most statistical training, I thought I'd better weigh in with some vital basics that can easily get lost -- and there are quite a few!


Excellent Article

I picked up a few teaching things today - always a good day when you teach an old dog a new trick or two.  My default is central composite designs for the simplicity and sequential nature that you can take when on the DOE journey.  However, I've been faced with "gotta do it all in one design" and you've convinced me to look at BB designs instead.  Also like the legs on the table analogy - I am definitely stealing that - I've seen that way too many times.


I liked this article a lot...for too many reasons to list. Thanks!