Davis Balestracci


Count Data: Easy As 1-2-3?


Published: Monday, August 14, 2017 - 12:03

Many of you work in organizations that keep track of customer complaints. Have you ever thought of how they are recorded and tallied? What could possibly be wrong with this process: The customer brings a concern to your attention. Record it.

Let’s say a certain pediatrics unit reported the number of concerns on a monthly basis. The values for one period of 21 months were, respectively, 20, 22, 9, 12, 13, 20, 8, 23, 16, 11, 14, 9, 11, 3, 5, 7, 3, 2, 1, 7 and 6. But even though you know the counts, you don’t know the whole story because you don’t know the context for the counts. Before anyone can make sense of these counts, certain questions must be answered.

How is “concern” defined? Is it just an antiseptic term for complaint? Are these customer complaints, internally generated counts, or a mixture of both? In the data above, why does the number of concerns drop? And what about the rumor that the hospital administrator is using these numbers to rank departments for bonuses? What exactly constitutes a complaint? Does a complaint about a chilly reception room count?
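For readers who want to see the drop rather than take it on faith, here is a minimal sketch using only the 21 counts quoted above. Comparing the most recent months against the median is my choice for illustration, not an analysis from the column, and the function name `trailing_run_below` is hypothetical.

```python
from statistics import median

# Monthly counts of "concerns" quoted in the article (21 months).
concerns = [20, 22, 9, 12, 13, 20, 8, 23, 16, 11, 14,
            9, 11, 3, 5, 7, 3, 2, 1, 7, 6]


def trailing_run_below(values, center):
    """Count how many of the most recent values in a row fall below center."""
    run = 0
    for value in reversed(values):
        if value >= center:
            break
        run += 1
    return run


center = median(concerns)
run_length = trailing_run_below(concerns, center)
print(f"median = {center}, final months in a row below it = {run_length}")
```

On these data the median is 9 and the last eight months all fall below it. The arithmetic confirms that something changed; it cannot say whether care improved, the definition shifted, or people simply stopped reporting.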

What else does your organization “count” besides complaints?

Do you realize that careful instruction is needed for people to be able to collect data like these to make them useful? “Human variation” lurks to seriously compromise their quality and use.

A clear objective should drive such a collection to determine 1) what should be recorded; and 2) the threshold that causes something to go from a nonevent (x = 0) to a countable, tallied event (x = 1).

The more specific, the better

As another baseball season proceeds here in the United States, it occurs to me that the sport might be one of the best examples for giving extremely specific criteria for tallying and then using count data for comparison and prediction. Not only that, when it is felt that a certain criterion isn’t reflecting its original intent—usually some ranking of ability for relative comparison—baseball is good at redefining the operational definition and retrofitting the new definition on past data.

Consider the evolution of the indicator “save.” The term was being used loosely as far back as 1952. Some coaches thought it reflected the particular skill of a relief pitcher to successfully maintain a lead until the end of the game, regardless of the margin of victory. Because these players could not be credited with the win, record keepers wanted some recognition of this particular skill to somehow compare individual relief pitchers’ abilities to do this. The statistic went largely unnoticed.

A formula with more criteria for saves was invented in 1960 because it was felt that the major existing comparative statistics at the time—earned run average (ERA, the average number of runs a pitcher gives up over nine innings) and win-loss record (W-L)—did not sufficiently measure a relief pitcher’s effectiveness.
• ERA does not account for runners already on base when a reliever enters the game. Officially, they are the responsibility of his predecessor, but it’s also the incoming relief pitcher’s job to keep them from scoring. And some pitchers are much better at this than others. How could this be reflected?
• Win-loss record does not necessarily account for a relief pitcher’s skill at protecting leads. One case in 1959 was particular motivation for closer scrutiny of the need for such a statistic. One relief pitcher had a win-loss record of 18–1, which on the surface looks impressive. However, 10 of the 18 wins were games when he came in, gave up the lead, and fortunately for him, his team regained the lead and won the game—hardly a person who instills confidence in his team!

In 1969, “save” was once again redefined and became an official baseball statistic. In 1974, people felt that the 1969 definition did not adequately reflect the purpose for which it was intended. Tougher criteria were adopted, but use of this revised definition showed it to be too stringent. It was redefined once more in 1975 with four very specific criteria, all of which must be met:
• He is the finishing pitcher in a game won by his team
• He is not the winning pitcher
• He is credited with at least one-third of an inning pitched
• He satisfies one of the following three conditions: He enters the game with a lead of no more than three runs and pitches for at least one inning; or he enters the game, regardless of the count, with the potential tying run either on base, at bat, or on deck; or he pitches for at least three innings.
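The four criteria above are specific enough to be written down as a rule that returns 1 (count it) or 0 (don't). The sketch below is one possible encoding, not an official scoring implementation. The class name, field names, and the choice to track innings as outs recorded (3 outs = 1 inning) are all assumptions made for illustration, as is reading "a lead of no more than three runs" as a lead of one to three runs.

```python
from dataclasses import dataclass


@dataclass
class ReliefAppearance:
    # All field names are hypothetical; they mirror the wording of the criteria.
    finished_game: bool        # he was the finishing pitcher
    team_won: bool             # his team won the game
    is_winning_pitcher: bool   # he was credited with the win
    outs_recorded: int         # 3 outs = 1 inning pitched
    lead_on_entry: int         # runs his team led by when he entered
    tying_run_on_base_at_bat_or_on_deck: bool


def is_save(a: ReliefAppearance) -> int:
    """Return 1 if the appearance meets all four criteria, else 0."""
    finishes_win = a.finished_game and a.team_won
    not_winner = not a.is_winning_pitcher
    at_least_a_third = a.outs_recorded >= 1
    # Fourth criterion: any one of three situations is enough.
    situation = (
        (1 <= a.lead_on_entry <= 3 and a.outs_recorded >= 3)  # assumed: lead of 1 to 3
        or a.tying_run_on_base_at_bat_or_on_deck
        or a.outs_recorded >= 9                               # three full innings
    )
    return int(finishes_win and not_winner and at_least_a_third and situation)


# Two-run lead, one full inning pitched, game finished and won: counted.
print(is_save(ReliefAppearance(True, True, False, 3, 2, False)))
# Same outing with a five-run lead and no tying run in play: not counted.
print(is_save(ReliefAppearance(True, True, False, 3, 5, False)))
```

Writing the rule out this way forces every ambiguity into the open, which is exactly what an operational definition is supposed to do. Two scorers running the same rule on the same game should get the same tally.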

These revised criteria were clear enough that situations before 1969 could be retrofitted and reevaluated. There was finally agreement that the resulting set of numbers now allowed fairer comparisons and rankings.

Do you note an informal, implicit, but very distinct use of the plan-do-study-act (PDSA) process?

Have you ever done PDSA on your data measurement processes?

“There is no true value of anything.” (W. Edwards Deming)

Poor Pluto! Due to the discovery of increasing numbers of heavenly bodies, astronomers’ attempts to classify them resulted in Pluto being stripped of its planet status. Regardless of any definition, Pluto does indeed continue to exist—but not necessarily as a “planet.” Many people begged for its reinstatement. Astronomers made a startling discovery: In one case, if criteria are applied that declare Pluto a planet, 100 other heavenly bodies suddenly also become planets.

In counting things, one comes up with a process to look at a situation, apply criteria, and conclude whether some threshold was crossed (1) or not (0). Using different criteria could result in a different conclusion. No measure will ever be 100-percent perfect.

Criteria should be robust enough so that the same decision is made regardless of who evaluates the situation (healthcare people: think “chart review”). The real test then becomes whether subsequent use of the number—using this definition—fits the original objective and allows the appropriate desired action. Especially in improvement work, it may never be perfect, but is it good enough?
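One rough way to check whether "the same decision is made regardless of who evaluates" is to have two people apply the definition to the same cases and compare their 0/1 calls. The sketch below computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement measure. Kappa is my suggestion, not something this column prescribes, and the two reviewers' decisions are invented purely for illustration.

```python
def percent_agreement(a, b):
    """Share of cases where both reviewers made the same 0/1 call."""
    if len(a) != len(b) or not a:
        raise ValueError("need two equal-length, non-empty lists")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement beyond what two reviewers would reach by chance (binary calls)."""
    observed = percent_agreement(a, b)
    p_a = sum(a) / len(a)  # how often reviewer A says 1
    p_b = sum(b) / len(b)  # how often reviewer B says 1
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:      # both reviewers always give the same single answer
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical chart review: ten charts, two reviewers, 1 = event counted.
reviewer_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reviewer_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]

print(f"agreement = {percent_agreement(reviewer_a, reviewer_b):.2f}")
print(f"kappa     = {cohens_kappa(reviewer_a, reviewer_b):.2f}")
```

With these made-up calls the reviewers agree on 8 of 10 charts, and kappa comes out at 0.60. The two charts where they disagree are the useful part: they show where the definition still leaves room for interpretation.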

The count itself is only half the issue

Obtaining the count itself, though important, is still limited information. In the baseball save example, if pitcher No. 1 has 24 saves and pitcher No. 2 has 37 saves, is the latter a better relief pitcher? Not necessarily. How many opportunities did each have?

To interpret and compare counts, one must also know the area of opportunity for each count. 
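To make the point concrete, the sketch below divides each pitcher's saves by his save opportunities. The save totals (24 and 37) come from the example above; the opportunity counts (26 and 50) are invented for illustration, and `save_rate` is a hypothetical helper name.

```python
def save_rate(saves, opportunities):
    """Saves divided by the number of chances the pitcher had to earn one."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= saves <= opportunities:
        raise ValueError("saves must be between 0 and opportunities")
    return saves / opportunities


# Opportunity counts are hypothetical; only the save totals are from the text.
pitchers = {
    "Pitcher No. 1": (24, 26),
    "Pitcher No. 2": (37, 50),
}

for name, (saves, opportunities) in pitchers.items():
    rate = save_rate(saves, opportunities)
    print(f"{name}: {saves} saves in {opportunities} opportunities = {rate:.0%}")
```

Under these assumed opportunities, the pitcher with fewer saves converts a higher share of his chances. The raw count alone could not have told you that.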

Last month’s column showed an analysis where the area of opportunity was distinct, discrete, and one-off: For each individually dispensed drug, one could tally that it was either the targeted expensive drug (x = 1) or not (x = 0). That results in a percent, which I analyzed using a p-chart analysis of means.

But what if the area of opportunity is not discrete? For example, when a patient has an inserted catheter or central line, the opportunity for infection is ever present for as long as the line stays in. The longer it stays in, the larger the window of opportunity for an infection to manifest. The window is “continuous” and is best expressed as some type of rate. More about that next time.
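As a preview of what "some type of rate" can look like, here is a minimal sketch that divides a count by a continuous area of opportunity. Expressing exposure as line-days and scaling to "per 1,000" are common conventions that I am assuming here, not something specified above, and the two units and their numbers are invented.

```python
def rate_per(events, exposure, per=1000):
    """Events per `per` units of exposure (for example, per 1,000 line-days)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return events * per / exposure


# Hypothetical units: the same number of infections, different exposure.
units = {
    "Unit A": {"infections": 4, "line_days": 1600},
    "Unit B": {"infections": 4, "line_days": 800},
}

for name, data in units.items():
    rate = rate_per(data["infections"], data["line_days"])
    print(f"{name}: {rate:.1f} infections per 1,000 line-days")
```

Both hypothetical units report four infections, yet one had twice the exposure, so its rate is half the other's. Identical counts can mean very different things once the window of opportunity is taken into account.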

Meanwhile, track down some important count data in your organization and find out:
1. Who collects it (and are there multiple collectors?)
2. Whether the collectors know what it’s used for
3. Whether they agree on the definition
4. What actions are taken because of these data
5. Whether these actions are consistent with the collectors’ view
6. Whether fear could affect collectors’ reporting

U.S. healthcare folks: Who is tallying the data for your organization’s dreaded 15 to 20 Centers for Medicare & Medicaid Services (CMS) goals?

This recently in

It seems a “surprisingly high rate” suddenly popped up, as this news item attests:

“Colonoscopy complications occur at surprisingly high rate... approaching 2% within a week of ’scoping”

So [Harlan Krumholz MD], whose team has multiple Medicare contracts to develop pay-for-performance measures for healthcare settings, went to work. . . . [T]he team [came up with a definition, used four states’ past year’s data and] found wide variation in the rates of emergency visits, and hospitalizations across facilities, from 8.4 per 1,000 up to 20.

I am shocked! Calculated similarly? I doubt it. Was there a clear objective for recording these data in the first place? What about the other 46 states? It’s a good start, but the first question must be, “What’s ‘wrong’ with any available data?” This would improve subsequent collection, which would enable more appropriate analyses and comparisons. Considering potential consequences on physicians’ reputations, wouldn’t one think they are owed such a courtesy? There are currently Centers for Medicare & Medicaid Services (CMS) goals for at least 11 such indicators. The article continues:

“CMS said in its specifications manual that such transparency ‘will reduce adverse patient outcomes associated with preparation for colonoscopy, the procedure itself, and follow-up care by capturing and making more visible to providers and patients all unplanned hospital visits following the procedure.’* Eventually the measure will probably be used to determine amount of Medicare reimbursement to those facilities.”

*Define “specifically for this objective,” please? Will you do an initial test to make sure there is consistency in its interpretation and recording? “Human” variation is lurking to contaminate and possibly distort it.

“It also will provide ‘transparency for patients on the rates and variation across facilities in unplanned hospital visits after colonoscopy,’ CMS said in its rulemaking documents.

“The intent is ‘not to put a label on a facility that looks better or worse,’ [one physician] emphasized. ‘What we’re doing is making this visible to doctors, to gastroenterologists and surgeons and their facilities, so they know what is happening to the patient... something they don’t know now.’

“When the data become public, it will also help physicians determine where to refer their patients.”

Confusion, conflict, complexity, and chaos? And “frightened” people?


About The Author

Davis Balestracci

Davis Balestracci is a past chair of ASQ’s statistics division. He has synthesized W. Edwards Deming’s philosophy as Deming intended—as an approach to leadership—in the second edition of Data Sanity (Medical Group Management Association, 2015), with a foreword by Donald Berwick, M.D. Shipped free or as an ebook, Data Sanity offers a new way of thinking using a common organizational language based in process and understanding variation (data sanity), applied to everyday data and management. It also integrates Balestracci’s 20 years of studying organizational psychology into an “improvement as built in” approach as opposed to most current “quality as bolt-on” programs. Balestracci would love to wake up your conferences with his dynamic style and entertaining insights into the places where process, statistics, organizational culture, and quality meet.


You Have to Keep Score

People naturally tend to improve their scores, but people tend to want to know how many runs were scored rather than how many errors occurred.

Tracking defects, mistakes and errors and then analyzing and improving--that is the breakfast of champions.