



Published: 04/04/2023
Continuing our thinking about ways for data leaders to save money during a recession, this article drills into saving on your data usage. Since my last post reminiscing on the lessons I learned during past recessions, the early environmentalist slogan “reduce, reuse, recycle” has stayed in my mind. Beyond applying it to workloads and teams, I’ve begun to muse about how those three approaches could apply to data itself.
Such reflection also made me recognize that some of this century’s abundance has perhaps become a pitfall: specifically, the growth in available computing power and the increasing affordability of massive data storage. Now, please don’t think I’m a Luddite; I don’t regret either of these technological breakthroughs. But if scarcity is often the mother of invention, the counterpoint must surely be that an abundance of supply can lead to lazy thinking or stasis.
How could we think differently about our use of data?
To aid us in avoiding such a pitfall, and to potentially save money in the process, I recommend noticing several converging forces. Organizations and their leaders must juggle a growing set of expectations and constraints. Let me call out just three.
First, environmental concerns over energy consumption, including the data centers behind today’s scale of data storage. Second, ethical and data-protection concerns, and the regulations that curb inappropriate use of data, especially personal data. Third, economic realities, including reduced budgets and the organizational search for ways to cut costs without reducing quality or output.
Rather than becoming depressed by the weight of expectations on today’s data leaders, I recommend seeing these as a classic Venn diagram. In other words, I believe they are overlapping sets of interests. This suggests that data leaders should aim for the proverbial “sweet spot” of addressing all three at once.
Let me lay out my initial thoughts on this. They are far from complete or wise, but I hope they point the way for better brains to think their way to a better solution. I’ll apply the slogan above as a framework to share three ways of addressing the three concerns identified above.
1. Reduce your use of data
We are so far down the yellow brick road to Big Data City that reducing data use sounds like heresy, doesn’t it? But I feel I’m in good company: even David Spiegelhalter, in his book The Art of Statistics (Basic Books, 2019), shows how statistical thinking was often more robust when analysts had to sample data. I suggest that we’ve slipped into mindlessly storing everything, just in case it helps.
Who knows how much energy is being wasted (and carbon released) by data server farms storing data that will never be used? Plus, I’m certainly aware that too many organizations have “survived” the General Data Protection Regulation (GDPR) rather than radically changing the way they think about what data are really needed to meet their customers’ needs. This must be an opportunity to save money, if only on the growing cost of cloud provider subscriptions.
Although this will not be welcomed by many, I recommend going back to basics in the same way I have seen work well for business intelligence (BI). Most organizations these days have too many dashboards and reports, and most aren’t being used to drive decisions and actions. So BI leaders have learned that periodic pruning is needed: stop the automatic issuing of dashboards and see who complains. The same thinking can be used to delete data that aren’t being used. Challenge analysts to either finally get around to their planned analysis or model building, or lose the data being held “just in case.” Then calculate the cost savings, as in the sketch below.
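As a minimal sketch of what such a pruning audit could look like, here is some illustrative Python. Everything in it is an assumption: the table names, sizes, last-accessed dates, and storage rate are hypothetical stand-ins for metadata your warehouse’s information schema or your cloud provider’s billing API would actually expose.

```python
from datetime import datetime, timedelta

# Hypothetical catalog metadata; in practice, pull this from your
# warehouse's information schema or cloud storage inventory reports.
tables = [
    {"name": "raw_clickstream_2019", "size_gb": 820, "last_accessed": datetime(2021, 3, 2)},
    {"name": "customer_features",    "size_gb": 14,  "last_accessed": datetime(2023, 2, 20)},
    {"name": "legacy_campaign_logs", "size_gb": 310, "last_accessed": datetime(2020, 7, 11)},
]

STALE_AFTER = timedelta(days=365)  # untouched for a year => deletion candidate
COST_PER_GB_MONTH = 0.023          # assumed storage rate in USD; check your own bill

today = datetime(2023, 3, 1)
candidates = [t for t in tables if today - t["last_accessed"] > STALE_AFTER]

# Flag stale tables and estimate the monthly saving from deleting each one.
for t in candidates:
    monthly_saving = t["size_gb"] * COST_PER_GB_MONTH
    print(f'{t["name"]}: {t["size_gb"]} GB unused, ~${monthly_saving:.2f}/month if deleted')
```

The point is not the script itself but the discipline: make “data nobody has touched in a year” a visible, costed list that someone must actively defend.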
2. Reuse your data and datasets to meet other needs
One of the useful insights in Bill Schmarzo’s book The Economics of Data, Analytics and Digital Transformation (Packt Publishing, 2020) is to watch out for opportunities for reuse; that is, to spot chances to reuse existing data or analytics rather than build anew.
I was recently interviewing a data leader for my podcast when he mentioned the need to think outside silos. He identified risks of myopic thinking in terms of role, function, business, sector, and geography. Analysts and data scientists need to be mindful of creating value by maximizing reuse of data and existing analytics. Identifying transferable processes, customer understanding, models, transformations, and insights could help with other business challenges.
A good discipline here can be to develop a common modeling dataset, drawing from the wider pool of data still held in a data lake or similar store. Build a sort of corporate memory: start with a narrow, high-quality dataset and add variables only once they prove their value, including those that prove important for explaining context or identifying implications. As well as being less costly to store and optimize, this smaller dataset prompts analysts about which variables to consider using.
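A minimal sketch of that “prove your value before promotion” discipline, assuming a scikit-learn-style workflow: the should_promote helper, column names, and threshold below are illustrative choices of mine, not a prescribed method.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def should_promote(df, core_cols, candidate, target, threshold=0.005):
    """Admit `candidate` into the curated modeling dataset only if it lifts
    cross-validated accuracy over the existing core columns by `threshold`."""
    model = LogisticRegression(max_iter=1000)
    base = cross_val_score(model, df[core_cols], df[target], cv=5).mean()
    extended = cross_val_score(model, df[core_cols + [candidate]], df[target], cv=5).mean()
    return extended - base >= threshold

# Illustrative usage with synthetic data standing in for the data lake.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 60, 1000),
    "monthly_spend": rng.normal(50, 15, 1000),
    "web_visits_30d": rng.poisson(5, 1000),
})
df["churned"] = (df["tenure_months"] < 12).astype(int)  # toy target

core = ["tenure_months", "monthly_spend"]
if should_promote(df, core, "web_visits_30d", "churned"):
    core.append("web_visits_30d")  # the variable earned its place in the curated set
print("curated columns:", core)
```

The design choice here is deliberate asymmetry: the wider lake stays cheap and cold, while the curated dataset stays small, fast, and well understood.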
3. Recycle your data by applying the scientific method
A number of my guests on the Customer Insight Leader podcast have come from a science background—not just science degrees or Ph.D.s, but in some cases successful careers in academia. When you talk with such data leaders, you begin to spot some of the rigor that they can bring to data science functions. Consider the call for a workflow that produces repeatable results in Enda Ridge’s helpfully practical book Guerrilla Analytics (Morgan Kaufmann, 2014).
Within academia, at least for those focused on published, peer-reviewed research, there can be a greater focus on the scientific method: that is, a methodology with a robust feedback loop. It’s not enough to deploy a model and move on to the next project; you must also accurately capture the effect and outcome of that intervention, and track its continued effectiveness.
These concerns are most often raised with the goal of improved statistical robustness, but there is also a cost-saving angle. Implementing effective feedback loops, coupled with techniques like retrospectives, will improve future assumptions, analytics, and models, often reducing the time needed to investigate from scratch. Continual monitoring of all deployed models and data products can also flag automatically when they need rebuilding. This, too, offers savings: you avoid both unnoticed erosion of benefits and unnecessary rebuilds of things that still work.
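As a minimal sketch of such an automated check, assuming a classification model whose live outcomes eventually get labeled: the needs_rebuild helper, the choice of AUC as the metric, and the tolerance are all assumptions to adapt to your own models.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def needs_rebuild(baseline_auc, recent_labels, recent_scores, tolerance=0.03):
    """Flag a deployed model for rebuilding only when its performance on
    recent, labeled outcomes drops materially below the level recorded
    at deployment time."""
    recent_auc = roc_auc_score(recent_labels, recent_scores)
    return (baseline_auc - recent_auc) > tolerance, recent_auc

# Illustrative check with synthetic outcomes standing in for live feedback.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)
scores = 0.6 * labels + rng.normal(0.2, 0.3, size=500)  # weakly predictive scores

rebuild, auc = needs_rebuild(baseline_auc=0.82, recent_labels=labels, recent_scores=scores)
print(f"recent AUC = {auc:.3f}; rebuild needed: {rebuild}")
```

The tolerance encodes the cost trade-off in the paragraph above: too tight and you rebuild models that still work; too loose and benefits erode unnoticed.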
I’m painfully aware that I share the above as someone who isn’t as involved these days. I talk regularly with my clients and friends, but that isn’t the same as leading a data or analytics function. So I’m not going to pretend that my thinking can be implemented exactly as outlined above. I recognize that teams, processes, organizations (and, indeed, the world) are not that simple.
But I hope I’ve explained my thought process well enough to inspire yours. You are close to the action: you have your own data teams, or you’re a specialist working within one. What would you recommend? What are the pragmatic opportunities to save money here, while also being more sustainable and ethical in your data usage? I look forward to hearing your brilliant ideas. It would also be great to hear how much money you save your organization by implementing them.
First published March 2, 2023, on CustomerInsightLeader.com.