Managing Data Quality for More Accurate Insights
Content Marketing Manager, Sigma
Smart decisions require reliable information. You need to know what the reality of a situation is, on the ground, in real time. While there are many reasons data quality is invaluable to an organization, accurate insights are one of the most significant. Without reliable data, marketing campaigns will miss the mark, resulting in wasted ad spend. Finance teams will misjudge markets. And production will be blinded to bottlenecks and other issues until it’s too late to fix problems without major delays. Managing data quality is essential for a healthy business.
Despite this importance, Gartner found that poor data quality costs businesses an average of $9.7 million to $14.2 million annually. Several challenges contribute to the problem, including disparate data sources, ineffective cleaning processes, and inefficient validation.
But you can address these barriers. With a strategic data governance process, you can quickly see which data is inaccurate, out of date, or out of context. You know which data is vetted and identified as relevant. You have confidence that you’re seeing all the important factors relevant to decisions. In this article, we’ll look at what defines data quality and how your organization can manage it.
Components of data quality
What defines data quality exactly? Data quality is complex, with several components. To further complicate things, data that may be considered high-quality in one situation won’t meet the standards in another. Let’s examine each of the characteristics to look for as you define data quality in specific situations.
Accurate — Accuracy refers to how factual or correct the data is. The importance of accuracy is obvious, but let’s look at an example to see just how significant it is. If you’re managing an ABM campaign and you’re targeting companies that fit a specific ideal client profile, you must have an accurate list of companies and people in those companies that meet the profile. ABM campaigns can be expensive. Without accurate data, you may be targeting accounts that aren’t ideal prospects — wasting a portion of your quarterly budget and missing your targets.
Complete — If your data is complete, then you can be sure you have all the essential information to take effective action. If information is incomplete, on the other hand, you’re missing one or more pieces of the picture, giving you a fragmentary view. The old parable of the blind men and the elephant is apropos here: if all you can see is the tail of an issue that is, in fact, a full elephant, you’re going to misjudge the nature of that issue.
Relevant — This is where the definition of data quality gets murky. Data must be related to the question you’re seeking to answer to be considered high-quality. Perfectly good data in one situation may be completely irrelevant in another, making it unactionable. For example, if you’re trying to determine why last quarter’s marketing spend didn’t deliver the results you expected, but you’re looking only at attribution reports, you may miss the impact of a black swan event on the industry you were targeting that quarter. The importance of this component becomes clear when you consider the correlation/causality conundrum: data may be correlated and seem relevant but actually be unrelated. Irrelevant data can lead you astray.
Clean — Clean data is both valid and unique. If data is clean, it follows proper syntax (formatted correctly) for usage, and it is unique across all systems (there aren’t multiple, different versions of the data in various locations).
Up-to-date — You can’t overstate the importance of fresh data. If your data is outdated, you’ll be working from a faulty foundation. Exactly how up-to-date your data must be will vary based on your purpose. In many cases, you’ll want data to be real-time, but in others, a wider window is fine.
If data doesn’t match each of these components, it is compromised. Low-quality data is not valuable and can even be dangerous to an organization.
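Three of these components (complete, clean, and up-to-date) lend themselves to automated checks. The following is a minimal Python sketch, not a definitive implementation. The record fields, the email pattern, and the 30-day freshness window are all assumptions chosen for illustration. Accuracy and relevance are left out because they usually require a trusted reference source or human judgment.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical contact records; the field names are assumptions for illustration.
NOW = datetime(2024, 1, 31, tzinfo=timezone.utc)
records = [
    {"id": 1, "email": "ana@example.com", "company": "Acme",
     "updated_at": NOW - timedelta(days=2)},
    {"id": 2, "email": "bob@example", "company": "Globex",
     "updated_at": NOW - timedelta(days=5)},
    {"id": 3, "email": "ana@example.com", "company": None,
     "updated_at": NOW - timedelta(days=90)},
]

REQUIRED_FIELDS = ("email", "company")                      # assumed required fields
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # deliberately simple format check
MAX_AGE = timedelta(days=30)                                # assumed freshness window


def check_record(record, seen_emails, now):
    """Return a list of quality problems found in a single record."""
    problems = []
    # Complete: every required field has a value.
    for field_name in REQUIRED_FIELDS:
        if not record.get(field_name):
            problems.append(f"missing {field_name}")
    email = record.get("email")
    if email:
        # Clean (valid): the value follows the expected format.
        if not EMAIL_PATTERN.match(email):
            problems.append("invalid email")
        # Clean (unique): the value has not appeared in an earlier record.
        if email in seen_emails:
            problems.append("duplicate email")
        seen_emails.add(email)
    # Up-to-date: the record was refreshed within the allowed window.
    if now - record["updated_at"] > MAX_AGE:
        problems.append("stale")
    return problems


def check_all(rows, now):
    """Map each record id to the problems found for that record."""
    seen_emails = set()
    return {row["id"]: check_record(row, seen_emails, now) for row in rows}


report = check_all(records, NOW)
for record_id, problems in report.items():
    print(record_id, problems or "ok")
```

In this sample, the first record passes, the second fails the format check, and the third is incomplete, duplicated, and stale. The right rules and thresholds depend on the purpose the data serves.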
How to assure data quality
To be sure that the data you’re working with is high-quality, you need to take several steps. These steps require effort, but once you have systems and processes in place, you’ll be able to assure data quality more efficiently.
Get buy-in from all departments
First, you’ll be more successful in your efforts to manage data quality if you have buy-in from all stakeholders within the company. You need stakeholders on board to get the resources you need and to ensure cooperation. To do this, build a business case for data quality. Detail each way the business stands to benefit from better data, as well as the risks of continuing to operate according to the status quo. Next, build the case for how data quality will help each department. The closer to home you can bring these benefits, the easier it will be for people to see the value.
Define KPIs to describe what constitutes quality
Focusing on the five components of data quality, identify the KPIs that will tell you whether your data meets your target standard. In particular, think through what level you need to reach for each component. For example, how up-to-date does the data need to be, and how will you evaluate relevance? Finally, decide how you’ll track these KPIs. (For more tips on data quality, see this guide.)
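One way to make such KPIs trackable is to express each as a pass rate with an explicit target. The sketch below is only an illustration of that idea. The KPI names, the counts, and the target levels are made-up assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """A data quality KPI expressed as a pass rate with a target level."""
    name: str
    measured: float  # share of records passing the check, from 0.0 to 1.0
    target: float    # minimum acceptable share, from 0.0 to 1.0

    @property
    def met(self) -> bool:
        return self.measured >= self.target


def pass_rate(passed: int, total: int) -> float:
    """Share of records that passed a check; 0.0 when there are no records."""
    return passed / total if total else 0.0


# Illustrative counts and targets only; real values depend on your use case.
kpis = [
    Kpi("completeness", pass_rate(940, 1000), target=0.98),
    Kpi("validity", pass_rate(995, 1000), target=0.99),
    Kpi("uniqueness", pass_rate(1000, 1000), target=1.0),
    Kpi("freshness (updated within 30 days)", pass_rate(870, 1000), target=0.90),
]


def summarize(items):
    """Build one human-readable status line per KPI."""
    return [
        f"{k.name}: {k.measured:.1%} (target {k.target:.1%}) "
        f"{'OK' if k.met else 'BELOW TARGET'}"
        for k in items
    ]


for line in summarize(kpis):
    print(line)
```

Recording the same figures on a regular schedule shows whether each component is improving or slipping over time.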
Include data quality activities in your data governance framework
Your data governance framework is a powerful tool that will help you in your quest for data quality, as it will outline detailed processes to follow. Include a plan for each of the following activities in your data governance framework:
- New data acquisition
- Data integration
- Source trustworthiness
- Error identification
- Error correction
Keep a data quality issue log
A data quality issue log will allow you to identify the precise nature of each problem, where the problem occurred, and why the issue was able to make it through your preventative measures. Keeping a log gives you a concrete way to track problems and solve them. Your log should include the following for each issue:
- Specific details of the problem, including context, where and when the issue occurred, and its impact
- Resolution team, including the owner who has sign-off, an analyst who can help guide the process, and an IT team member who can handle technical details
- Details of the resolution, including the cause of the issue, the solution, and how to prevent the problem from happening again
(For a deeper dive into additional helpful items to include in your log, see this post.)
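The log fields listed above map naturally onto a small record structure. Here is a rough Python sketch of one possible shape. The class names, field names, and sample entry are hypothetical, and a spreadsheet or ticketing tool could hold the same information just as well.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ResolutionTeam:
    owner: str        # has sign-off on the resolution
    analyst: str      # helps guide the process
    it_contact: str   # handles technical details


@dataclass
class Resolution:
    cause: str        # why the issue happened
    solution: str     # what was done to fix it
    prevention: str   # how to keep it from happening again


@dataclass
class DataQualityIssue:
    description: str  # specific details of the problem
    context: str
    location: str     # where the issue occurred
    occurred_on: date # when the issue occurred
    impact: str
    team: ResolutionTeam
    resolution: Optional[Resolution] = None  # filled in once the issue is resolved

    @property
    def is_open(self) -> bool:
        return self.resolution is None


def open_issues(log):
    """Return the issues that still have no recorded resolution."""
    return [entry for entry in log if entry.is_open]


# Hypothetical sample entry, for illustration only.
issue = DataQualityIssue(
    description="Duplicate account records in the campaign target list",
    context="Quarterly ABM campaign planning",
    location="CRM export",
    occurred_on=date(2024, 1, 15),
    impact="Some accounts were counted twice in budget estimates",
    team=ResolutionTeam(owner="Marketing ops lead", analyst="Data analyst",
                        it_contact="CRM administrator"),
)
issue_log = [issue]
print(len(open_issues(issue_log)))
```

Leaving `resolution` empty until the owner signs off makes open issues easy to list, and the `prevention` field keeps the focus on stopping repeats.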
Look to solve quality issues at the data onboarding point
A final important point: when evaluating data quality issues and looking for causes and solutions, be sure to go all the way upstream to the data onboarding point. Often, quality issues will manifest downstream, but root causes lie at the onboarding point. Don’t forget to be thorough in your detective work to uncover root causes.
Managing data quality is a process
It’s important to remember that managing data quality is a process, not a one-time project. As you implement the steps above and work to track and solve data quality issues regularly, your efforts will translate into accurate insights that result in better decisions across the company.