As information becomes a core business asset with the potential to generate revenue from data-driven insights, companies must fundamentally change the way they approach data quality. (See “Seven Ways to Profit from Big Data as a Business,” BCG article, March 2014.) As with other fundamental shifts, mind-sets must be changed, not only technology.
However, companies frequently struggle to prioritize data quality issues or feel they must tackle all of their problems at once. Instead, we propose executives take the following seven steps toward total data-quality management, a time-tested approach that weighs specific new uses for data against their business benefits.
Identify the opportunities. To find new uses for data, start by asking, “What questions do we want to answer?” A systematically creative approach we call “thinking in new boxes” can help unlock new ideas by questioning everyday assumptions. Other approaches, such as examining data sources, business KPIs, and “pain points,” can be rich sources of inspiration. (See the “Big Data and Beyond” collection of articles for a sample of high-impact opportunities in a range of industries.)
After identifying a list of potential uses for data—such as determining which customers might buy additional products—prioritize the uses by weighing the benefits against the feasibility. Start with the opportunities that could have the biggest impact on the bottom line. Be sure to assess the benefits using multiple criteria, such as the value a data use can create, the new products or services that can be developed, or the regulatory issues that can be addressed. Then consider the uses for data in terms of technical, organizational, and data stewardship feasibility. Also look at how the data use fits in with the company’s existing project portfolio.
Determine the necessary types and quality of data. Effectively seizing opportunities may require multiple types of data, such as internal and third-party data, as well as multiple formats, such as structured and unstructured. Before jumping into the analysis, however, the best data scientists and business managers measure the quality of the required data along a range of dimensions, including the following:
- Validity, the degree to which the data conforms to logical criteria
- Completeness, the degree to which the data required to make decisions, calculations, or inferences is available
- Consistency, the degree to which data is the same in its definition, business rules, format, and value at any point in time
- Accuracy, the degree to which data reflects reality
- Timeliness, the degree to which data reflects the latest available information
Each dimension should be weighted according to the business benefits it delivers, as explained in the previous step. We also recommend that companies use a multitier standard for quality. For example, financial applications require high-quality data; for bundling and cross-selling applications, however, good data can be good enough.
Define clear targets for improvement. The data-assessment process provides a baseline from which to improve the quality of data. For each data source, determine the target state per data quality dimension.
A gap analysis can reveal the difference between the baseline and target state for each data source and can inform an action plan to improve each data-quality dimension. Gaps can be made visible and tracked through a dashboard that color-codes performance for each of the major dimensions of data quality. (See Exhibit 2.)
Build the business case. Data quality comes at a price. To develop an argument for better-quality business data, companies must quantify the costs—direct and indirect—of using bad data as well as the potential of using good data.
Direct costs can include, for example, additional head-count expenditures that result from inefficient processes, cleanup fees, and third-party data bills. Indirect costs can result from bad decisions, a lack of trust in the data, missed opportunities, the loss of agility in project execution, and the failure to meet regulatory requirements, among other things.
The upside of high-quality data can be significant, as the prioritization of particular uses makes clear. For example, microtargeting allows companies to reach “segments of one,” which can enable better pricing and more effective promotions, resulting in significantly improved margins. The more accurate the data, the closer an offer can come to hitting the target. For example, an Asian telecommunications operator began generating targeted offers through big-data modeling of its customers’ propensity to buy. The approach has reduced churn among its highest-value customers by 80 percent in 18 months.
With costs and benefits in hand, management can begin to build the case for changing what matters to the business. Only then can companies put in place the right controls, people, and processes.
Root out the causes of bad data. Many people think that managing data quality is simply about eliminating bad data from internal and external sources. People, processes, and technology, however, also affect the quality of data. All three may enable bad data to accumulate. For example, we have seen companies spend enormous amounts of time and money cleaning up data during the day that is overwritten at night.
Certain types of data quality issues can and must be fixed at the source, including those associated with financial information and operating metrics. To do that, companies may need to solve fundamental organizational challenges, such as a lack of incentives to do things right the first time. For example, a call center agent may have incentives to enter customer information quickly but not necessarily accurately, resulting in costly billing errors. Neither management nor the data entry personel feel what BCG senior partner Yves Morieux calls the “shadow of the future”—in this case, that entering inaccurate data negatively affects the overall customer experience. Companies can also simplify data models and minimize manual interactions with data through approaches such as “zero-touch processing.”
It may not be possible or economical to fix all data-quality issues, such as those associated with external data, at the source. In such cases, companies could employ middleware that effectively translates “bad data” into “usable data.” As an example, often the structured data in an accounts-payable system does not include sufficient detail to understand the exact commodity being purchased. Is an invoice coded “computing” for a desktop or a laptop? Work-arounds include text analytics that read the invoice text, categorize the purchase, and turn the conversion into a rule or model. The approach can be good enough for the intended uses and much more cost effective than rebuilding an entire enterprise-software data structure.
Assign a business owner to data. Data must be owned to become high quality. Companies can’t outsource this step. Someone on the business side needs to own the data, set the pace of change, and have the support of the C suite and the board of directors to resolve complex issues.
Many organizations think that if they define a new role, such as a data quality officer, their problems will be solved. A data- quality officer is a good choice for measuring and monitoring the state of data quality, but that is not all that needs to be done, which is why many companies create the position of business data owner. The person in this role ensures that data is of high quality and is used strategically throughout the organization.
Among other responsibilities, the business data owner is accountable for the overall definition of end-to-end information models. Information models include the master data, the transaction data standards, and the metadata for unstructured content. The owner focuses on business deliverables and benefits, not on technology. The business ownership of data needs to be at a level high enough to help prioritize the issue of quality and generate buy-in but close enough to the details to effect meaningful change. The transformation needed is sometimes quite fundamental to the business.
Owners must also ensure that data quality remains transparent. Companies should have a target of 100 percent across all quality dimensions for customer data, such as names and addresses, and make that data accessible through a system such as a “virtual data mart” that is distinct from the storage of lower-quality data, such as reputation scores.
Scale what works. Data quality projects often run into problems when companies expand them across the business. Too many big-data projects cherry-pick the best quality data for pilot projects. When it’s time to apply insights to areas with much higher levels of bad data, the projects flounder.
To avoid the “big program” syndrome, start small, measure the results, gain trust in effective solutions, and iterate quickly to improve on what works. But don’t lose sight of the end game: generating measurable business impact with trusted high-quality data.
Consider the journey of an international consumer-goods company that wants to become a real-time enterprise that capitalizes on high-quality data. Before the transformation began, the company had a minimal level of data governance. Data was locked in competing IT systems and platforms scattered across the organization. The company had limited real-time performance-monitoring capabilities, relying mostly on static cockpits and dashboards. It had no company-wide advanced-analytics team.
To enable a total data-quality management strategy, the CEO created a central enterprise-information management (EIM) organization, an important element of the multiyear strategy to develop data the company could trust. The company is now beginning to centralize data into a single high-quality, on-demand source using a “one touch” master-data collection process. Part of the plan is also to improve speed and decision making with a real-time cockpit of trusted customer information that is accessible to thousands of managers using a standardized set of the top 25 KPIs. And the company is launching a pilot program in advanced analytics to act as an incubator for developing big-data capabilities in its business units and creating a path to additional growth. Finally, it is creating a position for a business data owner who will be responsible for governing, designing, and improving the company’s information model.
This multiyear transformation will be entirely self-funded from improved efficiencies, such as a projected 50 percent decrease in the number of employees who touch the master data and a 20 to 40 percent decline in the number of full-time information-management staff.
Luc de Brabandere and Alan Iny, Thinking in New Boxes: A New Paradigm for Business Creativity(New York: Random House, 2013)
Yves Morieux and Peter Tollman, Six Simple Rules: How to Manage Complexity without Getting Complicated (Boston: Harvard Business Review Press, 2014).