Data quality goes mainstream
Data quality determines many critical business outcomes, including the validity of business decisions, the effectiveness of marketing campaigns, and the ability to ensure compliance, yet all too often it has been overlooked as a priority for investment of corporate resources. This is the case despite the fact that being data-driven and achieving digital transformation, two sought-after goals, depend on having high-quality data. Often under the purview of the IT department or data creators, data quality has now captured the attention of C-level executives because of the increasing recognition of data as a strategic asset. But the path has not been smooth.
Numerous studies document the pervasiveness of poor-quality data (see below under Customer Data: How Bad is it?). In a survey of more than 1,000 professionals in seven countries, Experian found that more than 90% of respondents consider some of their customer data to be inaccurate and, on average, they believe that one-fourth of their data is inaccurate. Although 94% of companies want to use their data to optimize customer or prospect experiences, only 22% are using techniques that could be considered optimal. These techniques include centralizing responsibility for data quality, routinely monitoring it, and using a platform approach. The remainder of the companies, nearly 80%, are either unaware of their data quality, take a reactive approach, or are proactive to some degree but not sophisticated in their techniques.
The costs of failing to reach an acceptable standard for data quality are high; research conducted by Gartner indicates that organizations believe they lose approximately $10–15 million per year as a result of poor data quality. IBM published an estimate of $3.1 trillion as the cost of poor-quality data in the U.S. each year. Poor data can also damage customer satisfaction and loyalty, and it consumes employee time spent handling errors. Forrester found that one-third of analysts spend more than 40% of their time validating data before it can be used for decision making.
Why the neglect of data quality? Often it has been because the business case for data quality was not made clearly enough. “It is important to show how improved data will help the value chain,” said Seth Earley, CEO of Earley Information Science, a professional services firm specializing in helping firms get more value from their data content and knowledge. “Companies may have a difficult time making the connection to the core value proposition for this type of initiative. Data is only valuable when you are applying it to a problem, and stakeholders need to understand the connection. Often, the people who own the data are not the ones that feel the pain, so that connection is especially important.”
Sources of data errors are numerous and varied, but the most common one is “human error” in data entry, such as typos, entering data in the wrong field, or failing to enter a piece of data. These errors may be committed by a customer service representative in a call center, or by the customer when entering data in an online form. Other errors may emerge when duplicate or near-duplicate records are produced. In the case of product data, changes in the product may be missed, or flawed communication may lead to the owner of the data not knowing about a new product introduction. Individuals may move or change their email address, and the new information is not received.
Customer Data: How Bad is it?
One study of data quality reported in Harvard Business Review was carried out by managers participating in executive training being given by university and corporate trainers. The individuals were from companies and government agencies, and came from a variety of functional areas, including customer service and HR. They selected 10–15 critical work-related attributes from 100 work units (records), and checked each one for errors.
Nearly half the records had at least one critical error; for one-fourth of respondents, only 30% of the records were correct, and for half the respondents, fewer than 60% were correct. Only 3% of the records were deemed acceptable even using a low standard.
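The measurement used in this study, checking a set of critical attributes on each record and counting the records that come through clean, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the attribute names, and the four sample records are hypothetical and are not data from the study.

```python
from typing import Dict, List


def error_free_share(records: List[Dict[str, bool]]) -> float:
    """Return the fraction of records in which no attribute is flagged.

    Each record maps an attribute name to True when a reviewer found
    an error in that attribute, and False when it looked correct.
    """
    if not records:
        raise ValueError("no records to score")
    # A record counts as clean only if none of its attributes is flagged.
    clean = sum(1 for record in records if not any(record.values()))
    return clean / len(records)


# Hypothetical review results (illustrative only, not study data).
sample = [
    {"name": False, "email": False, "phone": False},
    {"name": False, "email": True, "phone": False},
    {"name": False, "email": False, "phone": False},
    {"name": True, "email": True, "phone": False},
]

score = error_free_share(sample)
print(f"{score:.0%} of records are error-free")
```

With 100 records and 10 to 15 attributes per record, as in the study, the same calculation gives each participant a single score that can be compared across departments.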
Many advertisers rely on information from third parties to target their ads, particularly online ads. Deloitte asked roughly 100 of its own employees to review information about themselves that was available through a consumer data broker’s portal in order to evaluate its accuracy. The data covered numerous variables, grouped into six categories such as demographic, economic, and purchasing history. The employees were asked to indicate whether the data for each variable was correct, and also to indicate by category whether the percentage of correct data overall was zero, 25%, 50%, 75%, or 100%.
Results showed that for nearly half of the variables, the data was inaccurate for half of the respondents. Nearly 60% of respondents rated the overall accuracy of the demographic category to be 50% or less, despite the fact that information such as birth date and marital status can be obtained from a variety of reliable sources. Fewer than half said their purchase activity as listed in the portal was correct. The bottom line is that when data is being used to target customers or predict what they might want to purchase, the initiatives are on shaky ground unless effort is devoted to improving data quality.