How to Manage Big Data with a Data Governance Policy
Companies large and small are learning to cope with big data using a wide range of strategies, from adopting private and public cloud technologies to completely overhauling their approach to data types, device types, and overall data strategy. One critical approach to managing the big data challenge is the creation of a data governance policy. While IT has used data governance for years to establish control over organizations' numerous types of data, big data has unique characteristics that affect how it is governed. For example:
- The sheer volume of data. Big data tends to grow exponentially. Without defined governance policies in place, it can quickly become impossible for organizations to search, classify, and manage huge amounts of information.
- The wide variety of data. Traditionally, data governance has focused on information stored in relational databases. Big data, however, involves many different forms of information such as non-relational databases and other types of unstructured data, like information generated by social media.
Combining knowledge of big data challenges with a well-thought-out data governance policy, however, enables organizations to increase the value of their information and transform it into a highly available view of a company's legacy knowledge and intellectual property. When users have access to data that is consistent, accurate, and available when they need it, the result is better business decision-making and faster responses to market trends and customer needs.
Neglecting Data Governance - A Risky Proposition
It's not uncommon for organizations to view data creation as a separate activity from data governance. That approach can be risky, however, if data governance falls by the wayside and is forgotten or neglected. Dangers associated with ungoverned big data include:
- Data that is difficult to search and analyze. When organizations don't have a coordinated approach to data management and governance, business users often have a hard time finding the information they need to make decisions, and they may not trust the validity of the data they can access. This problem is compounded as data volumes increase. On average, a company's data doubles every 18 months, so the time to get control over this information is now.
- Lack of compliance with regulations or internal controls. Regulations such as the Federal Rules of Civil Procedure (FRCP), the Federal Rules of Evidence (FRE), the Health Insurance Portability and Accountability Act (HIPAA), Sarbanes-Oxley (SOX), and others are complex because organizations must proactively demonstrate compliance with standards related to electronically stored information. If steps to ensure compliance are not articulated in a data governance policy and then followed, compliance issues can arise.
- Potential for financial and reputational damages. If organizations don't have a clear idea of the lifecycle for different types of information and the systems where information resides, the risk of data breaches and theft increases. That can result in fines, as well as reputational damage.
- Unclear retention policies for different types of information. Storing big data indefinitely can quickly become a costly matter. A robust data governance policy should define how long different types of data are retained.
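To make the growth figure cited above concrete, here is a minimal sketch of the arithmetic, assuming data volume doubles steadily every 18 months. The function name and the 100 TB starting volume are illustrative assumptions, not figures from any particular organization.

```python
def projected_volume(initial_tb: float, months: float,
                     doubling_months: float = 18.0) -> float:
    """Project data volume, assuming steady exponential doubling.

    The 18-month default mirrors the average quoted in the text; real
    growth rates vary and should be measured per organization.
    """
    return initial_tb * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Hypothetical starting point: 100 TB today.
    for years in (1, 3, 5):
        tb = projected_volume(100.0, years * 12)
        print(f"After {years} year(s): about {tb:,.0f} TB")
```

Under this assumption, volume quadruples in three years, which is why storage cost and retention decisions tend to become pressing sooner than organizations expect.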
Harnessing Big Data's Potential Through Data Governance
Big data, when used wisely, can deliver tremendous value to organizations. The importance of data governance in this equation is gaining visibility. A recent report from the Institute for Health Technology Transformation, for example, indicated that a standardized format for data governance is essential for healthcare organizations to leverage the power of big data [1]. The authors indicate that the first and most critical priority is to develop a carefully structured framework for enterprise data governance.
Whether you are developing a policy from scratch or enhancing an existing one, here are four ways to strengthen your data governance model:
1. Develop a data governance strategy. This should be consistent with the overarching business strategy and should include guiding principles for how big data will be governed. This means deciding who owns different types of information, who can access it, and how data is used. Key issues to consider include data quality, regulatory requirements, security and privacy, and information lifecycle management.
2. Use a cross-functional approach. This is particularly important for compliance purposes. Data and information systems often touch many different departments and no one individual has a complete view. A cross-functional team is best positioned to develop a holistic view of the organization's big data, including controls, documentation and auditable proof of compliance.
3. Make decisions about data-related end-of-life issues. All aspects of the data lifecycle are relevant when addressing data governance, but end-of-life issues shouldn't be overlooked. One standard retention schedule won't fit all needs. Different types of data will have different requirements for retention periods. Organizations may elect to archive data in order to enhance application performance.
4. Consider how technology can support data governance efforts. With big data, organizations must estimate how quickly data volumes will grow, as well as how costly it will be to store information. Data governance policies should define when information is moved to archiving systems that offer less expensive forms of storage, while maintaining easy access for end users and taking performance loads off other applications.
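The retention and archiving decisions described in steps 3 and 4 could be expressed as a simple, auditable rule table. The sketch below is one possible shape for such a policy, not a prescribed implementation; the data types, the retention periods, and the names `RetentionRule`, `SCHEDULE`, and `lifecycle_action` are all hypothetical. Real periods would come from legal, regulatory, and business requirements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    archive_after_days: int  # move to cheaper archive storage after this age
    delete_after_days: int   # end of life: dispose of the data after this age


# Hypothetical schedule: one rule per data type, since a single
# retention period will not fit every kind of information.
SCHEDULE = {
    "financial_record": RetentionRule(archive_after_days=365,
                                      delete_after_days=7 * 365),
    "web_log": RetentionRule(archive_after_days=30,
                             delete_after_days=180),
}


def lifecycle_action(data_type: str, created: date, today: date) -> str:
    """Return 'keep', 'archive', 'delete', or 'review' for a data item."""
    rule = SCHEDULE.get(data_type)
    if rule is None:
        # No rule defined yet: flag the item for the governance team.
        return "review"
    age_days = (today - created).days
    if age_days >= rule.delete_after_days:
        return "delete"
    if age_days >= rule.archive_after_days:
        return "archive"
    return "keep"


if __name__ == "__main__":
    print(lifecycle_action("web_log", date(2024, 1, 1), date(2024, 3, 1)))
```

Keeping the schedule in one place makes it easier for a cross-functional team to review it, and returning "review" for unknown data types surfaces information that has no defined owner or lifecycle.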
Big data has great potential for helping organizations do business better, but simply having the data is not enough. To derive the greatest value and minimize risks, a data governance infrastructure is essential.