MDM and data governance: What comes first, the chicken or the egg?
It’s now a widely accepted principle that master data management (MDM) and data governance are co-dependent disciplines. In other words, full success in one is virtually unattainable without implementing the other to a significant degree. In a very real sense, data governance is the “business function” of MDM, controlling how data is created, collected, and used. As such, it should optimally reside within the business rather than with IT to achieve its full strategic value. Unless selling data is actually one of the business functions of your organization, attempting to house data governance within the technology area exposes it as potentially just another piece of IT overhead rather than the generator of business opportunities that it should be. This can also have the unintended result of creating an additional data silo in the form of the MDM hub, or possibly integrating poor quality data, which can lead to disastrous results.
Once we accept this delineation of MDM and data governance, questions naturally follow: how should these two interrelated disciplines be implemented in an enterprise, and how should they be scoped and coordinated to deliver significant business value as soon as possible? While the best practices in this area are less than intuitive and, worse, still not adopted by the majority of programs, the good news is that they keep program scopes and deliverables at a manageable level and help ensure delivery of the goals that business and IT have agreed upon.
Even before the chicken and egg
The initial best practice in the implementation of MDM is, and always will be, the creation and ratification of a specific and measurable set of agreed-upon business outcomes that result in a clear and achievable roadmap of technical and process enablement deliverables. The correlation between the successful generation of such business cases and roadmaps and transformational MDM programs simply cannot be overstated. The extrapolation of this practice that still seems to elude many enterprises is that the same principle, and in fact the same resultant artifacts, can and should be used to structure and prioritize the introduction of the data governance discipline in addition to, and concurrent with, MDM.
What most often helps get both of these efforts started in the optimal way is that the more senior business stakeholders who are needed to help formulate the desired operational and/or analytical outcomes will very likely also be needed to play significant roles within the newly created virtual data governance organization. The term “virtual” is used here in the sense that, at least in the beginning, every data governance participant will very likely still have a “day job” until and unless some significant level of business value is seen to exist in having full-time data stewards. In fact, in many midsized and smaller organizations, this group remains virtual on a long-term, if not permanent, basis.
Depending on the structure of the larger enterprise, the more senior members of this group of stakeholders often end up comprising the critical mass of the data governance council or board. They may even need to function as data owners or data stewards in the early days of the live program. As such, they must be sold on assuming any of these roles as early in this process as possible. The recommended best practice here is to start these recruitment discussions as soon as is prudent during the business case and roadmap interactions, always taking organizational politics into account. If a single executive sponsor for data governance has not yet been identified, this would be the time to do so. One pitfall to avoid during this stage is asking senior business executives, and possibly even their counterparts in IT, for their “support” of the data governance initiative. It is much better to start right out of the gate using the word “participation,” as too many executives interpret “support” to mean sending a forceful email out each quarter reminding subordinates of the importance of the effort. Data governance, and therefore MDM, will require far more care and feeding than this during the formative stages to become successful and, in many cases, transformational.
Hatching the eggs (or is it the chickens?)
Once a set of outcomes has been formulated and ratified, the next step is for IT to reverse-engineer a data model that describes only the entities and attributes required to service those outcomes or, at a minimum, to establish the initial deliverables. Once that is achieved, the next step is to map the data sources for those elements to the model. The key in coordinating MDM and data governance is to logically, and physically, separate the target master data from the non-master data in the model. This is because only the master data, which tends to be state-driven, slowly changing, widely shared, and logically repeated across multiple sources, will be managed within the MDM system. Depending on the target use cases, the non-master, event-driven data such as transactions and interactions will be housed in a third normal form data hub for operational use cases and/or a dimensional data warehouse for analytical ones. The data governance organization will be responsible for both categories of data, and quite possibly more than these two as time goes on.
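The separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity` and `Store` types, the `target_stores` function, and the sample entity names are all hypothetical and do not come from any particular MDM product.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    """Target data stores named in the text (identifiers are illustrative)."""
    MDM_HUB = "mdm_hub"
    OPERATIONAL_HUB = "third_normal_form_hub"
    WAREHOUSE = "dimensional_warehouse"


@dataclass(frozen=True)
class Entity:
    name: str
    is_master: bool  # True for state-driven, slowly changing, widely shared data


def target_stores(entity: Entity, use_cases: set[str]) -> set[Store]:
    """Route master data to the MDM hub and event-driven data by use case."""
    if entity.is_master:
        return {Store.MDM_HUB}
    stores: set[Store] = set()
    if "operational" in use_cases:
        stores.add(Store.OPERATIONAL_HUB)
    if "analytical" in use_cases:
        stores.add(Store.WAREHOUSE)
    return stores


# Hypothetical entities from a deliberately narrow, outcome-scoped model.
model = [
    Entity("customer", is_master=True),
    Entity("product", is_master=True),
    Entity("order_transaction", is_master=False),
]

for entity in model:
    stores = target_stores(entity, {"operational", "analytical"})
    print(entity.name, sorted(store.value for store in stores))
```

In practice the master/non-master flag would be a decision recorded by the governance group rather than a hard-coded boolean; the point of the sketch is only that the routing rule follows from that single classification.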
It’s at this point in the process that the business stakeholders’ participation in, and ownership of, data governance begins in earnest at all levels, supported by their IT counterparts. In most organizations, formal meetings will need to be held that have multiple objectives: resolving definitional conflicts between sources for each data element; recording the resultant technical and business definitions, lineage, and mapping/transformation rules; and creating and documenting data quality rules. The conflict resolution process (which usually focuses on master data, as this data tends to have multiple physical sources) is required when different data sources implicitly or explicitly embody different definitions for entities such as customer or product, which align to common master data domains. The objective is to consolidate down to the number of legitimately different definitions by identifying those that are really the same, and then adjust the data model to accommodate the real differences.
For example, a company that has historically behaved, from a data management perspective, as if it had a single definition for customer discovers during the mapping analysis process that it really has six different definitions, according to how the data elements in question are populated and/or used. After meeting to explore this scenario, with the senior business stakeholders on hand to adjudicate any healthy conflict, the participants realize that they actually have three distinct categories of customer, not one and not six. They agree with IT to create a subtype attribute in the customer entity to store this information in the MDM data model going forward and in the resultant metadata.
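A minimal sketch of how that outcome might be recorded follows. The source system names, the six source-level definitions, and the three subtype labels are invented for illustration; the example above only establishes that six definitions collapse into three categories stored in a subtype attribute.

```python
# Hypothetical mapping agreed in the reconciliation meetings:
# (source system, source-level definition) -> customer subtype.
SOURCE_DEFINITION_TO_SUBTYPE = {
    ("crm", "account"): "organization",
    ("billing", "payer"): "organization",
    ("ecommerce", "shopper"): "individual",
    ("support", "contact"): "individual",
    ("loyalty", "member"): "individual",
    ("partner_portal", "reseller"): "partner",
}


def customer_subtype(source: str, source_definition: str) -> str:
    """Return the agreed subtype, or fail loudly for an unreviewed definition."""
    key = (source, source_definition)
    if key not in SOURCE_DEFINITION_TO_SUBTYPE:
        # An unmapped definition is a governance question, not a default.
        raise ValueError(f"No agreed subtype for {source}/{source_definition}")
    return SOURCE_DEFINITION_TO_SUBTYPE[key]


# Six source definitions consolidate to three legitimate categories.
print(len(SOURCE_DEFINITION_TO_SUBTYPE))
print(sorted(set(SOURCE_DEFINITION_TO_SUBTYPE.values())))
```

Raising an error for an unmapped definition, rather than silently assigning a default subtype, is a design choice in this sketch: it sends new definitional conflicts back to the governance group instead of hiding them in the data.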
Some organizations acquire dedicated data governance or data catalog technology to track metadata, while others treat metadata as an additional master data domain and manage it within their MDM system. Once all conflicts have been resolved, the remaining metadata described previously can be captured, and the data quality rules created and recorded. This concludes the initial implementation of the policy creation and management phase of data governance; from here on, periodic meetings are scheduled to track progress and resolve any further issues as they arise. The MDM data model and system, as well as any ancillary data stores, may now be configured, loaded, and integrated with the various operational and analytical systems according to the desired business outcomes. Training can take place for any end-user areas whose processes have changed, and data stewards can be updated as appropriate as the monitoring and execution phase of data governance gets under way.
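The data quality rules created and recorded during this phase could be captured in a form that both documents the business rule and lets it be checked during monitoring. The sketch below assumes a hypothetical `QualityRule` structure and two invented rules; real programs would typically keep these in a data governance or data quality tool rather than in code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QualityRule:
    element: str                     # governed data element the rule applies to
    description: str                 # business-readable statement of the rule
    check: Callable[[object], bool]  # returns True when the value passes


# Hypothetical rules; the element names and conditions are illustrative only.
RULES = [
    QualityRule(
        "customer.email",
        "must contain exactly one '@'",
        lambda value: isinstance(value, str) and value.count("@") == 1,
    ),
    QualityRule(
        "customer.country",
        "must be a two-letter uppercase code",
        lambda value: isinstance(value, str) and len(value) == 2 and value.isupper(),
    ),
]


def violations(record: dict) -> list[str]:
    """List the recorded rules that a single record fails."""
    failed = []
    for rule in RULES:
        if not rule.check(record.get(rule.element)):
            failed.append(f"{rule.element}: {rule.description}")
    return failed


print(violations({"customer.email": "pat@example.com", "customer.country": "US"}))
print(violations({"customer.email": "not-an-email", "customer.country": "usa"}))
```

Keeping the description next to the check means the same record serves as metadata for the business and as an executable test for data stewards.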
Managing all entities in the chicken coop
The reconciliation process described above can certainly be arduous in many, if not most, enterprises. However, it is an essential process that reinforces the need to limit the initial scope of both the MDM and data governance initiatives to those data elements required to deliver real business value in a timely fashion. Just as damaging as the idea that “we have to master all of our customer data” is the belief that “we have to get all of our customer data under governance.” By managing the scope of both efforts to the agreed deliverables, the business can more easily own and champion those governance processes to deliver ongoing value, rather than have them languish as just more IT overhead delivering minimal, or even negative, business benefit as technical debt increases. Moreover, the business can now align both processes and analytics to a data model that supports how it actually operates, so that it can continually adjust IT’s processes as needed to move the business forward.