Information governance: Managing complexity
Driven by increasing data volume, the need for accurate data to support business processes, and more stringent regulatory pressure from the General Data Protection Regulation (GDPR) and other privacy-based requirements, information governance is a booming market. According to Mordor Intelligence, the global market for information governance will grow by more than 20% per year over the next 5 years, reaching $3.62 billion by 2023. Organizations have been working diligently to structure their governance programs and take on the challenging task of balancing the need for formalized models with the mandate to be flexible and agile, in order to adapt to changing business circumstances.
Cisco is the leading supplier of networking technologies, with about $50 billion in annual revenue coming from sales of networking security, cloud, virtualization, routers, switches, and other networking devices. Setting an ambitious goal of providing a single source of truth for marketing, sales, and products that covered all the entities with which the company interacted, Cisco wanted to improve on its existing data warehouse by establishing a formalized modeling process.
“Our data landscape is spread across many systems that are not designed to communicate with each other, but which together compose the data truth regarding Cisco’s business,” said Jack Bieg, data modeler for the Teradata enterprise data warehouse at Cisco. Bieg’s role is primarily logical modeling: subscriber requirements for business intelligence reporting come to him, and he designs the data structures that the reports will consume. The output of the logical model is a physical model and a dimensional model for Cisco’s Enterprise Data Warehouse.
The first step is to see if that particular requirement already exists in the data model, whether it is product information, customer information, or some other element. “We have about 5,000 entities in the model,” he continued, “with more than 37,000 attributes.” To manage the logical model, he uses ER/Studio Data Architect, part of the ER/Studio Enterprise Team Edition from IDERA Software, a modeling tool that enables users to build out a model, document attributes, definitions, and relationships, and build the foundation for a data governance program.
“To develop and maintain a model of our size and complexity, you have to go through a surgical process that is extremely easy to do incorrectly,” Bieg explained. “My predecessor was extraordinary in bringing together strong analytical skills, assembling a design based on forensic clues, and then creating a single picture that is consumable by multiple sources.” The ongoing maintenance and modification of the data warehouse requires continued discipline and formality to manage new data and enforce consistency.
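The first step described above, checking whether a requested element already exists in the model before adding anything new, can be illustrated with a small sketch. This is not Cisco's process or ER/Studio's API. The `Entity` class, the `normalize` and `find_existing` helpers, and the sample entities are all hypothetical, and the model is assumed to be a plain in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical logical-model entity with attribute definitions."""
    name: str
    definition: str = ""
    # Maps attribute name -> business definition.
    attributes: dict[str, str] = field(default_factory=dict)


def normalize(name: str) -> str:
    """Lower-case a name and unify separators so 'Customer_ID' matches 'customer id'."""
    return " ".join(name.replace("_", " ").replace("-", " ").lower().split())


def find_existing(model: list[Entity], requested: str) -> list[tuple[str, str]]:
    """Return (entity, attribute) pairs whose attribute name matches the request."""
    wanted = normalize(requested)
    return [
        (entity.name, attr)
        for entity in model
        for attr in entity.attributes
        if normalize(attr) == wanted
    ]


# Illustrative sample data only; not taken from any real model.
model = [
    Entity("Customer", "A party that buys products.",
           {"Customer Name": "Legal name of the customer."}),
    Entity("Product", "An item offered for sale.",
           {"Product Family": "Grouping of related products."}),
]

print(find_existing(model, "customer_name"))    # [('Customer', 'Customer Name')]
print(find_existing(model, "Ship To Address"))  # []
```

An empty result suggests the requirement is new to the model. A non-empty one points the modeler to the existing attribute and its recorded definition. Exact matching after normalization is a deliberate simplification. A model with thousands of entities would likely also need synonym handling or fuzzy search.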
“ER/Studio has three main benefits for us from a logical modeling perspective,” said Bieg. “First, it provides a visual representation of our model so that anyone can see how entities are related to other entities. Second, business-meaningful entity and attribute names can be created with no character limit so they are easy to read and understand. And, lastly, business definitions at the entity and attribute level can be captured so that consumers of the information know what each attribute means in business terms.” In addition, ER/Studio allows for automating the modeling process so it does not need to be done manually. This is a timesaver when many attributes are being added or updated. Bieg has written about 80 macros for such tasks.
Bieg emphasized the importance of balancing agility with formality. “The challenge of having a formalized modeling process is the ‘formalized’ part. The idea of agile is alluring, but it is difficult to deliver with the benefits that come from a formalized process.” For example, unstructured data lakes such as those supported by Hadoop are appealing, but “unless you have someone in the room to ask questions and document what the data means, such as what an acronym represents, it is easy to build a system where data meaning resides in tribal knowledge. ER/Studio is the tool we use to capture that knowledge in a persistent way,” Bieg observed.
“Information governance is a business problem that needs to be solved through technical solutions, but to truly handle governance, organizations need a comprehensive enterprise architecture,” said Ron Huizenga, senior product manager for enterprise architecture and modeling at IDERA. “The foundation must be a robust data architecture, which supports three pillars. The central pillar is business architecture, which is flanked by application and technical architectures. This balanced enterprise architecture is essential for data governance. Integrated models and metadata provide the context of how data is used in organizations, tied to the business processes driving that usage.”
ER/Studio Enterprise Team Edition also includes Business Architect, which helps organizations model and design their business processes. “This product presents a visualized sequence of processes such as customer journeys that might start out with a customer visit to the website, order placement, and then be followed by locating the address for shipping, and so on,” noted Kim Brushaber, senior product manager for ER/Studio Business Architect. “The data modeler tool (ER/Studio Data Architect) then lays it out in a logical pattern to allow the right data to be consumed at each step.”