Using the features and functionality that SharePoint 2010 offers for implementing governance requires an in-depth understanding of organizational content. The first step is developing a content model, which becomes the blueprint for the overall content architecture.
What does architecture have to do with governance? Everything. We need to understand content life cycles in order to assign review and approval processes. We need to differentiate various types of information to understand how that information will be disposed of or archived. Ownership of information may be organized around things like content type (in the context of a particular site or library within a site). Conducting an audit of the existing environment is an important initial activity that helps identify not only the types of content that exist, but also unique characteristics inherent to the information itself.
The content audit reviews and evaluates a representative sampling of documents. The results are typically captured in a spreadsheet, usually with columns for the following information:
- Document type—Identifies the general type of content represented by the document. Generally speaking, this document is a <blank>, where <blank> is a policy, procedure, FAQ, job aid, bulletin or notice, etc.
- Volume of content—Estimation of the total amount of content for this document type. Intended to provide a general scale of overall scope.
- Special characteristics—Notation of qualities that stand out, such as format (does it make use of a standard template or use consistent naming conventions?); metadata attributes (how is it tagged—status, owner, topics, language, sub-types, audience, reference data, etc.?); are there multiple versions and is it still relevant (when was it created or last modified)?
- Relative value—Identifies the importance of the information to the organization overall. Not all content is created equal, and those types identified as having higher value will require special methods for handling areas such as enrichment and life cycle.
- Source—Location in the source repository where the document is stored.
- Samples—Listing of sample titles of documents typically representing the type of document.
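As a rough illustration, the audit spreadsheet described above could be captured as structured rows and written out as CSV. This is only a sketch: the field names simply mirror the columns listed, while the "high/medium/low" value scale, the sample rows and the repository paths are assumptions made up for the example.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class AuditRow:
    """One row of the content audit; fields mirror the columns listed above."""
    document_type: str
    volume_of_content: int         # estimated number of documents of this type
    special_characteristics: str
    relative_value: str            # assumed scale: "high", "medium" or "low"
    source: str
    samples: str                   # sample titles, separated by semicolons


def write_audit(rows):
    """Serialize audit rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AuditRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical sample data, for illustration only.
rows = [
    AuditRow("Policy", 120, "Standard template; tagged with owner and status",
             "high", "/hr/policies", "Travel Policy; Leave Policy"),
    AuditRow("FAQ", 40, "No template; inconsistent naming",
             "medium", "/it/help", "VPN FAQ"),
]

audit_csv = write_audit(rows)
print(audit_csv)
```

Keeping the audit in a structured form like this makes it easier to sort by relative value or volume later, when deciding which document types deserve their own content type.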
Designing content types
What does the content analysis have to do with governance? This phase of the information architecture process helps identify important information types that can then be managed within SharePoint through the application of content types. A content type in SharePoint is defined as a reusable collection of settings that describe behavior, management and properties for a specific type of information, including metadata attributes, information management policies, workflow and standard templates.
While many content types are provided out of the box, they are generic in nature and in no way representative of the uniqueness of organizational information. The purpose of the audit process is to help identify an initial set of document types to consider for implementation as content types. Managing content through content type definitions often begins with the design of a foundational type that carries a core set of standard metadata attributes to be applied to all content items. The core content type becomes the parent for all other content types, which then inherit its basic properties. Additional traits are added to children, grandchildren and so on to describe the unique characteristics of each additional content type that is required. Once an initial set of content types is identified, further design detail around each is required, including:
- Metadata schema—Identifies unique attributes used to describe the inherent nature of the type of content as well as requirements for enrichment intended to improve findability through access mechanisms like search and navigation.
- Taxonomy and term store management—Identifies facets and controlled vocabularies for consistent application of terminology across the SharePoint solution, regardless of location in the hierarchy.
- Information life cycle design—Consists of modeling processes around creation, capture, management, retention, archival and/or disposition. Includes defining information management policies and standard processes for periodic review to ensure accuracy, quality and relevancy of content. Design activities are closely aligned with the development of workflow, retention and compliance mechanisms.
Creation of content types, however, can get out of hand, so instantiating new ones cannot be allowed to be an ad hoc activity. Governance processes around management of content types need to be established and must include policies and procedures for creation, modification and deletion.
Additional considerations for design include scope of use. Some content types may be required across departments, geographic regions or lines of business, while others may be localized within a specific business unit. Those identified as global require centralized management within a specialized site collection known as a content type hub. They can then be made available for use in subscribing site collections through syndication.
As mentioned before, a common approach to designing a global metadata schema begins with identification of a core set of attributes required to be applied to all content items in the SharePoint environment. Those attributes are attached to a core content type as columns and inherited by each individual content type as a base set. Unique metadata requirements are then layered on top of the core to support both management and enrichment by describing the inherent nature of the type of content. A formal process for defining what the additional attributes are begins with the audit and ends with a schema representative of each content type. Management of metadata attributes for any particular content type must then be subject to formal governance procedures around creation, modification and/or deletion.
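The layering of metadata described above, a core set of columns inherited by every child content type, can be sketched in a few lines. This is a plain illustrative model, not the SharePoint object model, and the content type and column names ("Core Document", "Review Date" and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentType:
    """Illustrative model only; this is not the SharePoint object model."""
    name: str
    parent: Optional["ContentType"] = None
    own_columns: list = field(default_factory=list)

    def columns(self):
        """Return inherited columns first, then this type's own additions."""
        inherited = self.parent.columns() if self.parent else []
        return inherited + [c for c in self.own_columns if c not in inherited]


# Hypothetical content types and column names chosen for the example.
core = ContentType("Core Document", own_columns=["Owner", "Status", "Language"])
policy = ContentType("Policy", parent=core, own_columns=["Review Date", "Approver"])
hr_policy = ContentType("HR Policy", parent=policy, own_columns=["Region"])

print(hr_policy.columns())
# ['Owner', 'Status', 'Language', 'Review Date', 'Approver', 'Region']
```

Because each type lists only what it adds, a change to the core set reaches every descendant, which is one reason changes to the core type warrant formal review.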
Taxonomy, controlled vocabulary and term store management
Taxonomy within the SharePoint environment is typically applied to content through the application of controlled vocabulary. In SharePoint 2010, that is surfaced via managed metadata, which represents a hierarchical collection of predefined and centrally managed terms that are applied by publishers as metadata attributes to content items. Terminology surfaced as part of the publishing process originates from within the term store, which provides centralized storage and management for standard vocabularies through the following constructs:
- Groups—A flat list or hierarchical collection of related attributes comprised of one or more term sets.
- Term set—A flat list or hierarchical collection of related terms that belong to a group.
- Term—A word or phrase that can be applied by publishers and system users as metadata to content.
Those constructs must themselves be designed to be flexible enough to evolve with the business over time. The evolution, however, must be both predictable and controllable. Not just anyone can be allowed to change the term store structure on a whim, because all managed metadata fields throughout the solution use it as the source for controlled vocabulary. Term store administration must be considered within the realm of information governance and must include change control that addresses three things: formal review processes important for quality assurance and consistency; standard practices for adapting to changes in the business environment; and an evaluation of the costs and benefits of proposed modifications, including the impact on the existing environment in terms of retagging content and retraining publishers and consumers. This chart (http://www.kmworld.com/downloads/73944/KMW_2_2011_chart1.pdf) provides an overview of standard roles, along with the tasks that each is able to perform.
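To make the group, term set and term hierarchy more concrete, here is a minimal sketch of how the three constructs nest. It is an illustrative data model only and does not use the SharePoint taxonomy API; the group, term set and term names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    children: list = field(default_factory=list)   # nested child terms


@dataclass
class TermSet:
    name: str
    terms: list = field(default_factory=list)      # top-level terms


@dataclass
class Group:
    name: str
    term_sets: list = field(default_factory=list)


def walk(term, path=()):
    """Yield the full path of a term and of each of its descendants."""
    current = path + (term.label,)
    yield current
    for child in term.children:
        yield from walk(child, current)


def all_paths(group):
    """List every term in a group as a (group, term set, term, ...) path."""
    return [
        (group.name, term_set.name) + path
        for term_set in group.term_sets
        for term in term_set.terms
        for path in walk(term)
    ]


# Hypothetical vocabulary, for illustration only.
regions = TermSet("Regions", [
    Term("Europe", [Term("France"), Term("Germany")]),
    Term("Asia"),
])
corporate = Group("Corporate", [regions])

for path in all_paths(corporate):
    print(" > ".join(path))
```

A flattened listing like this is one simple way to review a proposed change to the vocabulary, since every path that moves or disappears points to content that may need retagging.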