Managing Unstructured Information for Control, Compliance and Cost

Enterprises struggling to contain spiraling storage management costs or to comply with government regulation are looking for ways to address the challenge of legacy data: identifying and analyzing electronic information that exists in file shares (shared network drives), SharePoint and Exchange. Because this problem affects nearly every organization, savvy IT and compliance managers need to work together to implement and enforce comprehensive information governance strategies that focus on identifying information and applying policy to it for control, compliance and cost reduction benefits. In this article, we explore the opportunities and benefits of legacy data clean-up for business and compliance purposes.

Today's Reality

Most organizations today use numerous repositories to store information across the enterprise. The unprecedented growth in data volumes and formats makes it increasingly difficult and expensive to back up, discover, retrieve and reuse trusted information. As a result, the business value of unstructured data is reduced and the organization's exposure to risk grows.

Legacy data clean-up enables you to better manage the information footprint and associated storage management costs through defensible destruction of redundant, obsolete and trivial data. You can also identify data and determine whether it is sensitive or secure, minimizing the risk of accidental misuse and leakage. In short, legacy data clean-up lets you pursue business opportunities and limit business risk while decreasing back-up costs and lowering the expense of long-term storage, including storage in the cloud.

But what is legacy data? Legacy data is human-readable, unstructured data that does not have a policy assigned to it and is therefore unmanaged. Legacy data tends to be inactive and orphaned. A significant amount of the legacy data contained in file shares, SharePoint and Exchange can be considered redundant, obsolete or trivial, and of little value to the business.

Unmanaged legacy data presents significant risk to an organization. Scenarios include:

  • Sensitive information is left unprotected from data leakage and misuse. For example, a copy of a confidential business plan is sent to a business partner;
  • Information is used out of context. For example, a non-lawyer references a statute that is not applicable to the situation;
  • Decisions are made based on outdated versions of information. For example, a maintenance worker selects an out-of-date manual to support a repair project; and
  • Effort is duplicated in producing the same information again. For example, a colleague resends information because the recipient cannot locate data known to exist.

Legacy data clean-up can help you to identify and analyze the unstructured information contained within file shares and system repositories so you can take the appropriate actions to manage and maintain it throughout its lifecycle. Not all information is created equal and once identified, it should be managed according to its business value.

The Stages of Legacy Data Clean-up

Organizations that do not understand what data they possess or its value to the business cannot manage this data in an efficient manner, nor can they successfully implement policies that manage information for control, cost and compliance across the enterprise. In most cases, legacy data clean-up is the starting point for implementing a broader information governance strategy within your organization. Legacy data clean-up consists of five stages starting with the identification of information, taking appropriate action, and ultimately finishing with a complete clean-up of legacy data, including destruction and migration to appropriate repositories.

Identify and index. The first step in the legacy data clean-up process is identifying data sources and indexing the data to get a clear understanding of what exists and where. The most common repositories for legacy data are shared drives, SharePoint and Microsoft Exchange.

A robust legacy data clean-up solution enables you to index these repositories and others at the metadata or content level. Indexing metadata only (a light index) provides a fast and efficient way to identify much of the redundant, obsolete and trivial data that may be ready for defensible destruction. For example, the index provides insight into how the data has aged, and when it was created and last modified to give you a picture of the data's business relevance.
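
To make the idea of a light index concrete, here is a minimal Python sketch that walks a shared drive and records file-system metadata only, without opening any file. The function names `build_light_index` and `age_in_days` and the record fields are assumptions made for illustration; they do not describe any particular product. Creation time is left out because its meaning differs between operating systems.

```python
import os
from datetime import datetime, timezone


def build_light_index(root):
    """Walk `root` and collect metadata only; file contents are never read."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be inspected
            index.append({
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
                "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
            })
    return index


def age_in_days(record, now=None):
    """Whole days since the file was last modified (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return (now - record["modified"]).days
```

Sorting the resulting records by `age_in_days` gives a rough first picture of how the data has aged, which is the kind of insight a metadata-only index is meant to provide.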

A deeper analysis of the content, including advanced data analytics, will allow you to identify what the data is, its sensitivity and security needs, and whether it holds greater business value. In some cases, the information may satisfy the criteria of a business record and should be managed accordingly.

Analyze. A legacy data clean-up solution provides a user-friendly, graphical dashboard that highlights statistical summaries and categorization. The analysis shows breakdowns by data types and categories across the enterprise, or in specific repositories. Redundant data is identified and presented through duplicate statistics; obsolete data is identified based on policy and on creation, modification or access dates; and trivial data is identified based on file types that have no content value. The dashboard also provides summaries of the volume of data being created over time, giving you insight into a repository's lifecycle.
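
The three tests described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the extension list, the seven-year threshold, the record layout and the function name `classify_rot` are all hypothetical, duplicates are detected with a SHA-256 hash of the content, and only the last-modified date is used to judge obsolescence. A real policy would be set by the organization.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical policy values, chosen for illustration only.
TRIVIAL_EXTENSIONS = {".tmp", ".bak", ".log"}
OBSOLETE_AFTER = timedelta(days=7 * 365)


def classify_rot(records, now=None):
    """Return lists of paths flagged as redundant, obsolete or trivial.

    Each record is a dict with "path", "extension", "modified"
    (timezone-aware datetime) and "content" (bytes).
    """
    now = now or datetime.now(timezone.utc)
    result = {"redundant": [], "obsolete": [], "trivial": []}

    # Redundant: group files by content hash.
    by_digest = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec["content"]).hexdigest()
        by_digest[digest].append(rec["path"])
    for paths in by_digest.values():
        # Keep the first copy seen; every later identical copy is redundant.
        result["redundant"].extend(paths[1:])

    for rec in records:
        # Obsolete: not modified within the retention window.
        if now - rec["modified"] > OBSOLETE_AFTER:
            result["obsolete"].append(rec["path"])
        # Trivial: file type assumed to carry no content value.
        if rec["extension"] in TRIVIAL_EXTENSIONS:
            result["trivial"].append(rec["path"])
    return result
```

A file can land in more than one list, for example an old temporary file that is both obsolete and trivial, so the counts are best read as overlapping views rather than a partition.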

Organize. Once legacy data is identified and analyzed, you can organize the data by business value, context and relevance. Organizing the data requires an understanding of business requirements, policies, information categories and classification categories. Based on the indexed data, a comprehensive legacy data clean-up solution can identify information clusters with common content patterns and groupings that form the basis of categories. If you have already established data categories, you can compare these against the categories that have been uncovered through the data identification and analysis processes.
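
Comparing established categories with discovered ones can be as simple as a set comparison. The Python sketch below assumes both sides are available as plain lists of category names; the function name `compare_categories` and the three result labels are hypothetical, and the clustering that produces the discovered names is outside its scope.

```python
def compare_categories(established, discovered):
    """Compare established category names with names uncovered by analysis.

    Names are matched case-insensitively after trimming whitespace.
    """
    established_norm = {c.strip().lower() for c in established}
    discovered_norm = {c.strip().lower() for c in discovered}
    return {
        # Established categories that the data analysis also found.
        "confirmed": sorted(established_norm & discovered_norm),
        # Established categories with no matching data uncovered.
        "unused": sorted(established_norm - discovered_norm),
        # Discovered groupings that no established category covers.
        "new": sorted(discovered_norm - established_norm),
    }
```

Entries under "new" are candidates for additional categories, while entries under "unused" may point to categories that no longer reflect the data actually held.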
