

Rising to the Real-World Challenges of ECM

Much has been written about the promise of enterprise content management (ECM). Initial efforts to implement ECM systems have already demonstrated the potential for significant benefits. Organizations can achieve competitive advantage by making faster, more informed decisions. They are recognizing the potential for improving processes and increasing efficiency, and many companies are looking to ECM to help them better comply with Sarbanes-Oxley regulations.

As with any area of strategic promise, however, ECM was bound to experience growing pains as implementations met with tactical real-world limitations. As noted by KMWorld editorial director Andy Moore in the 2004 edition of this white paper, ECM is at an awkward stage. Much of this awkwardness stems from the sheer complexity of managing content in modern, large-scale organizations with complex operations. More specifically, ECM is rising out of the convergence of markets for tools designed around simplified world models. These tools include electronic document management (EDM), records management (RM) and business process management (BPM) technologies, each of which limits ECM's ability to address the variety of content encountered in the enterprise environment.

The good news is that today's ECM technologies are overcoming these limitations. While new security and administration capabilities address the "enterprise" portion of the ECM equation, exciting developments are improving the "content management" side as well. Organizations now have an opportunity to further improve the returns on their ECM efforts, thanks to content management capabilities that address key real-world challenges:

Challenge #1: Content Comes in a Wide Variety of Forms from Multiple Sources

One of the most important issues that must be addressed in the next wave of ECM implementations is that of flexibility. Because of their genesis, current ECM systems tend to be based on relatively simplistic world models. For example, there may be an underlying assumption that all content is managed internally to an organization, from creation to destruction. Although this can be a workable assumption in some cases, it does not represent the actual situation in most large organizations. Managing internally generated material as "content" and externally generated material as "something else" is a non-starter in many situations. Content that arrives from outside often is both high-volume and high-importance, and comes in a wide variety of forms: letters, e-mails, invoices, surveys, etc. The rigid formatting, version control and even vocabulary control expected by many ECM systems are completely inappropriate for such material. Attempting to pigeonhole information into rigid, predefined structures can be a major impediment to a company's ability to respond to change.

Efficient processing of these kinds of content requires tools that automatically examine each item and make accurate decisions regarding subject matter, emphasis and even sentiment. In the real world, businesses need to continually deal with information on a content-driven basis, not according to preconceived notions of how that information should be organized. Fortunately, powerful and proven technologies are now enabling much greater flexibility in the next generation of ECM systems. These tools can quickly determine the conceptual content of arbitrary documents with an accuracy rivaling that of humans. They are even capable of determining sentiment: does a given e-mail reflect a customer who is mildly displeased or one who is ready to sue? These tools can provide an answer.

Recent advances allow information to be dealt with in a totally concept-driven fashion. It is possible today to take arbitrary input such as letters, e-mails and survey responses and, in a completely automated process, detect major themes and sub-themes and organize them accordingly. Automatically generated, understandable labels can be applied to these themes and sub-themes. Such on-the-fly taxonomies can be generated at the rate of millions of documents per day. This frees organizations from the myriad limitations placed on them by rigid, predefined structures such as static taxonomies.
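
To make the idea of an on-the-fly taxonomy concrete, here is a minimal sketch in Python. It is not the technology described in this paper: it substitutes a much simpler technique, greedy single-pass grouping of word-count vectors, and labels each theme with its most frequent terms. The function names, the stopword list, the similarity threshold and the sample messages are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"for", "my", "not", "want", "the", "a"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_into_themes(docs, threshold=0.3):
    """Assign each document to the most similar existing theme, or start a new one."""
    themes = []  # each theme: {"centroid": Counter, "docs": [str]}
    for doc in docs:
        vec = Counter(tokenize(doc))
        best = max(themes, key=lambda t: cosine(vec, t["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["centroid"].update(vec)  # fold the document into the theme
            best["docs"].append(doc)
        else:
            themes.append({"centroid": Counter(vec), "docs": [doc]})
    return themes

def label(theme, n_terms=2):
    """Build a readable label from the theme's most frequent terms (ties broken alphabetically)."""
    ranked = sorted(theme["centroid"].items(), key=lambda kv: (-kv[1], kv[0]))
    return " / ".join(term for term, _ in ranked[:n_terms])

# Made-up customer messages, no predefined categories.
messages = [
    "refund for damaged laptop delivery",
    "laptop arrived damaged want refund",
    "password reset link not working",
    "cannot reset my password link expired",
]
themes = group_into_themes(messages)
labels = [label(t) for t in themes]
```

No category list is supplied in advance; the groups and their labels come from the messages themselves. A word-count approach like this one still depends on shared vocabulary, which is the limitation that concept-based tools are meant to remove.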

Challenge #2: Content Contains Errors that Can Hinder the Effectiveness of ECM

One of the most common shortcomings of many new information-processing technologies is that they ignore the effects of errors in the material being processed. In the world of ad-hoc and externally generated content, errors are a significant factor. For some ECM systems, a significant portion of content is processed through optical character recognition (OCR) equipment. Unfortunately, OCR technology generates a non-trivial number of errors, and correcting such errors is very labor-intensive. E-mail also is replete with errors, including misspellings and text that is ungrammatical and often not in the form of complete sentences. Existing ECM systems tend to rely heavily on keyword-based text processing capabilities that are adversely affected by spelling errors. Although vendors incorporate workarounds such as wildcards and fuzzy searches, these introduce varying amounts of irrelevant information into the affected business processes. Systems reliant on linguistic processing can yield erroneous results when dealing with ungrammatical source material. Today, tools exist that can deal effectively with content that contains a significant number of errors in spelling and grammar. Content containing misspelled words can be dealt with in a manner that does not generate extraneous results. Content can be processed directly at a conceptual level, without requiring laborious and error-prone linguistic processing.
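
The effect of spelling errors on keyword matching can be shown with a short Python sketch. It compares an exact keyword test against a similarity score computed over overlapping three-character sequences (character trigrams). This is a stand-in technique chosen for brevity, not the concept-level processing described above, and the function names and sample strings are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, padding with spaces at both ends."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram Counters."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity(text_a, text_b):
    """Similarity of two texts based on their character trigrams."""
    return cosine(char_ngrams(text_a), char_ngrams(text_b))

clean = "payment overdue on invoice"
ocr_noisy = "paymemt overdue on invoce"   # made-up OCR-style errors
unrelated = "holiday schedule for staff"

# An exact keyword lookup misses the damaged text entirely.
keyword_hit = "invoice" in ocr_noisy.split()

# Trigram similarity still ranks the damaged text far above unrelated text.
noisy_score = similarity(clean, ocr_noisy)
unrelated_score = similarity(clean, unrelated)
```

Here `keyword_hit` is false even though the noisy text is plainly about the same invoice, while `noisy_score` stays well above `unrelated_score`. The comparison degrades gradually with each error instead of failing outright, and it does so without wildcard patterns that would also pull in unrelated words.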

Challenge #3: Content Comes in Many Different Languages

The United States accounts for only approximately one-third of the world economy, and less than one-third of Internet users worldwide speak English as a native language. For many businesses, the ability to deal with information in multiple languages is a necessity. While many ECM vendors have incorporated capabilities for multilingual document processing, multilingual processing (i.e., the ability to process content in more than one language) in and of itself is not sufficient for most global organizations. What really is needed is cross-lingual processing: the ability for a user working in one language to automatically carry out search, categorization and analysis activities involving content in other languages. There have been attempts to address this need using machine translation software. However, the state of the art in machine translation is simply not adequate to support most business objectives. A new class of cross-lingual information processing tools enables much-improved multilingual content applications. For example, users can create a query in one language that can search content in multiple languages without requiring translation of either the queries or all of the content. They can create exemplars in one language that can be used to accurately categorize content in other languages. They can immediately understand the context of an unusual foreign word or acronym and can organize and prioritize multilingual content in order to re-purpose it. The tools that make cross-lingual information processing possible employ conceptual representations of content that work directly in the native languages. This avoids the inaccuracies and inconsistencies induced by contemporary machine-translation systems.
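
One way to picture a query in one language retrieving content in another, with no translation step at query time, is sketched below in Python. Each word is represented by the set of paired training documents it appears in, so words from either language that occur in the same pairs end up with similar vectors. This is a deliberate simplification: it skips the dimensionality reduction used by approaches such as Latent Semantic Indexing, and the tiny English/Spanish corpus, function names and sample documents are assumptions made for illustration.

```python
from math import sqrt

# Hypothetical training data: each pair holds the same content in two languages.
PARALLEL_CORPUS = [
    ("invoice payment overdue", "factura pago vencido"),
    ("contract law dispute", "contrato ley disputa"),
    ("delivery delay", "entrega retraso"),
]

def build_term_vectors(pairs):
    """Map every word, in either language, to its occurrence counts per training pair."""
    vectors = {}
    for i, (english, spanish) in enumerate(pairs):
        for word in (english + " " + spanish).split():
            vectors.setdefault(word, [0.0] * len(pairs))[i] += 1.0
    return vectors

def embed(text, vectors, dims):
    """Sum the vectors of all known words; unknown words are ignored."""
    total = [0.0] * dims
    for word in text.lower().split():
        for i, value in enumerate(vectors.get(word, ())):
            total[i] += value
    return total

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query, documents, vectors, dims):
    """Rank documents by similarity to the query in the shared vector space."""
    q = embed(query, vectors, dims)
    return sorted(documents, key=lambda d: cosine(q, embed(d, vectors, dims)), reverse=True)

TERM_VECTORS = build_term_vectors(PARALLEL_CORPUS)
spanish_docs = ["disputa sobre el contrato", "pago de la factura vencido"]

# English query, Spanish documents, no translation of either.
ranked = search("overdue invoice", spanish_docs, TERM_VECTORS, len(PARALLEL_CORPUS))
```

The English query ranks the Spanish payment document first because both map to the same training pair, not because any word was translated. A production system would need far more training material and a reduced concept space to handle words that never appeared in the paired documents.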

Challenge #4: Organizations Must Balance Security with the Need to Share Information

Although most discussions of content security center on access to particular items, access control is only one part of the information-sharing story. Legitimate privacy, legal, regulatory or national security considerations often preclude providing some users with direct access to specific content. Implementing only an access-control scheme will limit organizational effectiveness in such environments. Recent mechanisms of information sharing exploit the ability to automatically determine the conceptual nature of content and to holistically extract relationship information from collections of documents. These capabilities allow the use of restricted content in new and exciting ways. For example, restricted content can be used in a background mode to greatly improve user efficiency and satisfaction in dealing with non-restricted content. This is a win-win situation: the restricted content is completely protected while, at the same time, the implicit and high-order relationship information contained in this content can be leveraged in business processes.

Challenge #5: Users Do Not Have Time to Organize Content or Create Vocabularies

Perhaps the most important metric for evaluating an information system is user acceptance. The nature of the user experience is critical to the future of ECM systems. Users do not want to carry out time-consuming tasks to overcome deficiencies in the underlying world models of ECM systems, and they do not want to deal with rigid structures defined by others. Subject-matter experts do not want to craft and maintain taxonomies, ontologies and other auxiliary structures needed to propel system performance. No one wants to use controlled vocabularies in order to exchange information. Most users have neither the time nor the inclination to manually create elaborate metadata headers and complex markup for content. Thankfully, the need for users to spend large amounts of their time accommodating software limitations is rapidly passing. Software capable of dealing with content on a conceptual basis eliminates the need for excessive manual intervention in the most irksome of activities. Such software allows a much higher degree of automation of routine content-processing functions. Users can then focus on value-added activities that exploit the unique pattern recognition, abstraction and planning capabilities of the human mind.

Mature Solutions Are Now Delivering on the Promise of ECM

As noted by Forrester Research VP Connie Moore, the "administrative" aspects of content management are being assimilated onto operating infrastructures. The basic functions of access, administration and security are increasingly being addressed within operating systems and storage architectures. This will greatly facilitate the advent of a new generation of ECM implementations capable of addressing the problems of modern enterprises. Combined with the advent of concept-based tools that account for the complexities of content management, the situation is extremely promising. The pioneers in the field have established the basic business case for ECM. Incorporation of modern tools will allow implementation of ECM systems capable of addressing real-world problems in even the largest organizations. The highly developed state of such tools will speed the progression of ECM to an elegant, mature solution that plays a crucial role in enabling key processes across the enterprise.


Content Analyst technology delivers immediate value anywhere people are required to find relevant and actionable data within massive amounts of unstructured information. Based on a highly refined and extended application of our patented Latent Semantic Indexing (LSI) technology, Content Analyst technology easily integrates into current architectures without replacing existing tools, giving organizations the ability to organize, access and share information across multiple languages without the need for extensive human intervention. To learn more, visit Content Analyst Company, LLC or contact us at info@contentanalyst.com or 1-800-863-0156.
