

Compliance and Content Governance

Two major events in the last several years fundamentally changed the value we place on the word "protect." The first was 9/11, which showed that our ability to detect patterns in seemingly random information and turn them into concrete decisions and actionable operations must improve dramatically. The second was the series of corporate scandals (e.g., Enron), which showed that the corporate entity needs better control of its operational governance and transparency to protect its assets, reputation and intellectual property. The phrase "an ounce of prevention is worth a pound of cure" rings loudly in this context. While it is certainly valuable to extract meaning and actionable responses from complex and broad requests for information, the real value comes when the system becomes predictive. To protect is to prevent.

If financial investment is any indicator, solving these problems is a major focus. A rough estimate puts the regulatory and compliance market at around $6 billion in 2006. A study by the Enterprise Storage Group (ESG) found that more than 10,000 laws and regulations in the United States alone, drafted by federal and state legislative bodies, affect business. Notably, these regulations directly address the management of information, including "records," which can be transactions, documents, files, videos, images and sound recordings: almost any form of information possible. Another study noted that nearly 90% of US corporations are engaged in some type of litigation, costing an average of $8 million annually. Furthermore, the average company balances 37 lawsuits at any given time; the average $1 billion company faces 147 (Business Wire, Oct. 10, 2005, Fulbright & Jaworski LLP).

Since 9/11, the effort required to maintain regulatory compliance has grown dramatically. It began with the basic need to retain information long enough to meet legal requirements. Retention is a necessary first step, but without intelligent access to the information it is of limited use. Regulatory compliance today focuses on the intelligence of the retrieval platform, the idea being that it is better to know the answers before the test than after it. It is no longer acceptable to implement a static "process documentation" effort or an "archive everything" strategy. A more comprehensive, dynamic solution is required to monitor and track relevant information and alert the proper management for action. The goal is to minimize the lag between the moment an incident is identified and the point where it can be corrected or isolated. Naturally, if you cross over to a predictive model, the lag disappears because you are, in theory, catching the incident before it occurs.

Enforcing the Policies

Today's largest corporations employ hundreds of thousands of associates, which creates two policy challenges. First, firms need to keep their associates abreast of changing corporate policies; second, they need to ensure that these policies are being followed. The key enabling technology here is an intelligent information retrieval platform, which provides an effective medium for getting the right policy information into the right hands quickly. Workers can find best practices, documentation, forms and corporate handbook information that help them comply with policy correctly and on time. Enterprises can also track and monitor corporate activity to ensure workforce compliance.

The issue boils down to operational transparency, but this does not mean simply making all your information available to everyone. It means instead full control of all access: ensuring that the information is accurate and relevant, and that it goes to the right person at the right time. These are the policies that should direct the security of your information retrieval platform. The context of a request also matters: the same request for information may not produce the same results when it comes from two different departments. Context should be exploited at every opportunity.
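To make the idea of department-dependent results concrete, here is a minimal sketch in Python. It assumes each document carries a set of departments allowed to see it; the `Document` class, the `search` function and the sample data are all hypothetical, invented for illustration rather than taken from any particular retrieval product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    text: str
    allowed_departments: frozenset  # hypothetical access policy attached to each document


def search(query, department, documents):
    """Return titles of documents that match every query term AND are visible to the department."""
    terms = query.lower().split()
    return [
        doc.title
        for doc in documents
        if department in doc.allowed_departments
        and all(term in doc.text.lower() for term in terms)
    ]


# Invented sample content for the sketch.
DOCUMENTS = [
    Document(
        "Expense policy (staff)",
        "travel expense policy for all staff",
        frozenset({"finance", "sales"}),
    ),
    Document(
        "Expense audit findings",
        "internal audit of travel expense policy violations",
        frozenset({"finance"}),
    ),
]

finance_view = search("expense policy", "finance", DOCUMENTS)
sales_view = search("expense policy", "sales", DOCUMENTS)
```

The same query yields two titles for finance and one for sales. A production platform would enforce such policies inside the index itself; filtering a list in memory is only meant to show the principle.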

Finally, the demand to move from a reactive to a proactive response is increasing as litigation costs continue to rise. A reactive response implies that the crisis must happen first. The term "forensics" is apt because it generally implies that you wait until the body dies before starting your investigation. A proactive response is one that attempts to predict an incident before it happens: preventive medicine, if you will. The implication is that your information retrieval platform must include analytical intelligence to ferret out patterns and turn them into a repository of rules that trigger alarms warning of a possible risk.
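A repository of rules that trigger alarms can be pictured with a short Python sketch. Everything in it is an assumption made for illustration: the `Rule` class, the event fields (`action`, `records`, `hour`, `user_status`), the thresholds and the 08:00 to 18:00 business day are invented, and a real platform would derive its rules from experts or from mined patterns.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # True when an event matches the risky pattern


def is_after_hours(event):
    # Assumption for illustration: business hours run from 08:00 to 18:00.
    return not (8 <= event.get("hour", 12) < 18)


# Hypothetical rule repository; names and thresholds are invented for this sketch.
RULES = [
    Rule(
        "bulk_export_after_hours",
        lambda e: e.get("action") == "export"
        and e.get("records", 0) > 1000
        and is_after_hours(e),
    ),
    Rule(
        "access_by_departed_employee",
        lambda e: e.get("user_status") == "departed",
    ),
]


def check_event(event, rules=RULES):
    """Return the names of every rule the event triggers (an empty list means no alarm)."""
    return [rule.name for rule in rules if rule.condition(event)]


alarms = check_event(
    {"action": "export", "records": 5000, "hour": 23, "user_status": "active"}
)
```

Each incoming event is checked against every rule, so adding a newly discovered pattern means appending one entry to the repository rather than changing the monitoring code.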

Where does this intelligence come from? The most direct answer is from the experts. They have sufficient anecdotal experience to establish logical patterns (if this happens, then this, then this). The information retrieval system gives them the means to create the patterns and apply them to the broadest possible set of content. But the information that hurts the most is the information "I don't know that I don't know." For this, we rely on the intelligence of the information retrieval system to discover its own patterns in the content. A good system knows how to discover and extract specific grammatical entities from text (concepts, names, places, etc.) and then determine how often they are found together. This is "connecting the dots" to discover associations.
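The "connecting the dots" step can be sketched as a simple co-occurrence count. This Python example assumes entity extraction is already solved and stands in for it with a lookup against a fixed, hypothetical list of entity names; the names and sample sentences are invented. A real system would use a proper entity extractor and far more content.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity list standing in for a real entity extractor.
KNOWN_ENTITIES = ["Acme Corp", "J. Smith", "Zurich", "wire transfer"]


def extract_entities(text):
    """Return the known entities mentioned in the text, in sorted order."""
    lowered = text.lower()
    return sorted(e for e in KNOWN_ENTITIES if e.lower() in lowered)


def cooccurrence_counts(documents):
    """Count how many documents mention each pair of entities together."""
    counts = Counter()
    for text in documents:
        for pair in combinations(extract_entities(text), 2):
            counts[pair] += 1
    return counts


# Invented sample content for the sketch.
DOCUMENTS = [
    "J. Smith approved a wire transfer to Zurich.",
    "Acme Corp flagged a wire transfer routed through Zurich.",
    "J. Smith visited Zurich last month.",
]

counts = cooccurrence_counts(DOCUMENTS)
```

Pairs that appear together unusually often, such as ("Zurich", "wire transfer") in this toy data, are the candidate associations an analyst would then review.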

The final requirement is context. Context is used to dramatically improve the precision of an answer without compromising its quality. Context derives from four sources: the user, the social group they belong to, the application, and the information itself. The extracted entities we mentioned earlier are an example of information context. User context exploits information about the user or the user's profile, such as geographic position (where are you now?), age demographics, job title, and so on. For example, a query for "toxic" on a PDA service, where the customer is likely a teenager, is more likely looking for a song by Britney Spears than for information about Greenpeace. Social context is similar but focuses more on "groupthink." Application context uses the inherent bias of the application to affect results. If the application is about celebrities, then a request for information about Paris Hilton should return stories about the person; if the application is about lodging, it should return information about a hotel in Paris.
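Application context can be illustrated with a few lines of Python. The sketch assumes every candidate result has already been tagged with a topic; the `RESULTS` list, the topic labels and the `rank_by_context` function are hypothetical and exist only to show how one query can be ordered differently for different applications.

```python
# Hypothetical candidate results for the query "Paris Hilton", each tagged with a topic.
RESULTS = [
    {"title": "Paris Hilton attends film premiere", "topic": "celebrities"},
    {"title": "Hilton hotel in Paris: rooms and rates", "topic": "lodging"},
]


def rank_by_context(results, application_topic):
    """Place results whose topic matches the application context first.

    sorted() is stable, so results within each group keep their original order.
    """
    return sorted(results, key=lambda r: r["topic"] != application_topic)


celebrity_site = rank_by_context(RESULTS, "celebrities")
travel_site = rank_by_context(RESULTS, "lodging")
```

On the celebrity application the story about the person comes first; on the lodging application the hotel does. User and social context could be added the same way, as further signals in the sort key.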

Compliance is a tricky thing because it is episodic by nature and it dwells exactly where knowledge is hardest to come by. A disgruntled employee leaks insider information, a tape of Social Security numbers goes missing, a law is amended: all of these are incidents that require an immediate response, and an intelligent retrieval system is the technology to provide it. But wouldn't it be nice if we could predict them first and avoid them altogether?
