Mining the COLD warehouse for business intelligence
Computer documents represent the primary corporate memory in today's environment. They are the current and historical reference to internal corporate activity as well as the primary method of communicating with customers. COLD (computer output to laser disc), or enterprise reporting as it is beginning to be called, has become the standard for the electronic storage, retrieval and on-demand printing of internal reports and outbound customer documents.
Although many industry observers believe that online record-oriented systems will soon obviate the need for computer output reports, others believe that the document will remain the output standard. That argument cannot be resolved here, but it is a fact that documents will continue to be required for legal and accounting purposes. Online databases do not substitute for the "point-in-time" legal and accounting record of an individual transaction. Mandated life cycles for customer documents such as phone bills, utility bills, invoices and bank statements continue to exist, often in excess of seven years.
Documents, however, have been of little value when a user wants a different view of the information contained in page format. That has been the flaw in document-oriented output: it is frozen data. There has been no easy way to analyze the data to obtain other business intelligence contained in the document. That has changed, though; new software tools let heretofore static print output be repositioned for data mining or, as it is known by COLD vendors, "report mining" applications. The COLD document warehouse has new and significant information value.
Report mining: an alternative for data analysis
The concept of report mining uses an existing COLD document repository as the information database. Consider the following:
- COLD systems store documents in the same page format in which they were originally created. That means that the data is in known locations, in specific rows and columns on a report.
- Documents are a compilation of the database at a point in time; the data to be analyzed exists on the application output document.
- Report output provides data that is "clean": it does not contain the inconsistencies that exist in the core database.
- COLD systems store data for long periods of time, thus providing a rich warehouse of history.
Report mining (searching, locating and extracting specific information) can be accomplished from either internal reports or outbound customer documents, thereby leveraging the known row/column document (data) structure. Different output can be searched independently and the results pooled to create the necessary database of information.
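The idea of reading data from known rows and columns can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the column offsets in LAYOUT, the sample page, and the helper names report_line and extract_rows are hypothetical and do not describe any particular COLD product.

```python
# Hypothetical layout: field name -> (start, end) character offsets on a line.
LAYOUT = {"date": (0, 10), "description": (10, 34), "amount": (34, 46)}


def report_line(date, description, amount):
    """Build one fixed-width detail line matching LAYOUT (sample data only)."""
    return f"{date:<10}{description:<24}{amount:>12}"


# A made-up report page: two header lines, two detail lines, a footer.
PAGE = "\n".join([
    "EXAMPLE BANK  MONTHLY STATEMENT",
    "DATE      DESCRIPTION                   AMOUNT",
    report_line("1997-03-02", "PAYROLL DEPOSIT", "1,250.00"),
    report_line("1997-03-09", "ATM WITHDRAWAL", "-100.00"),
    "",
    "PAGE 1",
])


def extract_rows(page_text, layout, first_row, last_row):
    """Slice the lines in [first_row, last_row) into named fields by position."""
    rows = []
    for line in page_text.splitlines()[first_row:last_row]:
        if not line.strip():
            continue  # skip blank lines inside the detail area
        rows.append({
            name: line[start:end].strip()
            for name, (start, end) in layout.items()
        })
    return rows


rows = extract_rows(PAGE, LAYOUT, first_row=2, last_row=4)
for row in rows:
    print(row)
```

Because the extraction is driven only by row range and column offsets, the header and footer lines never enter the result, and rows taken from different reports could be pooled into one list as long as each report has its own layout definition.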
The document is the database
A bank statement, for example, contains all the relevant information about a customer's transactions for a given period (normally a month). Because the data is located in a known position on the statement, a search limited to specific rows and columns returns only the relevant data; unlike a standard text/string search, it ignores data unrelated to the query. The "deposit" column of a bank statement can be searched, for instance, while the "withdrawal" and "account balance" columns are ignored. Because a key benefit of a COLD system is history, the search can cover an extended period: two or three years or even longer. Analyzing historical customer activity for trends becomes an important added value of the COLD system. A number of major financial institutions have installed COLD systems with the key objective of using the statement database to search and analyze customer activity over extended periods. The COLD customer service system does double duty by becoming the data warehouse, which can be analyzed with a wide range of data mining software tools. That ability to reuse documents further enhances the strategy of the document as the dominant record in the organization.
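A column-restricted search over several statement periods might look like the following sketch. The column offsets, the sample statements and the function names (make_line, column_values, deposits_by_period) are all hypothetical, chosen only to show how a search confined to the "deposit" column differs from a plain text search.

```python
from decimal import Decimal

# Hypothetical column offsets for the detail lines of a statement.
COLUMNS = {"deposit": (12, 24), "withdrawal": (24, 36), "balance": (36, 48)}


def make_line(date, deposit="", withdrawal="", balance=""):
    """Build one fixed-width statement line (sample data only)."""
    return f"{date:<12}{deposit:>12}{withdrawal:>12}{balance:>12}"


# Made-up statements keyed by period; a real archive would span years.
STATEMENTS = {
    "1997-01": [
        make_line("1997-01-05", deposit="500.00", balance="1,500.00"),
        make_line("1997-01-20", withdrawal="500.00", balance="1,000.00"),
    ],
    "1997-02": [
        make_line("1997-02-03", deposit="750.00", balance="1,750.00"),
    ],
}


def column_values(lines, column):
    """Yield the numeric values found in one column, ignoring all others."""
    start, end = COLUMNS[column]
    for line in lines:
        cell = line[start:end].strip()
        if cell:
            yield Decimal(cell.replace(",", ""))


def deposits_by_period(statements):
    """Total the deposit column for each statement period."""
    return {
        period: sum(column_values(lines, "deposit"), Decimal("0"))
        for period, lines in statements.items()
    }


totals = deposits_by_period(STATEMENTS)
# A plain text search for the same figure also matches other columns.
text_hits = [l for lines in STATEMENTS.values() for l in lines if "500.00" in l]
print(totals)
print(len(text_hits), "lines match a plain text search for 500.00")
```

In the sample data the January withdrawal has the same amount as the January deposit, so a text search cannot tell them apart, while the column search counts only the deposit.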
Reordering, analyzing, manipulating and creating a new view of data on the fly has become a powerful COLD system byproduct. Report mining requires no new database, no re-engineering, no additional human resources and no significant additional cost. Because report mining software leverages the existing report infrastructure and information delivery system, it is simply "added value" to a COLD system.
As an example, take the case of a credit card product manager who is asked to report on all customers in San Francisco's 94133 zip code who generated more than $5,000 per month in charge card revenue during 1997. That type of information is normally generated from a data warehouse. It can also be extracted from monthly statements stored in a COLD system using report mining software.
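The credit card query can be sketched as a filter over records already extracted from monthly statements. The record fields, customer identifiers and amounts below are invented for illustration, and the sketch assumes one reading of the request: a customer qualifies only if charges exceeded the threshold in all 12 months of the year.

```python
from collections import defaultdict
from decimal import Decimal


def monthly(customer, zip_code, amounts, year=1997):
    """Build one extracted record per month (sample data only)."""
    return [
        {"customer": customer, "zip": zip_code, "year": year,
         "month": month, "charges": Decimal(amount)}
        for month, amount in enumerate(amounts, start=1)
    ]


# Hypothetical records, as if pulled from 12 monthly statements per customer.
RECORDS = (
    monthly("C-001", "94133", ["6000.00"] * 12)
    + monthly("C-002", "94133", ["6000.00"] * 11 + ["4000.00"])
    + monthly("C-003", "94110", ["9000.00"] * 12)
)


def high_revenue_customers(records, zip_code, year, threshold):
    """Return customers in zip_code over the threshold in every month of year."""
    months_over = defaultdict(set)
    for rec in records:
        if rec["zip"] != zip_code or rec["year"] != year:
            continue
        if rec["charges"] > threshold:
            months_over[rec["customer"]].add(rec["month"])
    # Assumption: "per month" means all 12 months must exceed the threshold.
    return sorted(c for c, months in months_over.items() if len(months) == 12)


result = high_revenue_customers(RECORDS, "94133", 1997, Decimal("5000"))
print(result)
```

Under this reading, only the first sample customer qualifies: the second falls short in one month and the third is in a different zip code. A different reading, such as a monthly average above the threshold, would need only a change to the final test.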
Three facts make it worthwhile to consider a report mining strategy:
- Report mining is essentially a free byproduct of the COLD system.
- Data warehousing and data mining are extremely expensive with long implementation times.
- Data quality is one of the major cost elements of data mining; output reports represent high-quality database information.
Report mining opens the door for a new use of computer output. Other examples include:
- searching ATM (automated teller machine) historical transaction reports to ascertain customer trends and use of specific ATM features.
- searching brokerage statements to provide a summary of purchases of a specific stock by a given client over any time period, thereby providing a personalized service.
- tracking loans from monthly statements by geographic location to determine areas where there is significant loan exposure or where loan activity is low.
- extracting customer data from statements with late payments for matching with credit scoring history to determine the accuracy of the initial screening process.
- performing product sales analysis on invoices.
Leveraging the COLD system to provide low-cost and high-value information may even prove to be a more significant benefit than that of the original application for which it was intended.
Full-featured report mining software will allow the user to:
- search across multiple reports or applications,
- search across an unlimited time period,
- create a new consolidated database from multiple reports,
- create and save templates of standard searches for repeated use,
- sort and rearrange data in any desired manner or format,
- create new totals and subtotals from columnar data,
- export data to any commercially available spreadsheet or database,
- create charts of the results of the new summary data.
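Two of the capabilities listed above, creating subtotals from columnar data and exporting the result for a spreadsheet, can be sketched with the Python standard library. The invoice rows, field names and the helper functions subtotals and to_csv are hypothetical; CSV is used here only as one format that spreadsheets and databases commonly import.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows, as if extracted from the detail lines of invoices.
INVOICE_ROWS = [
    {"product": "West", "amount": "75.00"},
    {"product": "East", "amount": "100.50"},
    {"product": "East", "amount": "20.25"},
]


def subtotals(rows, key, value):
    """Sum the value column for each distinct key, sorted by key."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at zero
    for row in rows:
        totals[row[key]] += Decimal(row[value])
    return dict(sorted(totals.items()))


def to_csv(totals, key_name, value_name):
    """Render the subtotals as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([key_name, value_name])
    for key, total in totals.items():
        writer.writerow([key, total])
    return buffer.getvalue()


product_totals = subtotals(INVOICE_ROWS, "product", "amount")
csv_text = to_csv(product_totals, "product", "total")
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file to csv.writer instead would produce a file that a spreadsheet could open directly.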
Some versions of report mining have been integrated with online analytical processing (OLAP) software. That allows columns or rows of data to be migrated from the COLD document warehouse to high-performance data mining software for even more sophisticated applications. A COLD system, in fact, can be implemented quickly and easily for a wide variety of enterprise data mining applications. A COLD/enterprise reporting system is not necessarily a replacement for a data warehouse, but it may fulfill many of the expectations of a data warehouse at a fraction of the cost and with minimal implementation time.