
Capture: a state of change

Paper capture software and services are growing at nearly twice the rate of traditional document management systems, according to our research at Harvey Spencer Associates. The connection that used to exist between document management and capture (after all, the paper that is managed in DM systems has to be converted somewhere) does not seem to hold any more. Of course, some parts of capture--like formatted full text and, to a point, forms processing--were in a different space, but much capture was for indexing or ended up in a DM system. So why are they diverging and why is capture growing so fast?

There are two main reasons. The first is e-business, which is driving the fast growth of capture, and the second is the maturation and expansion of EDM, which is contributing to the slowdown in traditional DM revenues.


A few years ago, a scanner manufacturer presented itself as a "knowledge capture hardware supplier." We all laughed, but it turns out that maybe with some of the new software tools in use and being developed, there is some truth to it. E-transactions come into the business in a form that can be automatically interrogated, classified, prioritized, categorized and routed. Search tools let them be grouped together and found.

On the other hand, paper is not automation friendly. Paper is received in an organization both at centralized collection points and locally at remote sites. In most volume-oriented EDM or ECM environments, it is collected and sent by courier to centralized scanning sites, where it is batched, scanned and indexed. Typically that might take a couple of days, and the information captured is limited to the fields that were indexed or read by optical character recognition (OCR).

Contrast that with e-documents. They arrive, are captured and locally stored in the e-mail server where they can be quickly searched and located, and are routed. Based on the content and/or attachments, they may have keywords extracted, which can be used for indexing or management. It can all happen in seconds.
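As a rough illustration of that flow, here is a minimal Python sketch of extracting keywords from an incoming e-mail and routing it on that basis. Everything specific in it is an assumption made for the example: the stopword list, the `ROUTES` table, the queue names and the simple word-frequency approach are hypothetical and do not describe any particular product.

```python
from collections import Counter
from email import message_from_string
import re

# Illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "and", "for", "please", "your", "our", "with", "this"}

# Hypothetical routing table: keyword -> department queue.
ROUTES = {"invoice": "accounts-payable", "claim": "claims", "order": "sales"}


def extract_keywords(text, limit=5):
    """Return the most frequent non-stopword words longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(limit)]


def route(raw_message):
    """Parse a plain-text e-mail and pick a queue from its keywords."""
    msg = message_from_string(raw_message)
    # For a non-multipart message, get_payload() returns the body string.
    text = f"{msg['Subject'] or ''} {msg.get_payload()}"
    keywords = extract_keywords(text)
    for keyword in keywords:
        if keyword in ROUTES:
            return ROUTES[keyword], keywords
    return "general-inbox", keywords


raw = (
    "From: supplier@example.com\n"
    "Subject: Invoice 1042\n"
    "\n"
    "Please find the invoice for the March order attached.\n"
)
queue, keywords = route(raw)
print(queue, keywords)
```

The extracted keywords double as index terms, which is why the same pass can serve both routing and later search.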

Electronic transactions create their own dynamic--business is transacted faster and therefore customer expectations have increased. To remain competitive, a company must try to process all its transactions at electronic speeds. Consequently, companies are rapidly changing over to electronic transactions--the latest statistics from the U.S. Commerce Department show year-over-year changes ranging between 12 percent and 20 percent, with e-commerce now accounting for up to 30 percent of a segment's business.

But paper has not gone away and it probably never will. Even when the current e-transaction concerns that are associated with the legality of e-signatures, portability and security have disappeared, some transactions will remain on paper. Old technologies never die, they just fade away. Look, for example, at telex--I forgot about telex 20 years ago, but it is still being used for trade transactions involving less developed countries, because it is reliable, secure, understood and has precedent. Sound familiar? My best guess is that paper in business transactions will stabilize at about 25 percent of its peak in 10 to 15 years.

The result is increasing pressure to manage paper as effectively and quickly as electronics. The way to do that is to convert the paper as quickly as possible to an electronic rendering.

We are seeing the pressures increase--in 2004, shipments of distributed workgroup and low-end scanners increased dramatically, according to InfoTrends/CAP Ventures (capv.com), while shipments of high-end centralized scanners were stagnant. At the same time, the interest in using shared digital copiers (MFPs) for scanning has soared. The conclusion is that more users are finding a need to convert paper documents to images at the point of entry into the organization.

Maturity and expansion of EDM

Expansion of EDM has been going on for a number of years, and the industry now positions itself as ECM (enterprise content management). That effectively assumes management of all the content in a corporation whether electronic, paper or some other media. In order to implement this, the ECM companies have needed to get more and more embedded in the backend processes of the corporation, effectively sitting on top of or by the side of ERP systems. That has resulted in larger average deals and increasing revenues from services, but less growth in products and a move from their original "services" base into other verticals such as manufacturing. From a scanning and capture standpoint, it has meant less paper in the mix and lower sales.

These changes are resulting in capture becoming more ubiquitous throughout organizations whether small, midsize or large. Scanning either takes place centrally if the paper is received centrally, or it is distributed where scanner operators or ordinary office "knowledge" workers operate the machines. Those operators of distributed scanning must not only scan the papers, but also interpret them into usable data, which can be time-consuming. It can become extremely expensive, particularly if the person scanning the pages is an accountant or claims adjuster who is paid more than a clerk or scanner operator.

The shared office MFP is designed as an ad hoc scanning, printing and copying device. It is operated by occasional users who do not scan and index regularly. A number of vendors have developed solutions to address that situation, usually based on barcoded stickers placed on the documents or coded separator sheets designed to automatically index the documents. Other vendors have created simple "one-button" solutions, often by incorporating a PC and large touch screen with the MFP.

Those are reasonable solutions based on current technology--but change is about to occur. Using what we call intelligent document recognition (IDR), it is possible to automatically identify, characterize and extract key data from the image of a paper document--effectively employing some of the same techniques and background knowledge that a person uses to identify a document and extract required data.
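To make the identify-then-extract idea concrete, here is a deliberately simplified Python sketch that works on text already produced by OCR. It is not how any IDR product is implemented: the document types, cue words and regular-expression patterns in `DOC_TYPES` are hypothetical, and real systems draw on far richer layout and contextual knowledge.

```python
import re

# Hypothetical document types: cue words for identification and
# regular expressions for the fields to pull out of each type.
DOC_TYPES = {
    "invoice": {
        "cues": ["invoice", "amount due", "bill to"],
        "fields": {
            "invoice_number": r"invoice\s*(?:no\.?|number|#)\s*:?\s*(\w+)",
            "amount_due": r"amount due\s*:?\s*\$?([\d,]+\.\d{2})",
        },
    },
    "claim": {
        "cues": ["claim", "policy", "claimant"],
        "fields": {
            "policy_number": r"policy\s*(?:no\.?|number|#)\s*:?\s*([\w-]+)",
        },
    },
}


def classify(text):
    """Pick the document type whose cue words occur most often, if any."""
    lowered = text.lower()
    scores = {
        name: sum(lowered.count(cue) for cue in spec["cues"])
        for name, spec in DOC_TYPES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract(text):
    """Identify the document, then extract the fields defined for its type."""
    doc_type = classify(text)
    if doc_type is None:
        return None, {}
    fields = {}
    for name, pattern in DOC_TYPES[doc_type]["fields"].items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            fields[name] = match.group(1)
    return doc_type, fields


ocr_text = "ACME Corp\nInvoice No: 10423\nBill To: Jane Smith\nAmount Due: $1,250.00\n"
print(extract(ocr_text))
```

A document that matches no cue words comes back unidentified, which in practice is the point at which a person would step in.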

That IDR capability, combined with ad hoc scanning devices, is set to enable fast and accurate capture of paper documents as soon as they enter the corporation. Paper will never go away, but tools are appearing to convert it into e-usable data quickly and automatically. Paper documents become nearly as effective as e-transactions: they are captured and stored in the server, where they can be quickly searched and located. They are routed and, based on the content and/or attachments, may have keywords extracted, which can be used for indexing or management. Search tools let the pages be grouped together and found. It can all happen in seconds--just like e-mail. No wonder we are in a growth spurt--there are still huge amounts of paper used in business every year. It makes for a lot of capture.

(Harvey Spencer Associates is holding a seminar entitled Document Capture 2005 on Sept. 8 in Glen Cove, N.Y. For more information, visit documentcapture2005.com.)

Harvey Spencer is president of Harvey Spencer Associates, e-mail harvey@harveyspencer.com.
