Finding the right stuff
Government organizations are improving their ability to mine their knowledgebases, streamline business processes, and analyze, integrate and protect their data.
By Judith Lamont
Search and retrieval capabilities have long been a key element in knowledge management. Information-laden government agencies must be able to search their knowledge storehouses or creating them will have been a futile exercise. Although in some respects a mature industry, information retrieval continues to develop and improve. Many search and retrieval tools now provide such abilities as concept searching, which eliminates the need to pick out exactly the right keywords; natural language searching to replace Boolean operators; and software optimized for Web access. In terms of time savings and completing projects, well-designed systems can pay for themselves many times over.
At the Naval Research Laboratory (NRL), space constraints dating back a decade provided the impetus for a digital library that initially served 3,000 NRL employees and is now being extended to 60,000 users at over a dozen Navy sites. The library system, called TORPEDO (The Optical Retrieval Project Electronic Documents Online), uses RetrievalWare from Convera to bring information resources to the desktop. It provides access to a large collection of NRL reports and conference proceedings, as well as to nearly a million articles from about 500 journals. At first, agency documents were imaged and indexed, but now they have been converted to electronic text through optical character recognition (OCR).
Users can browse for articles contained in particular journals or report collections, can limit by subject using a set of 25 major categories and can search full text for concepts. A dictionary that runs behind the scenes ensures that relevant terms will be incorporated into the search—for example, a search for “heavy water” will also find “deuterium oxide.” Input from users is an important part of the system and provided a convincing push toward intuitive interfaces.
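The behind-the-scenes dictionary lookup can be pictured with a minimal Python sketch. It is an illustration only, not a description of how RetrievalWare works: the synonym table, the sample documents and the function names are all hypothetical, and matching is reduced to a simple substring test.

```python
# Hypothetical synonym dictionary; the article does not describe the real one.
SYNONYMS = {
    "heavy water": {"deuterium oxide"},
    "deuterium oxide": {"heavy water"},
}

# Hypothetical sample collection, keyed by title.
DOCUMENTS = {
    "Report A": "Neutron moderation using deuterium oxide.",
    "Report B": "Corrosion of hull coatings in seawater.",
    "Report C": "Production of heavy water by electrolysis.",
}


def expand_query(query):
    """Return the query term plus any equivalent terms from the dictionary."""
    term = query.strip().lower()
    return {term} | SYNONYMS.get(term, set())


def search(query, documents):
    """Return titles of documents whose text contains any expanded term."""
    terms = expand_query(query)
    return [
        title
        for title, text in documents.items()
        if any(term in text.lower() for term in terms)
    ]


results = search("heavy water", DOCUMENTS)
print(results)
```

In this toy collection, the search for "heavy water" returns the report that mentions only "deuterium oxide" alongside the one that uses the original phrase, which is the effect the article describes.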
“Our users told us clearly that if training were required in order to use TORPEDO, it would not be meeting their needs,” says Laurie Stackpole, chief librarian at NRL’s Ruth H. Hooker Research Library. An annual survey of user satisfaction is conducted, and opportunities are also presented within the TORPEDO system for making comments and suggestions. The library has even received feedback from people who have gone on to other jobs.
“Former employees have told us that our digital library is far more advanced than those they find in many other places,” says James King, a specialist in library information technology at NRL. “This provided part of the incentive to extend the service to other Navy facilities.”
The benefits of the library’s digital delivery capabilities were predicated on saving each researcher an hour a month, time that would have been spent going to the library. Surveys have shown that on average, desktop access is actually saving two hours a week. That translates into $7 million a year in increased productivity, based just on the initial group of 1,500 NRL researchers. Among responses from users was a comment that without the library’s digital systems, such as TORPEDO, the researcher would not have obtained a $500,000 grant for a project. Thus, the system is not only producing cost savings but also increasing business opportunities.
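The arithmetic behind the productivity figure can be checked with a short Python sketch. The researcher count, hours saved and dollar total come from the survey figures above; the 50 working weeks per year is an assumption made here for illustration, so the implied hourly value is only a rough estimate and not a number NRL reported.

```python
RESEARCHERS = 1_500            # initial group cited in the article
HOURS_SAVED_PER_WEEK = 2       # survey result cited in the article
ANNUAL_VALUE_USD = 7_000_000   # productivity gain cited in the article
WORKING_WEEKS_PER_YEAR = 50    # assumption, not stated in the article


def hours_saved_per_year(researchers, hours_per_week, weeks):
    """Total researcher hours saved across the group in one year."""
    return researchers * hours_per_week * weeks


def implied_hourly_value(annual_value, total_hours):
    """Dollar value per saved hour implied by the annual total."""
    return annual_value / total_hours


total_hours = hours_saved_per_year(
    RESEARCHERS, HOURS_SAVED_PER_WEEK, WORKING_WEEKS_PER_YEAR
)
rate = implied_hourly_value(ANNUAL_VALUE_USD, total_hours)
print(f"{total_hours:,} hours a year, about ${rate:.2f} per hour")
```

Under that assumption, the $7 million figure corresponds to 150,000 saved hours a year valued at a little under $47 each.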
Multimedia takes hold
Although most government information is in text and image format, video is gradually taking on a bigger role, particularly in training. At Sandia National Labs, personnel in the Accident Response Group (ARG) were obtaining part of their training by viewing video tapes that show teams performing different roles during exercises that simulate nuclear incidents. The tapes were stored in a library, so trainees had to travel there and then locate the desired training segments on the tape. Tapes are now being digitized and indexed with Convera's Screening Room.
"Users can view the video segments from their desktop," says Mike Krawczyk, a system analyst at Sandia who is managing the digital video library. "The savings in travel time alone is a significant benefit."
Although Screening Room provides automatic speech-to-text conversion, Sandia opted to manually transcribe the words spoken on the tape in order to achieve a higher level of accuracy. Either way, the resulting text is searchable, and with Convera’s Adaptive Pattern Recognition Processing, typos do not prevent successful searching. Sandia is also using Screening Room for its Knowledge Preservation project, which is designed to capture the expertise of retiring nuclear weapons scientists (KMWorld, August 2001).
Screening Room creates a number of value-added elements during processing. For analog video, the first step is to digitize it. After it is digitized—or if the video is already digitized—further processing results in a storyboard that indicates where major scene changes occur. Searchers can then skip to the point in the video that is relevant to their needs. They can also select a key frame and use it as a search clue to locate other frames of a similar color, texture or composition. If the video has closed-captions, those can be extracted and stored in a database along with the other metadata associated with that asset. Users can also add annotations to specific clips or frames to further enhance the search process.
Some inroads into real-time video analysis have been made in the security field, where digitized surveillance is being used in a number of government organizations. With technology developed by Zone Products (zoneproducts.com), a digital video camera buffers images but does not store them unless an event occurs in which a specified number of pixels changes, such as the opening of a door. Recording starts at that point, and because of the buffering, it can be extended back in time to a point prior to the event.
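The buffer-then-record idea can be sketched in a few lines of Python. This is a simplified illustration and not Zone Products' implementation: frames are modeled as flat lists of pixel values, the function and parameter names are hypothetical, and recording never stops once an event has triggered it.

```python
from collections import deque


def changed_pixels(previous, current):
    """Count pixel positions whose values differ between two frames."""
    return sum(1 for a, b in zip(previous, current) if a != b)


def record_events(frames, pixel_threshold, pre_event_frames):
    """Return (index, frame) pairs that would be stored.

    Frames are held only in a fixed-size buffer until at least
    pixel_threshold pixels change between consecutive frames. At that
    point the buffered frames are stored too, which extends the
    recording back in time to before the event.
    """
    buffer = deque(maxlen=pre_event_frames)  # oldest frames fall off automatically
    stored = []
    recording = False
    previous = None
    for index, frame in enumerate(frames):
        if (
            not recording
            and previous is not None
            and changed_pixels(previous, frame) >= pixel_threshold
        ):
            recording = True
            stored.extend(buffer)  # keep the frames leading up to the event
        if recording:
            stored.append((index, frame))
        else:
            buffer.append((index, frame))
        previous = frame
    return stored


# Five static frames, then a "door opens" and three pixels change.
closed = [0, 0, 0, 0]
opened = [9, 9, 9, 0]
frames = [closed] * 5 + [opened] * 3

stored = record_events(frames, pixel_threshold=3, pre_event_frames=2)
stored_indices = [index for index, _ in stored]
print(stored_indices)
```

With a two-frame buffer, the stored recording begins two frames before the change is detected, while the earlier static frames are never written.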
Zone Products’ technology allows direct searching of the video; for example, a user can highlight the doorway and search for a point in the digitized recording where the event occurred. John Williams, CEO of Zone Products, predicts that “digital surveillance and monitoring are a first step toward comprehensive security solutions,” but many steps remain before digital video can play a role in KM based on direct access to its content.
Access to all types of multimedia data is becoming an increasingly important part of search and retrieval, according to Dan Agan, senior VP of Marketing and Corporate Development at Convera. “Knowledge workers need a powerful yet simple way to find the right information at the right time,” says Agan. “Our software allows them to do exactly that, regardless of whether the information is in text, image, video or audio format.”
The new CRM—citizen relationship management
Government organizations at all levels have gotten the message that their citizens want good service, and some are turning the message into action. In Nevada County, CA, a new system based on e-Work from Metastorm is poised to make citizens’ interactions with county government significantly easier. County CIO Steve Monaghan is directing an ambitious plan to automate virtually every aspect of his county’s administration.
Although the rural county located between Sacramento and Reno has a relatively modest population of 92,000, requests made of its 30 departments can create difficulties for both citizens and county staff. “Our citizens come in from every angle,” says Monaghan, “and for staff, trying to keep track of their requests is not an easy task.”
The new system will operate much like a job ticket, in which each request gets tagged, receives an ID number and is time-stamped before being automatically routed to the proper individual or department for action. The e-Work system also will convert manual processes such as time card preparation into an electronic process.
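The job-ticket pattern described above can be illustrated with a short Python sketch. It does not use or represent Metastorm's e-Work API; the routing table, department names and function names are hypothetical and stand in for whatever rules the county actually configures.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count

# Hypothetical routing rules; the county's real rules are not given in the article.
ROUTING = {
    "pothole": "Public Works",
    "building permit": "Community Development",
}
DEFAULT_DEPARTMENT = "County Clerk"  # hypothetical fallback

_ticket_ids = count(1)  # simple sequential ID generator


@dataclass
class Ticket:
    request_type: str
    description: str
    # Each ticket receives an ID number and a time stamp on creation.
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    department: str = ""


def submit(request_type, description):
    """Tag a request, stamp it and route it to a department."""
    ticket = Ticket(request_type, description)
    ticket.department = ROUTING.get(request_type, DEFAULT_DEPARTMENT)
    return ticket


first = submit("pothole", "Large pothole on a county road")
second = submit("records request", "Copy of a meeting agenda")
print(first.ticket_id, first.department)
print(second.ticket_id, second.department)
```

Because every ticket carries a time stamp from the moment it is created, the same records could later support the kind of turnaround analysis the county hopes to run.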
Savings from the time sheet application alone are expected to pay for the entire e-Work system within a year by eliminating a three-part paper form and reducing the time required to fill out the forms. But more important, Monaghan believes, is the resulting improvement in customer service. That shift will not result just from new software, but will require an accompanying change in philosophy. He notes that the technology issues are far less of an obstacle in the implementation than the cultural changes in the organization, where accountability has not been an explicit part of the county’s administration.
Primary strengths of e-Work are its ability to track sequences of events and to integrate data that drives those transactions. “e-Work can be used for anything from correspondence management to personnel leave requests or weapons change management,” says Avi Hoffer, Metastorm’s CEO. Hosted in either an e-mail client or browser, e-Work allows user actions from within those environments. Although Hoffer recommends that integration be done through Simple Object Access Protocol (SOAP), e-Work supports protocols for a variety of other standards.
Hoffer emphasizes the importance of understanding what goes on behind the business processes that are automated. “One organization decided to save time by eliminating a step in a process because in every case it was approved,” Hoffer recalls. “Actually, the process ended up taking longer, because the individual at that step was removing errors from the information before forwarding it to the next step.” e-Work uses a visual modeling tool to represent the process during development. After participants agree on the steps, moving from the model to a functional application is straightforward.
Analyze this, grade that
Since all the records of its activities will be electronic, the e-Work system in Nevada County also will allow analyses that have never been possible before. “We will be able to do some powerful things to give meaningful information to managers,” says Monaghan. When the option for citizens to report such problems as potholes is implemented, for example, the county will easily be able to determine how quickly such requests are fulfilled.
Online analytical processing (OLAP) and data mining are becoming more important within the government as it increasingly uses its information to make data-driven decisions. Government organizations also are initiating efforts to provide analytical capabilities directly to citizens. In Ohio, the Department of Education posts an Interactive Local Report Card (ILRC) that provides school district performance information. The interactive tool is powered by MicroStrategy Web, a Web-based OLAP tool often used for business intelligence applications. Users can select a district and the variables in which they are interested, such as expenditures per student, and then run a report online. Those who prefer to access reports that have already been run can download PDF files, as well as Excel spreadsheets that allow offline data analysis.
Data integration undisputed
The pros and cons of integrating disparate databases are being vehemently debated in the government and elsewhere, with homeland security and privacy issues each weighing in. Some integration efforts, however, are not producing controversy. The Colorado Judicial Branch needed a solution to capture and flow data in real time to a number of computer systems within its purview. It selected Transformation Server from DataMirror to meet its integration needs. DataMirror offers a family of software products that provide live data integration, resiliency and auditing.
One of the organization’s key applications is the Colorado Integrated Criminal Justice Information System (CICJIS), which links five agencies to more effectively track offenders as they move through the state’s criminal justice system. Information is provided in real time so that the status of offenders can be determined at any time. Colorado Judicial Branch also is using Transformation Server to implement its LEXIS/Courtlink system, the first statewide system for electronic filing of court cases.
In addition, Transformation Server forwards a subset of the Branch’s data to a vendor who hosts a Web-based public access system. Forwarding the data to another system ensures the integrity of the original data while providing citizens easy access to appropriate information. Overall, the DataMirror solution integrates over 1 million transactions each day and provides data to thousands of users within the court system as well as to the general public.
Without a doubt, security has attained a new level of importance in government. The more the Internet is used, the greater the need to protect the data and processes on which users have come to depend. "Security is viewed as a central part of doing business today," says Dr. Bassam Khulusi, president and CEO of ERUCES. "Great strides have been made in protecting networks with firewalls, VPNs and authentication techniques." But, he adds, often the data at the center remains vulnerable.
ERUCES’ Tricryption Engine uses a three-step encryption process that is designed to protect data from back-door access by hackers both outside and inside an organization. In the realm of IT security, most data theft is classified as either back-door or front-door. Back-door refers to attacks in which access is gained by circumventing established security procedures, whereas front-door means that the data theft was achieved by someone with authorized access or someone who assumed the network identity of an authorized user.
Khulusi compares the Tricryption technique used by this product with others: "When data is encrypted, no one really tries to break the algorithms; they look for the key and if it's found, the data is compromised." The Tricryption Engine encrypts the data with a new key for each data transaction, encrypts the key and then encrypts the link between the two. That prevents even inside users from accessing the data unless they have proper authorization. Khulusi stresses, however, that hackers who circumvent approved company authentication measures can still go through the front door as authorized users. Therefore, multilayer security is essential.
With so much encryption, how is system performance affected? Khulusi reports that in testing, little degradation in response time is seen—about 4%. The Social Security Administration (ssa.gov) ran a proof of concept test in December 2001 and found that the product demonstrated a method for significantly improving security. The Tricryption Engine is also being sold to healthcare organizations to help them comply with requirements of the Health Insurance Portability and Accountability Act (HIPAA).
Judith Lamont is a research analyst with Zentek Corp., e-mail firstname.lastname@example.org.