E-government: enhancing national security
Since 2002, the Naval Postgraduate School's Center for Homeland Defense and Security (CHDS) has developed a wide range of graduate education programs that equip current and future homeland security leaders with the strategies, policies and organizational elements needed to defeat terrorism in the United States. The CHDS, which is sponsored by the U.S. Department of Homeland Security, offers a homeland security master's degree that was the first such program in the country and is attended primarily by senior local, state and federal DHS officials.
An important part of the program was the establishment of the Homeland Security Digital Library (HSDL), which contains documents on policy and strategy relating to homeland security. It is used as a research tool that enables students and faculty to explore decision-making processes post-9/11, and as a way to help federal, state and local government agencies develop policies for the future.
"We wanted to create the best archive we could," says Thomas Mastre, HSDL project manager, "with a wide range of documents that could be augmented continually." Most of the students participate in the program via distance learning, so the library also needed to be Web-based to provide for access from the learner's location.
An initial taxonomy was set up in fall 2003, when material for the HSDL was being placed in three different repositories, and as a "living" taxonomy, it has developed over time. One repository is a high-value set of documents that were collected by researchers and subject matter experts in the field. Another includes content that was found by crawling Web sites for relevant information. The third is a specialized set of newspaper clippings that is restricted and updated frequently. The taxonomy was developed by human experts, and the documents in the repositories were manually categorized at first.
As the volume of information grew, the need for an automated method of classifying documents with the taxonomy became more pressing. After considering a number of options, the HSDL Project Team selected Teragram, which produces a number of language processing tools. The HSDL makes use of the company's auto-classifier and taxonomy manager products. A combination of automatic and rules-based classification techniques provided by Teragram's TK240 Taxonomy Management software is used to place the documents into the correct categories of the taxonomy.
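The rules-based half of that approach can be illustrated with a minimal sketch. It is not Teragram's software or API; the `Category` class, the `classify` function, the category names and the keyword rules are all hypothetical, and a production classifier would combine rules like these with statistical methods.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """One node in a hypothetical taxonomy, with hand-written keyword rules."""
    name: str
    keywords: frozenset


def classify(text, taxonomy, min_hits=1):
    """Return the names of categories whose keywords appear in the text."""
    words = set(text.lower().split())
    matches = []
    for category in taxonomy:
        # Count how many of the category's rule keywords occur in the document
        hits = len(words & category.keywords)
        if hits >= min_hits:
            matches.append(category.name)
    return matches


# Illustrative categories only; not the actual HSDL taxonomy
taxonomy = [
    Category("Border Security", frozenset({"border", "customs", "checkpoint"})),
    Category("Public Health", frozenset({"vaccine", "outbreak", "quarantine"})),
]

doc = "New checkpoint procedures announced for the northern border"
print(classify(doc, taxonomy))
```

Raising `min_hits` makes the rules stricter, which is one simple way to trade recall for precision as a collection grows.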
"One of our requirements was for a classification system that could evolve," says Mastre. "Every week we add about 200 to 300 new documents, and sometimes we need to add new categories, which we can do with Teragram." In addition, categories that have grown too large may need to be broken down into sub-topics and retagged, which is another available option.
Users can browse through the documents using the taxonomy, but can also search the collection using technology from Fast Search and Transfer. Results from searches are presented using relevancy ranking. Users also have the option of searching within categories of the taxonomy, which helps eliminate documents that are not within the searcher's area of interest.
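Searching within a category amounts to filtering the collection before ranking it. The sketch below shows the idea with a simple term-count score; it does not represent the Fast Search and Transfer product, and the `search` function, the sample documents and the scoring method are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical mini-collection of (title, category, text); not real HSDL data
DOCS = [
    ("Doc A", "Borders and Immigration", "border security funding and border patrol staffing"),
    ("Doc B", "Borders and Immigration", "immigration policy review"),
    ("Doc C", "Public Health", "border screening during an outbreak"),
]


def search(query, docs, category=None):
    """Rank documents by query-term frequency, optionally within one category."""
    terms = query.lower().split()
    results = []
    for title, doc_category, text in docs:
        if category is not None and doc_category != category:
            continue  # skip documents outside the chosen category
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((score, title))
    # Highest score first; ties are broken alphabetically by title
    results.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in results]


print(search("border", DOCS))
print(search("border", DOCS, category="Borders and Immigration"))
```

In this toy collection, the unrestricted query returns a public health document alongside the border document, while the category-restricted query drops it.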
Feedback from users has been very positive. "If you are doing research in homeland security," says Mastre, "this is the place to come."
The ability to browse topics allows users to find information that they had not thought to search for. In addition, the taxonomy lets people make connections that might not have been apparent. For example, when browsing through a heading on borders and immigration, the user sees expected topics such as border security, but also sub-topics such as politics and government. Once that area is identified, the user can explore it for additional perspectives on the broader topic.
The value of the HSDL goes well beyond the academic program that was the catalyst for its development. Officials in state and local governments would not have been able to collect the large number of documents (several hundred thousand) contained in the HSDL, and even if they had, such duplication of effort would have been very inefficient.
"This library constitutes a great multiplier force," says Mastre. "We go through the collection process once, and the HSDL then benefits government organizations, both civilian and military, at many levels."
Flexibility in tagging is one of the biggest advantages of Teragram's classification technology when it comes to managing dynamic collections of documents, according to Yves Schabes, president and co-founder of Teragram.
"Being able to do retrospective classification is helpful in many situations," he says. "In addition, Teragram's auto-classification can handle transient topics, such as the Olympic Games, which come up only every four years. During the intervals when there is no significant activity, those portions of the classification process can be disabled, to simplify the taxonomy." The technology can be embedded into a search product, in addition to being provided as a separate software tool.
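Disabling a transient topic can be pictured as switching off its rules without deleting them. The following sketch assumes a hypothetical `RULES` table and `active_matches` function; it is not how Teragram's product is implemented, only one way the idea could work.

```python
# Hypothetical topic rules; names and keywords are illustrative only
RULES = {
    "Border Security": {"border", "checkpoint"},
    "Olympic Games Security": {"olympic", "olympics"},
}


def active_matches(text, rules, disabled=frozenset()):
    """Return sorted topic names whose keywords occur in the text, skipping disabled topics."""
    words = set(text.lower().split())
    return sorted(
        name
        for name, keywords in rules.items()
        if name not in disabled and words & keywords
    )


text = "Olympic venue planning near the border"
print(active_matches(text, RULES))
print(active_matches(text, RULES, disabled={"Olympic Games Security"}))
```

Because the disabled topic's rules stay in the table, re-enabling it later only requires removing its name from the `disabled` set.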
One of the keys to success in innovation is to avoid duplication of effort. In the U.S. Air Force, a new knowledge management system is designed to coordinate knowledge sharing among many innovation communities that are working on new technology or solutions. The Innovation and Technology Knowledge Management site was launched in April 2006. It resides within Air Force Knowledge Now (AFKN), which is accessed through the Air Force portal.