

  • April 24, 2002
  • News

Categorically speaking

With a new architecture said to be optimized for enterprise-scope performance and scalability, SemioTagger 5.0 from Semio is the latest version of the company’s indexing and content categorization engine.

Semio says Tagger organizes and exposes information contained within all unstructured data--including e-mail, Web pages, Documentum databases, Lotus Notes, etc.--that exists across an enterprise.

The company says Version 5.0 can process millions of documents at extremely high speed with no trade-offs in granularity or accuracy, and that it ships with a set of pre-built, industry-standard taxonomy templates that enable nearly immediate indexing of a company's electronic data. Further, Semio says, a new graphical workbench lets users customize and fine-tune taxonomies and categorization processes to suit individual, department and enterprise requirements.

Tagger uses a series of patented, linguistics-based algorithms to identify and extract concepts (key phrases and terms) in documents and to organize documents into categories using categorization rules that are visible and easily customized. SemioTagger exports XML-based document tags containing category and concept information, which feed easily into relational databases, content management systems, document management systems, etc.
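As a rough illustration of the flow described above (extract concepts, apply visible categorization rules, emit XML tags), here is a minimal Python sketch. It is not Semio's algorithm or tag format: the rule table, the function names and the `document`/`category`/`concept` element names are all assumptions made for the example, and simple phrase matching stands in for the patented linguistic analysis.

```python
import xml.etree.ElementTree as ET

# Hypothetical, human-readable rules: category name -> trigger phrases.
RULES = {
    "Finance": ["interest rate", "balance sheet"],
    "Legal": ["patent", "license agreement"],
}


def extract_concepts(text, rules):
    """Return the rule phrases that occur in the text (case-insensitive), sorted."""
    lowered = text.lower()
    phrases = {p for plist in rules.values() for p in plist}
    return sorted(p for p in phrases if p in lowered)


def categorize(concepts, rules):
    """Return every category with at least one trigger phrase among the concepts."""
    return sorted(
        cat for cat, phrases in rules.items()
        if any(p in concepts for p in phrases)
    )


def to_xml(doc_id, text, rules=RULES):
    """Build an XML tag record (assumed layout) for one document."""
    concepts = extract_concepts(text, rules)
    root = ET.Element("document", id=doc_id)
    for cat in categorize(concepts, rules):
        ET.SubElement(root, "category").text = cat
    for concept in concepts:
        ET.SubElement(root, "concept").text = concept
    return ET.tostring(root, encoding="unicode")


print(to_xml("doc-1", "The patent covers a new interest rate model."))
```

Because the rules live in a plain dictionary, an administrator could inspect or adjust them directly, which is the property the article attributes to Tagger's categorization rules.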

Semio says Tagger 5.0 features include:

High-speed, high-volume categorization: The software can process tens of thousands of documents per hour from multiple formats and multiple sources.

Distributed/parallel processing support: SemioTagger is based on a new architecture supporting distributed and parallel processing capabilities and is tuned to take maximum advantage of performance opportunities.

Optimized for workgroup, departmental and enterprise processes: It supports multiple taxonomies referencing a single data corpus.

Real-time categorization: New incremental processing enables organizations to crawl documents as each one arrives, automatically refreshing indexes each time documents are added or changed, rather than through batch-processing alone.

Completely customizable categorization process: New knowledge management and administrative tools give content administrators a graphical way to create, manage and monitor categorization processes and adjust rules for optimal results.
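The real-time categorization item in the list above contrasts incremental processing with batch-only runs. One common way to get that behavior, sketched below in Python, is to keep a content hash per document and re-categorize only when a document is new or its hash has changed. This is an assumed approach for illustration, not a description of how SemioTagger does it; `IncrementalIndex`, `toy_categorize` and the keyword table are hypothetical names.

```python
import hashlib

# Hypothetical keyword table used by the toy categorizer below.
KEYWORDS = {"Finance": "budget", "Legal": "patent"}


def toy_categorize(text):
    """Stand-in categorizer: match one keyword per category."""
    lowered = text.lower()
    return sorted(cat for cat, word in KEYWORDS.items() if word in lowered)


class IncrementalIndex:
    """Re-categorize a document only when it is added or its content changes."""

    def __init__(self, categorize):
        self._categorize = categorize
        self._hashes = {}      # doc_id -> SHA-256 digest of the last seen content
        self.categories = {}   # doc_id -> list of category names

    def upsert(self, doc_id, text):
        """Process one arriving document; return True if the index was refreshed."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self._hashes.get(doc_id) == digest:
            return False  # unchanged content, nothing to do
        self._hashes[doc_id] = digest
        self.categories[doc_id] = self._categorize(text)
        return True


index = IncrementalIndex(toy_categorize)
index.upsert("memo-7", "Quarterly budget review")   # new document: indexed
index.upsert("memo-7", "Quarterly budget review")   # unchanged: skipped
index.upsert("memo-7", "Patent filing update")      # changed: re-categorized
```

A batch run would re-process every document on each pass; here the work per arrival is limited to the one document that was added or changed.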

