  • March 6, 2002
  • News

Structuring the unstructured

ClearForest announces ClearTags 4.0, an auto-tagging platform that includes semantic, statistical and structural tagging.

The company says the new offering greatly expands tagging output and, in turn, the understanding and value of unstructured content. ClearForest adds that the user control panel lets users define different tagging schemes for any type of document stream and monitor the entire tagging process. ClearTags discovers multiple relevant entities, facts and events buried within large textual repositories and outputs richly tagged XML files, facilitating data reuse and the manipulation of content with other applications.

ClearTags accepts input in a variety of formats, including PDF, MS Office, HTML and XML, and automatically provides each document with an extensive set of relevant metatags. The tags are based on three main technologies:

  • semantic/linguistic information extraction, which extracts key entities, events and facts;

  • statistical categorization, which assigns documents to categories/topics; and

  • topological analysis, which identifies terms based on document structure.
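As a rough illustration of how three such tag sources might be combined into one XML record, here is a minimal Python sketch. It is not ClearForest's implementation or API: the function names, the keyword lists and the XML element names are all hypothetical, and each technique is reduced to a toy stand-in (a regular expression for information extraction, keyword counting for categorization, and title words for structure-based terms).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical keyword sets; a real categorizer would be trained on data.
CATEGORY_KEYWORDS = {
    "product-news": {"announces", "platform", "release"},
    "finance": {"earnings", "revenue", "shares"},
}

def semantic_tags(body):
    """Toy information extraction: find '<Company> announces <Product>' events."""
    return [
        {"type": "announcement", "company": m.group(1), "product": m.group(2)}
        for m in re.finditer(r"\b([A-Z]\w+) announces ([A-Z]\w+)", body)
    ]

def statistical_category(body):
    """Toy categorization: pick the category with the most keyword hits."""
    words = re.findall(r"[a-z]+", body.lower())
    scores = {
        cat: sum(w in kws for w in words)
        for cat, kws in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def structural_terms(title):
    """Toy structure-based tagging: treat capitalized title words as key terms."""
    return re.findall(r"\b[A-Z]\w+", title)

def tag_document(title, body):
    """Combine the three kinds of tags into one XML metadata record."""
    root = ET.Element("document")
    for event in semantic_tags(body):
        ET.SubElement(root, "event", event)
    category = statistical_category(body)
    if category:
        ET.SubElement(root, "category").text = category
    for term in structural_terms(title):
        ET.SubElement(root, "term", source="title").text = term
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(tag_document(
        "ClearForest announces ClearTags",
        "ClearForest announces ClearTags, an auto-tagging platform.",
    ))
```

Keeping each tag source in its own function mirrors the idea of separate tagging schemes: one could be swapped out or disabled for a given document stream without touching the others.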

The company adds that ClearTags is also used to generate a ClearForest knowledgebase for downstream use with ClearForest’s suite of business intelligence products, such as ClearResearch, or with third-party Web applications.
