What is a Taxonomy?
A taxonomy is a "knowledge organization system": a set of words organized into a "vocabulary" that controls the use of terms in a subject field and facilitates storing items in, and retrieving them from, a repository. These Knowledge Organization Systems (KOS) are usually specific to a knowledge domain, a topical or subject area, or an enterprise area. In any language, we have many names for an object or concept. When we settle on a single descriptive label as the main term and surround it with its synonyms, we have the beginnings of a KOS.
A controlled vocabulary focuses on concepts: not the specific items themselves, but rather the concepts represented by words or terms. A thesaurus is a controlled vocabulary of terms in natural language order. It is designed for "post-coordination."
Post-coordination means that the terms are combined at the time the search query is made or the question is asked. Pre-coordinated systems, like library subject headings or back-of-the-book indexes, combine the terms at the time the index is created. That is a big difference, and we need to think about how it happens. When someone is doing "post-coordinate indexing" instead of cataloging, we get a big flip in the way the terms are created. Some people are doing back-of-the-book indexing, where the terms used are dictated by specific words in the text of the book, and others are indexing using a controlled vocabulary. The processes are quite different.
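The difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the terms, document numbers, and function names below are hypothetical and do not come from any particular indexing system. In the post-coordinate case, single-concept terms are combined with a set intersection when the query is made; in the pre-coordinate case, only the combined headings chosen at indexing time can be looked up.

```python
# Hypothetical repository: each single-concept term points to the
# documents (by number) that were indexed with it.
index = {
    "taxonomies": {1, 2, 4},
    "healthcare": {2, 3, 4},
    "standards": {1, 4},
}


def post_coordinate(index, *terms):
    """Combine single terms at query time (a Boolean AND)."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical pre-coordinated headings: the combinations were fixed
# by the indexer when the index was created.
precoordinated = {
    "Healthcare--Taxonomies": {2, 4},
    "Taxonomies--Standards": {1, 4},
}


def pre_coordinate_lookup(headings, heading):
    """Look up a heading exactly as the indexer built it."""
    return headings.get(heading, set())


if __name__ == "__main__":
    # The searcher decides the combination at query time.
    print(post_coordinate(index, "healthcare", "standards"))
    # No indexer anticipated this combination, so nothing is found.
    print(pre_coordinate_lookup(precoordinated, "Healthcare--Standards"))
```

In this sketch the post-coordinate search can answer a combination nobody planned for, while the pre-coordinated lookup only succeeds for headings that already exist.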
The terms thesaurus and taxonomy are often used interchangeably. You can think of a thesaurus as a taxonomy with extras.
On the left, you see the taxonomy view. We have a broader term and some narrower terms, and even top terms. This view is driven by broader and narrower terms, so it is hierarchical in nature, and it is called the taxonomy view.
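A broader-narrower hierarchy of this kind can be modeled very simply. The sketch below is a minimal Python illustration, assuming a made-up vocabulary (Vehicles, Cars, and so on) and hypothetical helper names; it stores only each term's broader term and derives the narrower terms, the top term, and an indented display from that.

```python
# Hypothetical taxonomy: each term maps to its broader term.
# None marks a top term.
broader = {
    "Vehicles": None,
    "Cars": "Vehicles",
    "Trucks": "Vehicles",
    "Sedans": "Cars",
}


def narrower_terms(broader, term):
    """Return the terms whose broader term is `term`, alphabetically."""
    return sorted(t for t, b in broader.items() if b == term)


def top_term(broader, term):
    """Walk up the hierarchy until a term with no broader term is reached."""
    while broader[term] is not None:
        term = broader[term]
    return term


def render(broader, term=None, depth=0):
    """Return the hierarchy under `term` as indented lines of text."""
    lines = []
    for child in narrower_terms(broader, term):
        lines.append("  " * depth + child)
        lines.extend(render(broader, child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(broader)))
```

Storing only the broader term keeps the data small; a production vocabulary would usually also allow a term to have more than one broader term, which this sketch does not attempt.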
We move to a full thesaurus by adding relationships between the terms, notes, and other features, including the ambiguity control that forms a synonym ring. A thesaurus has four major features: 1) the hierarchy (taxonomy), 2) the associative relationships (or related terms), 3) equivalence, or ambiguity control (nonpreferred terms, synonyms, use references), and 4) various kinds of notes (history, definitions, status, etc.). The next step in complexity is to define those associative relationships in many different ways, i.e., as an ontology. The final step in increasing complexity is Linked Data or the Semantic Web, where the actual items described in many different systems are hooked together by this KOS. Sometimes people talk about a taxonomy or ontology when they actually mean a thesaurus. Whatever the final product is called, it is still a knowledge organization system, just as classification systems are.
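The four features map naturally onto a single term record. The following Python sketch is one possible shape, not a standard data model: the field names, the sample term, and the scope note are assumptions made for illustration. It also shows how a use reference leads a searcher from a nonpreferred term to the preferred one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    """One hypothetical thesaurus record covering the four features."""
    label: str                                          # preferred term
    broader: List[str] = field(default_factory=list)    # 1) hierarchy
    narrower: List[str] = field(default_factory=list)   # 1) hierarchy
    related: List[str] = field(default_factory=list)    # 2) associative
    used_for: List[str] = field(default_factory=list)   # 3) equivalence
    scope_note: str = ""                                 # 4) notes


# Sample data invented for this illustration.
thesaurus: Dict[str, Term] = {
    "Cars": Term(
        label="Cars",
        broader=["Vehicles"],
        narrower=["Sedans"],
        related=["Roads"],
        used_for=["Automobiles", "Autos"],
        scope_note="Passenger motor vehicles (illustrative note only).",
    ),
}


def resolve(thesaurus: Dict[str, Term], entry: str) -> Optional[str]:
    """Follow a use reference from any entry term to its preferred term."""
    key = entry.lower()
    for term in thesaurus.values():
        if term.label.lower() == key:
            return term.label
        if key in (synonym.lower() for synonym in term.used_for):
            return term.label
    return None


if __name__ == "__main__":
    print(resolve(thesaurus, "automobiles"))
```

Dropping the related, used_for, and scope_note fields leaves just the hierarchy, which is the sense in which a thesaurus is a taxonomy with extras.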
"Taxonomies" are part of an established standards area for thesauri. The Z39.19 Controlled Vocabularies standard, published by ANSI (American National Standards Institute) and NISO (National Information Standards Organization), covers the taxonomy in that it is a "hierarchically organized vocabulary based on a classification system." You can download this standard at no charge from www.niso.org.
This article was written for KMWorld by Marjorie Hlava, president and chairman of Access Innovations, Inc.