Extending SKOS to manage complex taxonomies
SKOS, the Simple Knowledge Organization System, is often used to manage taxonomies for expansive enterprises. As complexity becomes the norm in modern industry, however, taxonomies reflect that growing intricacy, pushing SKOS-based solutions to the limits of their efficacy.
KMWorld held a webinar, “Five Key Considerations for Managing Complex Taxonomies,” to address this organizational challenge, which ultimately invites wasted time and resources in creating, managing, and using complex taxonomies.
Irene Polikoff, co-founder and chief evangelist at TopQuadrant, explained the fundamentals of taxonomies.
Taxonomies are a way of defining hierarchies of concepts, where properties or characteristics can exist at each level.
Taxonomies are used to support enterprise search and browsing, enabling knowledge to be organized in a fashion that leverages relevancy.
While taxonomies have an abstract definition, ultimately, their meaning depends on the way that they’re being used, explained Polikoff.
Using an Amazon product search as an example, Polikoff showed that a taxonomy can be more than just a hierarchy; it can include a variety of attributes, such as brand, features, or departments. The properties used to narrow a product selection can differ from one search to the next, making taxonomies even more complex.
When taxonomies are done right, they become a knowledge graph: an amalgamation of relationships between products and their properties, such as a product’s seller or return policy.
These rich taxonomies are complicated, going beyond just hierarchy. To manage these taxonomies, standards are important.
SKOS, the W3C standard, is a common data model for knowledge organization systems such as thesauri, classification schemes, subject heading systems, and taxonomies.
“SKOS gives us some very basic set of properties for a concept. A concept can be anything; New York as a city is a concept,” said Polikoff.
SKOS distinguishes between two basic categories of concept relationships: hierarchical (broader/narrower) and associative (related). SKOS can be further explained as an ontology, leveraging concepts, classes, and subclasses to further define objects. When domain matter gets complex, however, SKOS-based solutions may reach their limits.
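As a rough illustration of those two relationship categories, the sketch below models a handful of concepts as plain Python triples rather than with an RDF library. The concept names and the helper functions `narrower_of` and `related_to` are made up for this example; only the property names `skos:broader` and `skos:related` come from SKOS itself.

```python
# Hypothetical mini-taxonomy stored as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:NewYorkCity", "skos:broader", "ex:NewYorkState"),
    ("ex:Albany", "skos:broader", "ex:NewYorkState"),
    ("ex:NewYorkState", "skos:broader", "ex:UnitedStates"),
    ("ex:NewYorkCity", "skos:related", "ex:StatueOfLiberty"),
}


def narrower_of(concept):
    """Return concepts whose skos:broader points at `concept` (hierarchical)."""
    return sorted(s for s, p, o in TRIPLES if p == "skos:broader" and o == concept)


def related_to(concept):
    """Return associatively linked concepts; skos:related is symmetric."""
    out = set()
    for s, p, o in TRIPLES:
        if p == "skos:related":
            if s == concept:
                out.add(o)
            elif o == concept:
                out.add(s)
    return sorted(out)


print(narrower_of("ex:NewYorkState"))
print(related_to("ex:StatueOfLiberty"))
```

Hierarchical links are followed in one direction only, while the associative link is readable from either end, which is the practical difference between the two categories.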
Ontologies then become even more significant, as they serve as the blueprints or schemas for data in a knowledge graph, allowing domain-specific information to be added to the taxonomy model. SKOS can then be extended to include more information specific to each class, which is key to creating interconnected, rich taxonomies.
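A minimal sketch of that kind of extension, assuming a hypothetical `ex:City` class declared as a subclass of `skos:Concept` and an invented `ex:locatedIn` property. It uses plain Python tuples rather than a real RDF toolkit, and none of the `ex:` names come from the webinar.

```python
# Schema (ontology) triples: domain-specific classes layered on skos:Concept.
SCHEMA = {
    ("ex:Place", "rdfs:subClassOf", "skos:Concept"),
    ("ex:City", "rdfs:subClassOf", "ex:Place"),
}

# Instance data: New York is typed with the richer class and carries
# a property (ex:locatedIn) that plain SKOS does not define.
DATA = {
    ("ex:NewYork", "rdf:type", "ex:City"),
    ("ex:NewYork", "skos:prefLabel", "New York"),
    ("ex:NewYork", "ex:locatedIn", "ex:UnitedStates"),
}


def superclasses(cls):
    """Walk rdfs:subClassOf links upward and collect every ancestor class."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in SCHEMA:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def is_skos_concept(resource):
    """True if any declared type is skos:Concept or a subclass of it."""
    types = {o for s, p, o in DATA if s == resource and p == "rdf:type"}
    return any(t == "skos:Concept" or "skos:Concept" in superclasses(t) for t in types)


print(is_skos_concept("ex:NewYork"))
```

Because `ex:City` ultimately descends from `skos:Concept`, the resource still behaves as an ordinary SKOS concept while carrying the extra, class-specific property.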
Data quality also plays a critical role in rich taxonomies. SKOS data quality rules should be built in; for example, a concept can’t have two different preferred labels in the same language. Some of these rules are provided by SKOS itself, yet other commonly requested rules should also be pre-built.
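The preferred-label rule can be sketched as a small check. This is an illustrative stand-in, not how any particular tool implements it: the sample labels and the function name `duplicate_pref_labels` are assumptions made for the example.

```python
from collections import defaultdict

# Each entry is (concept, language tag, preferred label); sample data is invented.
PREF_LABELS = [
    ("ex:NewYork", "en", "New York"),
    ("ex:NewYork", "en", "New York City"),  # conflicts with the entry above
    ("ex:NewYork", "fr", "New York"),
    ("ex:Boston", "en", "Boston"),
]


def duplicate_pref_labels(labels):
    """Return {(concept, lang): labels} wherever more than one distinct label exists."""
    seen = defaultdict(set)
    for concept, lang, text in labels:
        seen[(concept, lang)].add(text)
    return {key: sorted(texts) for key, texts in seen.items() if len(texts) > 1}


for (concept, lang), texts in duplicate_pref_labels(PREF_LABELS).items():
    print(f"{concept} has {len(texts)} preferred labels in '{lang}': {texts}")
```

The same label text in a different language is allowed, so only the two English labels on the one concept are reported.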
For rich taxonomies to succeed despite their complexity, they require efficient collaboration and workflows to keep them operable. Taxonomies need consensus, meaning they necessitate shared understandings that mitigate conflicting points of view.
Polikoff explained that the days of spreadsheets for taxonomies are over: tools are needed to ensure properly governed collaboration and change management, so that organizations can imbue the data itself with the knowledge of their experts.
Furthermore, a role-based system is required for access and collaboration, including taxonomists and ontologists, library scientists, data governance leads, data architects, and many others.
“Workflows, in particular, can be used to create specific processes. And processes can be different for different types of taxonomies; depending on what group of people manage it, depending on how important it is, you can put in place different processes and involve different roles and stakeholders in giving approval,” said Polikoff.
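One way to picture such per-taxonomy processes is a table of required approver roles. The sketch below is purely hypothetical: the taxonomy types, the role assignments, and the function names are invented for illustration and do not describe any specific product.

```python
# Hypothetical approval policies: which roles must sign off per taxonomy type.
REQUIRED_APPROVERS = {
    "product": {"taxonomist", "data governance lead"},
    "internal glossary": {"taxonomist"},
}


def can_publish(taxonomy_type, approvals):
    """A change is publishable once every required role has approved it."""
    required = REQUIRED_APPROVERS[taxonomy_type]
    return required <= set(approvals)


def missing_approvals(taxonomy_type, approvals):
    """List the roles that still need to approve, in alphabetical order."""
    return sorted(REQUIRED_APPROVERS[taxonomy_type] - set(approvals))


print(can_publish("internal glossary", ["taxonomist"]))
print(missing_approvals("product", ["taxonomist"]))
```

A more important taxonomy simply lists more required roles, so the same check enforces a stricter process without any change to the code.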
When working with multiple taxonomies, especially if they’re large and span different types of resources, it’s important to think about layering taxonomies. Keeping them modular, yet connectable, allows each to evolve separately, be re-used in different contexts, and be governed by different groups.
Ultimately, when taxonomies become rich and complex, SKOS will begin to lose its “simplicity”; adding ontologies, roles, quality rules, and layering can extend SKOS to better underpin an organizational taxonomy.
For an in-depth discussion of rich taxonomy management, examples, and use cases, you can view an archived version of the webinar here.