The Importance of ‘Smart’ Taxonomy- and Ontology-Enabled Resources
As big data expands, machine learning advances, and integration of data becomes more critical, taxonomies and ontologies are seeing a resurgence of interest and usage.
At Taxonomy Boot Camp 2018, Deborah McGuinness, Tetherless World Senior Constellation Chair and professor of computer, cognitive, and web sciences at RPI, gave a keynote titled “‘Smart’ Taxonomy- and Ontology-Enabled Resources.” She looked at the importance of taxonomies and ontologies and at strategies for creating ones that endure over time.
Taxonomy Boot Camp 2018 is a part of a unique program of five co-located conferences that includes KMWorld 2018, Enterprise Search & Discovery, Text Analytics Forum ’18, and Office 365 Symposium.
According to McGuinness, the previous models of labor-intensive, centralized vocabulary construction and maintenance do not mesh well with today’s interdisciplinary world.
McGuinness provided a real-world view of building and maintaining large collaborative, interdisciplinary vocabularies, along with the data repositories and services they empower. Her example was the National Institute of Environmental Health Sciences’ Children’s Health Exposure Analysis Resource (CHEAR), which looks at environmental exposures for parents and children and their effect on health. The goal of CHEAR is to encode the terminology currently needed by the CHEAR Data Center Portal and to publish an open source, extensible ontology that integrates general exposure science and health, leveraging best-in-class terminologies. This enables Findable, Accessible, Interoperable, Reusable (FAIR) data and services to support data analysis and interdisciplinary research.
Ontologies are critical because they encode terms and their inter-relationships, providing a foundation for understanding interoperability and reusability (the I and R in FAIR).
Ontology-enabled infrastructures that use knowledge graphs and ontology-enabled search services also provide support for finding and accessing relevant content (the F and A in FAIR).
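As a rough illustration of how encoded terms and relationships can support finding content, the following minimal Python sketch stores a few relationships as triples and expands a search to narrower terms. The terms, document names, and the functions `narrower_terms` and `search` are all hypothetical. None of them comes from the CHEAR ontology or from the keynote.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples. The terms are
# illustrative only and are not taken from the CHEAR ontology.
TRIPLES = [
    ("lead", "is_a", "heavy metal"),
    ("arsenic", "is_a", "heavy metal"),
    ("heavy metal", "is_a", "environmental exposure"),
    ("tobacco smoke", "is_a", "environmental exposure"),
]

# Hypothetical documents, each tagged with a single term.
DOCUMENTS = {
    "study-1": "lead",
    "study-2": "arsenic",
    "study-3": "tobacco smoke",
}


def narrower_terms(term, triples):
    """Return the term plus every term below it in the is_a hierarchy."""
    children = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "is_a":
            children[obj].add(subject)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(term, triples, documents):
    """Find documents tagged with the term or with any narrower term."""
    wanted = narrower_terms(term, triples)
    return sorted(doc for doc, tag in documents.items() if tag in wanted)


print(search("heavy metal", TRIPLES, DOCUMENTS))
```

A search for "heavy metal" here also returns the studies tagged "lead" and "arsenic", even though neither tag contains the search phrase. A production system would more likely use a standard such as RDF or OWL and a graph store than plain Python tuples.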
Key points to remember in embarking on a new ontology project are:
- Start with the use cases because they help with scoping and prioritization
- Taxonomies/ontologies enable FAIR data resources, and they can support movement across levels of abstraction
- Computer-understandable specifications of the meaning (semantics) support enhanced lifespan and impact of data
- Leverage ontology-enabled architectures
- Do not build taxonomies and ontologies from scratch, but instead selectively and thoughtfully reuse existing best practice ontologies/vocabularies
- Engage experts in choosing ontologies (or portions of them) and in designing the knowledge architecture
- Ecosystems and diverse terms are critical for success, and community-driven, community-maintained ontology-based systems are the future
- Taxonomies and ontologies enable interoperability and actionable applications
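Two of these points, reusing existing vocabularies and moving across levels of abstraction, can be sketched in a few lines of Python. This is an assumption-laden toy: the local labels, the "ex:" identifiers, the broader-term links, and the functions `roll_up` and `count_by_level` are invented for illustration and do not reflect any real vocabulary.

```python
from collections import Counter

# Hypothetical mapping from local dataset labels to identifiers in a
# reused, shared vocabulary. The "ex:" identifiers are made up.
LOCAL_TO_SHARED = {
    "Pb (blood)": "ex:Lead",
    "blood lead": "ex:Lead",
    "As (urine)": "ex:Arsenic",
    "cotinine": "ex:TobaccoSmoke",
}

# Hypothetical broader-term links within the shared vocabulary.
BROADER = {
    "ex:Lead": "ex:HeavyMetal",
    "ex:Arsenic": "ex:HeavyMetal",
    "ex:HeavyMetal": "ex:Exposure",
    "ex:TobaccoSmoke": "ex:Exposure",
}


def roll_up(term, levels):
    """Walk up the hierarchy by at most `levels` steps."""
    for _ in range(levels):
        if term not in BROADER:
            break
        term = BROADER[term]
    return term


def count_by_level(labels, levels):
    """Count records per shared term after rolling up `levels` steps."""
    return Counter(roll_up(LOCAL_TO_SHARED[label], levels) for label in labels)


records = ["Pb (blood)", "blood lead", "As (urine)", "cotinine"]
print(count_by_level(records, 0))  # counts at the most specific level
print(count_by_level(records, 1))  # counts one step more abstract
```

Because two differently labeled lead measurements map to the same shared identifier, they are counted together, and the same records can then be summarized at a broader level without relabeling the data.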
While observing that terminologies need humans and that automated systems alone are not good enough to rely on, McGuinness also noted, “I actually do want to put myself out of a job.” Done well, she said, taxonomies and ontologies will have longevity. “The job I want is to kind of maintain the infrastructure and the governance process and I want the people who care about the terminology to be contributing to it and also to be buying into it.”
Many presentations have already been made available online at www.taxonomybootcamp.com/2018/Presentations.aspx and others will become available after the presentations are given.
KMWorld 2019 will be held November 5-7, 2019, at the JW Marriott in Washington, DC, with pre-conference workshops on November 4.