Disrupting the data landscape again with linked open data
One of the most significant consequences of the big data disruption is the need to rapidly integrate and exchange data of all types among disparate systems, applications and infrastructure. As the focus has inexorably shifted from internal, proprietary datasets to their aggregation with external, publicly available sources, organizations can no longer content themselves with tools exclusively rooted in relational technologies.
According to David Price, managing director of TopQuadrant: “Big data shattered the paradigm where organizations would only use methods designed for a single relational database. They suddenly needed unstructured data analysis for big data, along with relational methods to find the proverbial needle in the haystack and a linked data approach to connect things and see how they relate.”
Subsequently, linked data (manifest as both linked open data and its private sector counterpart, linked enterprise data) has enjoyed a resurgence so utilitarian that it could very well disrupt the data landscape again. Machine-readable public data sources have proliferated across verticals, were prioritized by the federal government during the previous presidential administration and are regularly aggregated with proprietary sources for a holistic view of organizational interests.
Such activity, however, only hints at the core functionality for which linked data is prized after the onset of the big data epoch and does not begin to enumerate the true value of its interoperability across the data ecosystem. Specifically, the linked data approach is rapidly gaining credence throughout the public and private sectors for its ability to:
- share data: Linked data is readily exchanged across any assortment of hardware systems, applications and use cases in a uniform fashion, greatly simplifying and expediting integration while largely automating analytics prerequisites such as transformation. Semantic web founder Tim Berners-Lee stated, “With linked data, when you have some of it, you can find other related data.”
- improve governance: Predicated on shared modeling conventions known as ontologies, linked data technologies clarify data’s meaning according to uniform standards that evolve at the pace of business. The resulting heightened governance capabilities are ideal for ensuring compliance in heavily regulated industries such as finance or healthcare.
- implement machine intelligence: Linked data’s inherently machine-readable nature substantially improves the enterprise’s ability to scale at speeds commensurate with real-time big data ingestion, and is timely given the rejuvenated interest in artificial intelligence.
- unify silos: The ultimate boon of the linked data approach is its ability to end the silo-based culture rampant throughout the data sphere, which fundamentally delays time to insight and action, increases costs and renders data’s meaning abstruse.
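The sharing and silo-unifying points above can be illustrated with a minimal sketch in plain Python. It assumes data is held as subject-predicate-object triples that reuse the same identifiers across sources. Every URI, dataset name and helper function here is hypothetical and chosen only for illustration; a real deployment would more likely rely on an RDF store than on Python sets.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two independently maintained sources that happen to share identifiers (URIs).
internal_crm = {
    (EX + "company/acme", EX + "vocab/accountManager", "J. Smith"),
    (EX + "company/acme", EX + "vocab/sector", EX + "sector/construction"),
}
public_registry = {
    (EX + "company/acme", EX + "vocab/registeredIn", EX + "country/se"),
    (EX + "country/se", EX + "vocab/name", "Sweden"),
}


def merge(*graphs):
    """Combine triple sets; shared URIs line up without a mapping step."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every (predicate, object) pair recorded for a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


def follow(graph, subject, predicate):
    """Follow a link from a subject to the resources it points to."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


combined = merge(internal_crm, public_registry)
acme = EX + "company/acme"

# Starting from the company, follow a link into data that came from the other source.
countries = follow(combined, acme, EX + "vocab/registeredIn")
country_names = [
    name for c in countries for name in follow(combined, c, EX + "vocab/name")
]

print(describe(combined, acme))
print(country_names)
```

Because both sources name the company with the same URI, the merge is a plain set union, and a query that starts in the internal data can follow a link into the public data.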
Those reasons, and others, are why many organizations in both commercial and public spaces are realizing that “it’s worth an investment to take the data out of the individual applications and make it available in a more mutual, standard format,” Price says. “Applications are going to come and go but if they’ve got this standard format, it helps them manage these long lifecycles in a better way.”
Linked open data’s sustainability
The most visible incarnation of linked data’s increasingly conspicuous presence, and the one that is most profoundly reshaping the data sphere, is the public sector’s linked open data effort. Largely fueled by the need to swiftly integrate and share data across different entities in a sustainable manner, linked open data is the most viable means for effecting such exchanges among and within different countries, languages, sectors and organizations. The crux of the linked data approach is the set of uniform standards from the World Wide Web Consortium (W3C), which harmonize data regardless of structure, source or type. Those all-encompassing, evolving data models and standardized vocabularies bring disparate data into conformance with one another, fostering the trust and consistency that data-driven applications in the public sector need. The growing list of linked open data sources includes data from the U.S. Congress, World Bank, British Geological Survey, U.S. Securities and Exchange Commission and linked sensor data sources.
For the past several months, Price has been assisting both the Swedish and Dutch national road authorities with the European Virtual Construction for Roads (V-Con) project, an intricate exchange of linked data pertaining to road network management, updates and construction in Sweden and the Netherlands. The implementation of uniform, semantic standards within and, in certain cases, between those countries is a multifaceted process targeted at the organizational, national and industrywide levels. The standards-based approach of linked data is the most viable means of sharing data among the construction companies, the government entities that own the roads and their national governments. Most importantly, the utilization of semantic technologies ensures the long-term continuity of the underlying data vital to that construction work, regardless of shifting personnel, infrastructure or tools.
Such enduring relevance of data and their technologies in the rapidly shifting world of IT, in which legacy systems quickly become outdated, is a cardinal virtue of linked data—and all but impossible with any other approach. On the continuous value linked data produces in this respect, TopQuadrant CEO Irene Polikoff says, “The long-term sustainability and reuse of linked data processes is one of the reasons it has become more attractive today, because otherwise you would have to do everything over: the schema, the modeling and transformation. Constantly doing that work takes too long and proves much too expensive over time.” Ensuing benefits include decreased total cost of ownership, a lasting means of leveraging data assets and a drastic reduction in the instances of legacy silos.
Standards-based, 3-D modeling
The longevity of linked data stems from consistent meaning across systems, regardless of type or source, which its standardized models make possible. Those ontologies not only model all data in a uniform way, but also evolve to include additional requirements or data types in a single, consistent manner, without the lengthy recalibration of schema required of relational models. Furthermore, those semantic models can be implemented at varying levels to ensure the consistency vital to expedient data exchange and indefinite reuse at scale. As such, a large part of Price’s work with V-Con has involved “helping organizations use our tools to create models with 3-D capabilities to share data between the different organizations.”
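The contrast with relational schema changes can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a description of the V-Con models: the namespace, the `length_km` and `surface` properties and the helper functions are all hypothetical. The sketch shows that when data is held as triples, a new kind of fact is simply added, and existing data and queries are left untouched.

```python
EX = "http://example.org/roads/"  # hypothetical namespace, for illustration only

# Existing data: each fact is a (subject, predicate, object) triple.
graph = {
    (EX + "segment/1", EX + "length_km", 12.5),
    (EX + "segment/2", EX + "length_km", 3.0),
}


def total_length(g):
    """An existing query: sum the length of every road segment."""
    return sum(o for s, p, o in g if p == EX + "length_km")


def properties_of(g, subject):
    """List the property names currently recorded for a subject."""
    return sorted(p for s, p, o in g if s == subject)


before = total_length(graph)

# A new requirement arrives: record the surface type of a segment.
# No table is altered and no existing triple is rewritten; a new
# triple with a new predicate is simply added to the graph.
graph.add((EX + "segment/1", EX + "surface", "asphalt"))

after = total_length(graph)

print(before, after)
print(properties_of(graph, EX + "segment/1"))
```

The existing query returns the same answer before and after the change, while the new property is immediately available to any application that knows to ask for it.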
The expressiveness of those models is attributable not only to their 3-D attributes, but also to the various standards to which they require data to conform. Price describes the latter as a “layering of multiple ontologies according to the different countries, their subsets and in some cases, for each organization.” The merit of the linked data approach in this regard is that the data still preserve their defined meaning at each of the respective levels, yet are quickly exchanged between and understood by the different IT systems at those levels. Moreover, the expressiveness of the ontologies is considerably enriched by “the 3-D characteristics, which are extracted from the models,” Price explains. “We’ve added a widget to the browser that users can click on and actually see a bridge or a road.”
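One way to picture the layering Price describes is as a chain of subclass statements, in which each layer ties its own terms to the layer above it. The following plain Python sketch is only an assumption about how such layering might look. The class names (`eu:RoadAsset`, `se:Bro`, `org:SteelGirderBridge`), the three layers and the helper functions are invented for illustration and are not taken from the V-Con ontologies.

```python
SUBCLASS_OF = "subClassOf"  # stand-in for the RDF Schema subclass relation
TYPE = "type"               # stand-in for the RDF type relation

# Hypothetical layers: shared European terms, a national layer and an
# organization-specific layer. Each layer refers only to the one above it.
european_layer = {("eu:Bridge", SUBCLASS_OF, "eu:RoadAsset")}
swedish_layer = {("se:Bro", SUBCLASS_OF, "eu:Bridge")}
contractor_layer = {("org:SteelGirderBridge", SUBCLASS_OF, "se:Bro")}

# An asset described using only the organization's own vocabulary.
data = {("org:asset42", TYPE, "org:SteelGirderBridge")}


def all_classes(cls, ontology):
    """Return cls plus every class reachable via subClassOf, nearest first."""
    found, frontier = [cls], [cls]
    while frontier:
        current = frontier.pop(0)
        for s, p, o in sorted(ontology):
            if s == current and p == SUBCLASS_OF and o not in found:
                found.append(o)
                frontier.append(o)
    return found


def is_a(instance, cls, data, ontology):
    """True if the instance belongs to cls directly or through any layer."""
    return any(
        cls in all_classes(o, ontology)
        for s, p, o in data
        if s == instance and p == TYPE
    )


ontology = european_layer | swedish_layer | contractor_layer
asset_types = all_classes("org:SteelGirderBridge", ontology)

print(asset_types)
print(is_a("org:asset42", "eu:RoadAsset", data, ontology))
```

The asset keeps its precise, organization-level meaning, yet a system that only understands the shared European terms can still recognize it as a road asset.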