
The next wave of big data technology: distributed automation

“Business is fundamentally a multi-enterprise activity,” Najmi says, “so why put application silos within a single enterprise?” Blockchain can also unite data resources between organizations to enable access to a distributed network. “Blockchain is raising the awareness of why business is done better when you go multi-enterprise, eliminate information latency, share information instead of move it back and forth, and look at a bigger picture instead of your own silo,” Najmi says.

Single view, uniform access

As Hung explained, physically implementing the means to view, access and control data horizontally across environments is possible with several approaches. The primary limitation of leveraging blockchain for that purpose is a dearth of standards. According to Silver, that fact is indicative of the technology’s maturation process and “standards will evolve” in time. Najmi says that such standards may evolve “domain by domain perhaps.” Nonetheless, a number of mechanisms are in place for orchestrating the movement of, and access to, data in a unified manner across heterogeneous environments. Some of the more mature involve:

  • Enterprise knowledge graphs: Those linked data repositories join information assets on a semantic graph spanning the enterprise. According to Jans Aasman, CEO of Franz, the crux of that approach is to take the myriad databases in which enterprise data is stored and “turn them into RDF repositories. But here’s the big difference: In those linked data repositories, you make sure you use the enterprise ontologies, taxonomies and terminology so automatically you can combine data from every linked data repository.” The resulting data is shared on a single graph, enabling uniformity of access and oversight of data. That approach is employed by big data stalwarts such as Google and Yahoo.
  • Enterprise data fabrics: The real-time demands of big data involving the IoT, AI and blockchain require speed commensurate with their automation. Culling data sources quickly and moving information between them for time-sensitive applications may best be achieved by means of a holistic data fabric or data tapestry. The methods of producing that effect vary. Virtualization technologies can “stitch” data together across architectures for that purpose. Other methods rely on a knowledge graph approach built in the cloud with a centralized controller containing metadata about the data and its movement. “The data fabric is an access mechanism,” Aasman says. “The underlying layers have to figure out how to get at the data. It might be already in the cache, it might have to be computed, you might have to get it from the cloud somewhere.”
  • Comprehensive platforms: The major big data vendors, such as MapR and Cloudera, have increasingly turned their attention to creating holistic platforms designed to unite all data in a single framework. Key advantages of that approach include intrinsic capabilities for streaming data, data in motion and sharing data from different environments. Moreover, their platforms involve the actual data instead of virtualized data used in other approaches that “just mask the complexity instead of dealing with the complexity,” according to Jack Norris, MapR SVP of data and applications.
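
The knowledge graph approach Aasman describes, in which separate databases are turned into linked data repositories that share one vocabulary, can be illustrated with a small sketch. What follows is a minimal illustration in plain Python, not Franz's product or the API of any RDF library. The `EX` namespace, the `to_triples` and `describe` helpers, and the sample CRM and billing rows are all hypothetical, chosen only to show why a shared vocabulary makes combining repositories nearly automatic.

```python
# Minimal sketch of combining two data sources through a shared vocabulary.
# Everything here (namespace, helper names, sample rows) is an illustrative
# assumption, not a real product's API.

EX = "http://example.org/enterprise/"  # hypothetical enterprise namespace


def to_triples(rows, key_field, field_to_predicate):
    """Convert relational rows into (subject, predicate, object) triples.

    key_field names the column holding the customer identifier;
    field_to_predicate maps local column names to shared vocabulary terms.
    """
    triples = set()
    for row in rows:
        subject = EX + "customer/" + str(row[key_field])
        for field, predicate in field_to_predicate.items():
            if row.get(field) is not None:
                triples.add((subject, EX + predicate, row[field]))
    return triples


def describe(graph, subject):
    """Collect every predicate/object pair known about one subject."""
    return {p: o for s, p, o in graph if s == subject}


# Two silos with different column names for the same customer.
crm_rows = [{"cust_id": 7, "full_name": "Ada Lovelace"}]
billing_rows = [{"customer": 7, "balance_usd": 120.5}]

# Each silo is mapped onto the same enterprise terms.
crm_graph = to_triples(crm_rows, "cust_id", {"full_name": "name"})
billing_graph = to_triples(billing_rows, "customer", {"balance_usd": "balance"})

# Because subjects and predicates are shared, combining is a set union.
enterprise_graph = crm_graph | billing_graph

ada = describe(enterprise_graph, EX + "customer/7")
```

The design point is that the mapping work happens once per source, when local column names are translated to the shared terms. After that, no source-specific join logic is needed to get a single view of a customer.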

Single trajectory

The aforementioned approaches are significant because they indicate the big data space is unambiguously veering toward decentralization. Vendors with different specializations (storage, virtualization, data preparation and others) are devising solutions to deal with big data’s disparate nature. The catalyst for that movement is, in large part, big data’s contemporary form, which increasingly reflects manifestations of AI, the IoT and blockchain. Organizations are no longer simply tasked with incorporating those technologies for competitive advantage. Tomorrow, they must do so holistically to streamline the distributed processing of big data assets for optimal results.
