
Automated Understanding: The Future of Big Data

"Data is the new oil"

— Media Futurist Gerd Leonhard

The amount of data in our world is exploding, and what we can do with that data has become one of the hottest business and technical topics of the decade. Pundits are talking about data becoming a new currency. Innovative firms like Google and Facebook understand the power of data and are using it to generate new business. Governments are using big data to help ensure national security. Major enterprises are using big data to address their challenges with risk, fraud and compliance.

Organizations have increasing amounts of information about people, customers, suppliers and operations. It can be found in surveillance data, operational data, financial data, scientific data and research notes, as well as in social media. It is changing the way we work and the way we play.

What is Big Data?

Big data essentially means that we have become so good at capturing, transferring and digitally replicating information that it is substantially outstripping our ability to organize it, let alone analyze it, make sense of it, or act on it. 9/11 was a wake-up call to the criticality of understanding the intelligence buried within the data, where the speed, quality and scope of "understanding" can literally mean life or death. In that situation, more data was not necessarily the answer. In fact, government intelligence had an abundance of data but was not able to connect it quickly enough or in a meaningful way.

We can no longer cope with understanding unstructured data manually in a big data world. It is too big (volume), accumulating too fast (velocity) and found in too many forms (variety). Hidden in this data are unknown and potentially valuable relationships among people, places and time. Understanding these elements of data in context is critical to turning data into useful information. These relationships are "hidden" because the tools we have today struggle to provide context along with the data elements.

If you know exactly what you are looking for, there are a number of solutions that can help you—from keyword search to entity extractors. But what if you don't know what you are looking for? What if there is information available that you didn't know existed? How would you even know what to ask for? Too often, today's solutions will simply (or not so simply) provide another way of producing a more refined list of documents where the answers may be located—but in the end the user is still confronted with the challenge of "reading to understand."

Why is "Automated Understanding" so Difficult?

What does it take to truly understand something and connect the dots? Eighty percent of our time is spent on awareness, reading, relating and comprehending, while the remaining 20% is spent on inference, interpretation, prediction and creation. The challenge is to automate the piece that consumes 80% of our time, freeing us up to be more creative and productive.

This automation requires a radical departure from some fundamental practices. First, we will need to move from "document-centric" solutions to "entity-centric" solutions: solutions that extract the key facts and relationships from the material, fundamentally reducing the amount of "reading to understand" left to the end user. Second, we must move to solutions based on machine learning, where no a priori definitions (e.g., a taxonomy or ontology) are required in order to achieve understanding. This second component would radically change the amount of data preparation and maintenance required before unstructured data analytics could start (or be effective). And of course, these solutions will need to implement the latest cloud-scale technologies to keep up with the volume and velocity of big data today.
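
To make "entity-centric" concrete, the short sketch below pulls people, organizations and places out of a passage of text and records which entities are mentioned together, rather than returning a list of documents to read. It is an illustration only: it uses the open-source spaCy library and a toy co-mention heuristic, not Digital Reasoning's Synthesys, and it assumes spaCy's small English model has been installed.

# Illustrative sketch only: extract entities and simple co-mentions from raw
# text instead of returning a ranked list of documents to read.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import itertools
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

text = (
    "Jane Doe, a compliance officer at Example Bank, met with John Smith "
    "in New York on March 3 to discuss a suspicious wire transfer."
)

doc = nlp(text)

# Entity-centric view: key facts, not documents.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)  # e.g. [('Jane Doe', 'PERSON'), ('Example Bank', 'ORG'), ...]

# A crude "relationship": entities that appear in the same sentence.
co_mentions = Counter()
for sent in doc.sents:
    names = [ent.text for ent in sent.ents]
    for a, b in itertools.combinations(sorted(set(names)), 2):
        co_mentions[(a, b)] += 1
print(co_mentions.most_common(5))

Even this toy version shows the shift in output: the end user receives entities and connections, not another reading list.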

The Importance of "Self-Learning" Applications

We will not be able to solve the problems of complex big data without self-learning applications: a class of applications that gets smarter the more data it is exposed to, and therefore makes its users more productive over time. The applications we use every day, like browsers and word processors, do not get much smarter the more they are used. But applications like Siri do get smarter as they are used. These are the applications that fuse human communications with machine learning, and these are the applications of the future.

More specifically, for big data analytics we need to develop algorithms that find patterns in human language, allowing meaning to be built up from symmetries in the text itself so that a computer can bootstrap structures that are normally hand-engineered, such as ontologies and semantic models. This type of intelligent, self-learning software forms the technical engine for automated understanding.
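
The idea that structure can be learned from text alone, with no ontology supplied up front, can be shown with a deliberately tiny sketch. In the hypothetical example below, two words that were never declared to be related ("analyst" and "officer") end up with matching context profiles simply because they appear in the same kinds of sentences. This is a toy illustration of the principle, not a description of how Synthesys works.

# Minimal sketch of learning from "symmetries" in raw text: count the words
# that appear near each word, with no taxonomy or ontology defined in advance.
# Words used in similar contexts end up with similar profiles.
import math
from collections import Counter, defaultdict

corpus = [
    "the analyst reads the report",
    "the officer reads the memo",
    "the analyst writes the report",
    "the officer writes the memo",
]

window = 2
contexts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                contexts[word][tokens[j]] += 1

def cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[w] * b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / norm if norm else 0.0

# Rank every other word by how similar its contexts are to those of "analyst".
sims = {w: cosine(contexts["analyst"], c) for w, c in contexts.items() if w != "analyst"}
for word, score in sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(word, round(score, 2))
# "officer" ranks first even though nothing in the data ever linked the two words.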

The automated understanding engine ingests structured and unstructured data in multiple languages. It fuses data from multiple sources and types. The fused data is then processed through a number of integrated functions that enable awareness, reading, relating and comprehending; a simplified sketch of this flow follows the list below.

The results of automated understanding are:

  • People are understood in space and time;
  • Connections/relationships become visible and in context; and
  • These elements of knowledge are made available to applications directly.
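
As a simplified, hypothetical sketch of that flow (the names and structures below are illustrative only, not Synthesys's actual interfaces), consider fusing one structured source and one set of facts extracted from text into a single entity-centric store that applications can query directly:

# Hypothetical end-to-end sketch: records from a structured source and facts
# extracted from text are fused into one entity-centric store, so applications
# receive relationships in context rather than documents.
from collections import defaultdict

# "Structured" input: rows from an operational system.
transactions = [
    {"person": "Jane Doe", "place": "New York", "time": "2014-03-03"},
]

# "Unstructured" input: facts already extracted from text (see earlier sketch).
extracted = [
    {"person": "Jane Doe", "person_2": "John Smith", "time": "2014-03-03"},
]

# Fuse: index every fact by the person it mentions, keeping its source.
profiles = defaultdict(list)
for source, records in (("transactions", transactions), ("text", extracted)):
    for record in records:
        for key, value in record.items():
            if key.startswith("person"):
                profiles[value].append({"source": source, **record})

# Entity-centric output: an application asks about a person directly,
# instead of being handed a list of documents to read.
for fact in profiles["Jane Doe"]:
    print(fact)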

Automated understanding is the next wave of analytics. It will take the heavy-lifting tasks of awareness, reading, relating and comprehending off the analyst's plate, leaving time for the creative and productive work of inference, interpretation, prediction and creation.


For more information about Digital Reasoning or Synthesys, visit www.digitalreasoning.com or send email to info@digitalreasoning.com.
