The voice of the author
Spending some time on Wikipedia rereading the definition of cognitive computing created by the Cognitive Computing Consortium, I found on reflection a number of keys to the definition’s uniqueness buried within the text. The key paragraph from my perspective is: “Cognitive systems differ from current computing applications in that they move beyond tabulating and calculating based on preconfigured rules and programs. Although they are capable of basic computing, they can also infer and even reason based on broad objectives.”
I specifically highlight the word “preconfigured.” To me, that means the conscious application of a person’s or team’s perspective and understanding to the process. Most of the time that is very helpful: the perspective the team brings to the processes they are working on helps organize information or tag it so others can find or leverage it more effectively. Yet in years of creating information solutions based on search, databases and other engines, I’ve learned that the ability to find the important content and data is limited because future information seekers have an incomplete understanding of “the voice of the author” (VOTA).
What is VOTA? It is the ability to understand the terms and relationships used by the author (the creator of a document, a database, a query, etc.) in relation to all other information being analyzed. That is a major challenge we see in search-based systems today. Consider the context of any given search. We have the predefined rules—taxonomies, tags, dictionaries, synonym lists, etc. We have a more or less organized collection of documents created over years by many different authors, and we have the query terms used by the searcher. Within each of those core elements, we always find different usages and even definitions of key terms—all of this variation arising from the different relationships to the subject matter assumed by the authors.
Let’s look at a real-world example of how VOTA interacts with that kind of discovery and retrieval environment. Consider a new chemical engineer who is looking for any reports or lab work done in the past on a chemical compound she needs to work on today. The system’s predefined rules and her terms miss the best documents available because the “authors” of those key documents used terms that are conceptually the same but not “known” to the system’s experts or the new chemist. That is a common shortcoming of “traditional” search tools, but it is also a situation where adding machine learning algorithms to identify the latent relationships of terms and how they are used can make the system adaptive and enable the chemist to leverage the best knowledge available.
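To make that concrete, here is a minimal sketch of one classic machine learning approach to latent term relationships—latent semantic analysis via a truncated SVD. The toy corpus, the compound names (“ethanol” vs. “ethyl-alcohol”) and the query are entirely hypothetical illustrations, not the tooling of any real system:

```python
import numpy as np

# Hypothetical toy corpus: two authors describe the same compound as
# "ethanol" vs. "ethyl-alcohol"; the last document is unrelated noise.
docs = [
    "ethanol solvent purity assay",
    "ethyl-alcohol solvent purity assay",
    "ethanol distillation yield report",
    "ethyl-alcohol distillation yield report",
    "fantasy football league schedule",
]
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Term-document count matrix.
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[idx[w], j] += 1

# Truncated SVD: keep the top-2 latent "concept" dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U2 = U[:, :2]                            # maps terms into concept space
doc_vecs = (np.diag(s[:2]) @ Vt[:2]).T   # documents in concept space

def query_vec(text):
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in idx:
            v[idx[w]] += 1
    return v @ U2                        # fold the query into concept space

def cos(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b) / (na * nb)

q = query_vec("ethanol assay")           # the new chemist's terms only
sims = [cos(q, dv) for dv in doc_vecs]
ranking = np.argsort(sims)[::-1]
```

Even though the query never mentions “ethyl-alcohol,” the documents that use it rank near the top, because the latent dimensions capture that both terms occur in the same contexts—a crude stand-in for understanding the VOTA.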
E-discovery software has been on that path for years, ahead of most other segments, but it still has a long way to go before showing the kinds of inference and reasoning capabilities referred to in the definition of cognitive systems. Lawyers give the system examples of documents responsive to the case (e.g., project X) and examples of documents non-responsive to the case (e.g., fantasy football emails). The machine learning engines find all similar documents (and emails/IMs, which today form the bulk of the documents) and perform other optimizations so the legal team can focus their time on what could be the smoking gun in the case. When the system can recommend which document potentially contains the smoking gun for a case, we will have crossed the line to a true cognitive computing system.
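The responsive/non-responsive training loop can be sketched in a few lines. This is a deliberately simplified illustration—a naive Bayes scorer over invented seed documents—and not how any commercial e-discovery engine actually works:

```python
from collections import Counter
import math

# Hypothetical seed sets: lawyers label a few documents as responsive
# ("project X") or non-responsive (fantasy football chatter).
responsive = [
    "project x contract draft pricing terms",
    "project x pricing negotiation memo",
]
non_responsive = [
    "fantasy football draft order this weekend",
    "football league waiver wire picks",
]

def train(docs):
    counts = Counter(w for d in docs for w in d.split())
    return counts, sum(counts.values())

def log_prob(text, counts, total, vocab_size):
    # Multinomial naive Bayes with Laplace (add-one) smoothing.
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size))
        for w in text.split()
    )

vocab = {w for d in responsive + non_responsive for w in d.split()}
r_counts, r_total = train(responsive)
n_counts, n_total = train(non_responsive)

def score(text):
    # Positive score -> more likely responsive to the case.
    return (log_prob(text, r_counts, r_total, len(vocab))
            - log_prob(text, n_counts, n_total, len(vocab)))

unreviewed = [
    "memo on project x pricing terms",
    "who won the fantasy football league",
]
# Surface the likely-responsive documents for the legal team first.
ranked = sorted(unreviewed, key=score, reverse=True)
```

The point of the sketch is the workflow, not the model: labeled examples in, a ranked review queue out, so reviewers spend their hours on the documents most likely to matter.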
Another example is the frustrating job of resume search. Most systems today basically match on keywords and use parsed fields (name, address, phone, etc.) as filters. Yet what if the whole job description or resume could be used as a query, and the VOTA was fully understood, so that great candidates or jobs are the top results rather than virtually random entries on a list? A new generation of job boards and applicant tracking systems (ATS) is leveraging machine learning add-ons to do just that. For example, your best developer just left. Yes, there is a job description, but it would be easier to just hand the recruiter the resume of the person who left and say, “Find me more people like this person.” The recruiter then feeds the whole resume to the system to find other people with the same skills and experiences.
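The “whole resume as query” idea is, at its simplest, document-to-document similarity. Here is a bag-of-words baseline with made-up resumes; the machine learning add-ons described above would replace these raw keyword vectors with learned representations that actually capture the VOTA:

```python
from collections import Counter
import math

def vec(text):
    # Bag-of-words term frequencies.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical: the departed developer's full resume is the query.
departed = "senior python developer distributed systems kafka aws ten years"
candidates = {
    "A": "python developer distributed systems kafka aws",
    "B": "marketing manager social media campaigns",
    "C": "java developer spring microservices aws",
}
q = vec(departed)
ranked = sorted(candidates,
                key=lambda c: cosine(q, vec(candidates[c])),
                reverse=True)
```

With this baseline, candidate A outranks the irrelevant profiles—but note that a resume saying “message broker experience” instead of “kafka” would be missed, which is exactly the gap the learned approaches aim to close.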
Update building blocks
We often forget that in most of our existing information-based solutions, core context, relationships and language understandings are preconfigured. Those preconfigured “insights” carry the biases, perspectives and then-current understanding of relationships of the experts who created them. (Sad to say, in many companies those building blocks are not updated or maintained, and the users do not understand why the solution quality continues to slide.) Continually updating conceptual understanding, by contrast, is core to human learning. When giving talks on conceptual understanding, I use the following example: What is the first word that comes to mind when you hear the word “coffee”? Turn to the person next to you and tell them. In most cases no one hears the same word but understands it right away. In a handful of cases, a simple explanation makes the light bulb go on, and the person’s conceptual understanding of “coffee” has been expanded. (I am still surprised how few people know what a French press is.)
Having key parts of a cognitive computing solution leverage machine learning approaches is critical. Only by moving beyond the rules-based methods common today can cognitive solutions identify relationships in the terms used and recognize different words conveying the same concept in different content—understanding the VOTA, the ultimate translator. To be truly adaptive, interactive, iterative and contextual, key components of an overall solution must identify, leverage and expand their understanding of concepts so the solutions can truly be a “partner” to humans.