Semiotics for enterprise search
Readers of KMWorld are comfortable with the semantic processes that knowledge management systems employ to make sense of business information. There are knowledgebases, taxonomies, controlled vocabularies and access control tags. Knowledge management is more than key word search and retrieval, although finding information is part of the discipline.
Many of the flagship companies offering knowledge management systems have wrapped a basic string matching system with various enhancements. Some systems process e-mail and use data from those analyses to pinpoint an individual who is a "hub" for information exchange in an organization. Other vendors' systems generate relationship maps, sometimes described as social graphs. With those maps, a manager can learn who is an expert in a particular topic based on the content flowing through the system. There are many variations, and organizations have experienced major successes with systems from different vendors. At the same time, other licensees of the same vendors' systems report less helpful results.
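For readers who want a concrete picture of the e-mail "hub" analysis mentioned above, here is a minimal sketch in Python. It is not any vendor's method. It assumes, purely for illustration, that the message log has been reduced to (sender, recipient) pairs, and it treats the person with the most distinct correspondents as the hub. The names `find_hubs` and `messages` are hypothetical.

```python
from collections import defaultdict


def find_hubs(messages, top_n=1):
    """Rank people by how many distinct colleagues they exchange mail with."""
    contacts = defaultdict(set)
    for sender, recipient in messages:
        if sender == recipient:
            continue  # ignore notes-to-self
        contacts[sender].add(recipient)
        contacts[recipient].add(sender)
    # Most correspondents first; ties are broken alphabetically.
    ranked = sorted(contacts, key=lambda person: (-len(contacts[person]), person))
    return ranked[:top_n]


# Hypothetical sample data: (sender, recipient) pairs from a mail log.
messages = [
    ("ana", "ben"), ("ana", "cho"), ("dev", "ana"),
    ("ben", "cho"), ("eli", "ana"),
]
hubs = find_hubs(messages)
```

Commercial systems presumably weigh much more than raw contact counts, such as message content, timing and direction, so this shows only the basic idea.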
I had an opportunity to learn about a new approach to accessing information within an organization. Unlike most of the systems we have tested, Sophia Search, headquartered in Belfast, Northern Ireland, has enlisted the art and science of semiotics to make information access and management more useful. Collaborating with colleagues at the State University of St. Petersburg, Russia, Dr. David Patterson of the University of Ulster in Northern Ireland developed a patented approach to information retrieval. He and his co-founder, Dr. Vladimir Dobrynin, established Sophia Search in 2007. The company is now competing in the enterprise software sector with the likes of Autonomy, Endeca, Exalead, Google and Microsoft, among others.
Semiotics focuses on signs and symbols as indicators of meaning. As implemented, the approach enables "Sophia to understand and interpret the meaning and context of information within documents," Patterson said when I interviewed him in February. Sophia has tuned its approach so that the "meaning and relevancy of a document depends on both the user's query and, importantly, all the other documents within the organization," he added.
Patterson explained, "Sophia is based on a model of linguistics, called semiotics, which is the science behind how we as humans understand the meaning of information in context. This is the power behind the technology that drives our discovery engine and ability to improve the findability of information."
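The idea that a document's relevancy depends on the query and on all the other documents in the collection can be illustrated with a long-established technique, TF-IDF weighting. To be clear, this is not Sophia's patented semiotic method, which was not disclosed in detail. It is only a minimal Python sketch of collection-dependent scoring, and the function name `tfidf_scores` and the sample documents are my own assumptions.

```python
import math
from collections import Counter


def tfidf_scores(query, documents):
    """Score each document against the query using plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                # A term found in every document gets an IDF of zero,
                # so the rest of the collection shapes each score.
                idf = math.log(n_docs / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


# Hypothetical three-document collection.
docs = [
    "merger contract review",
    "contract renewal schedule",
    "cafeteria lunch schedule",
]
scores = tfidf_scores("merger contract", docs)
```

In this toy collection the first document scores highest because "merger" is rare. If the cafeteria document is removed, a query for "contract" alone scores zero against the first document, because the term then appears everywhere and no longer distinguishes anything.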
On the surface, Sophia works like Bing.com or Google.com or one of the big enterprise search solutions from Autonomy or Exalead. Sophia describes its approach as a "contextual discovery engine." The system processes the words in the source document and then automatically disambiguates the different meanings of words based on their context in the source document.
Autonomy's "meaning-based computing" and Recommind's approach appear to be similar to the Sophia system and method. However, Patterson pointed out that semiotics is the key to the firm's technology. Sophia searches by the meaning of what users are looking for, as opposed to just the key words they use in their query. Sophia is designed to enable users to discover contextually relevant information of which they were previously unaware and to increase users' understanding of their content.
Most information processing systems require the system administrator or a subject matter expert to tune the system. Patterson said, "One of the benefits of our technical approach is that Sophia operates without human guidance or training, and it does not require taxonomies, ontologies or thesauri."
According to Patterson, Sophia's approach is novel, possibly unique, in the world of information retrieval and content processing. "What motivated me was solving the problems that faced the world of search from a research perspective. I was aware of the limitations of some commercial systems through my own research and experience," he said. "The more I thought about the problems of locating the information I needed, I began to question the basic assumptions that conventional search vendors make."
For knowledge management, the practical appeal of Sophia is that the system can work alongside other enterprise software systems. If an organization has licensed a major vendor's search system, Sophia can enhance that system. It also works with the Google Search Appliance. Patterson said, "Sophia was developed with an open architecture to enable ease of integration through our Java APIs and RESTful Web services. In this way, we have made it easy to augment other search tools with Sophia's contextual capabilities and to build additional applications based on third-party products."
Crafting a query, not so easy
At the core of Sophia's approach is the developers' effort to avoid the pitfalls that have plagued other information retrieval systems. Those included the assumption that the user knows what he or she is looking for before running a query. The Sophia team recognized that many users cannot express a specific information need. As a result, Sophia's developers wanted to provide a system that does not force the user to craft a query.
Patterson said, "Experience and research revealed that users often find creating a quality query a challenge."
The Sophia system features a search box, but the system also displays links to other relevant content. Those links are generated by the company's patented semiotics approach. According to Patterson, "[Because] the system method organizes and presents information contextually, users spend less time sifting through irrelevant information and can focus on information that they know is of value."