An increasing need for semantics

Short messages, e-mails and time-shifted message threads were problematic. Among the reasons the speakers listed was unusual spelling, such as "gr8" for great. The challenge was that short strings were often ambiguous and difficult to disambiguate. "Disambiguate" is a $5 word for the work required to "figure out what something means." The problem was often one of "space." That is, the form factor of mobile devices imposes constraints on messaging and document creation. The other constraint was "time." Since mobile devices were often used without the person sitting in a fixed location, users omitted certain information because of an interruption or the need to squeeze in a message between meetings.

Meeting the challenge

I spoke with Luca Scagliarini, one of the senior executives of Expert System (expertsystem.net), a search and content processing development company headquartered in Italy. Expert System has developed COGITO. Here's how Scagliarini described the technology to me: "Our COGITO technology enables conceptual and natural language-based search and provides higher precision and recall in automatic categorization and concept extraction compared to traditional keyword, statistic or shallow linguistic based technologies."

That type of technology is "semantic"; that is, the system uses various methods to figure out what content means. As important as that function is for traditional search and retrieval, semantic technology seems to be a must-have when it comes to the type of content that is a natural consequence of the Enterprise 2.0 trend.

Exalead, a division of Dassault Systemes, offers its Voxalead and Tweepz services to showcase Exalead's technology for handling the new types of content. The Voxalead service converts audio or video to text and then indexes that content. One interesting feature of the Exalead rich media implementation is that the result list provides a link directly to the point in the video where the search word or phrase appears. The Exalead Tweepz service allows you to find and discover interesting people on Twitter.
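The idea of linking a search hit to a moment in a video can be illustrated with a small sketch. This is not Exalead's implementation, which the vendor has not described here; it simply assumes that a speech-to-text step has already produced transcript segments with start times. The segment data and the function names build_index and search are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def build_index(segments):
    """Map each lowercased word to the start times (in seconds)
    of the transcript segments that contain it."""
    index = defaultdict(list)
    for start_seconds, text in segments:
        # Use a set so a word repeated within one segment is recorded once.
        for word in set(text.lower().split()):
            index[word].append(start_seconds)
    return index

def search(index, word):
    """Return the sorted start times at which the word is spoken."""
    return sorted(index.get(word.lower(), []))

# Hypothetical transcript: (start time in seconds, recognized text).
segments = [
    (0.0, "welcome to the quarterly review"),
    (12.5, "revenue grew in the cloud segment"),
    (47.0, "cloud costs also increased"),
]

index = build_index(segments)
print(search(index, "cloud"))
```

Each returned time could be attached to a result link so that playback starts where the word is spoken. A production system would also have to cope with recognition errors, punctuation and phrase queries, none of which this sketch attempts.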

Autonomy, Google, IBM, Microsoft and other companies have demonstrated technologies that can make sense of the new types of content appearing in organizations.

But organizations change slowly. The larger the organization, the more difficult it is to keep search synchronized with the needs of the users. My view is that semantic technology can play a significant role in providing context and functionality to queries for new types of content. The types of technology available from Exalead can make the information in a compound document available for search, retrieval and text mining.

The hurdles may be organizational, managerial and financial. Cloud services like those provided by Salesforce.com and Amazon may be better placed to process and make available to an enterprise the "information" locked in those new message types and hybrid content containers. Most companies that try to implement this type of next-generation semantic and rich media content processing system may be unable to deploy the system quickly. The Amazons and the Salesforce.coms would then become increasingly important in the enterprise content processing sector.

The reasons range from economies of scale to the ability to meet the needs of a distributed work force. We might be entering a transition period during which the types of enterprise content processing performed on premises are sharply defined, for example, litigation support or specific business intelligence functions. Broader types of communication services and the content generated within those services will reside in the cloud. One thing is certain: the shift from text to compound content is taking place.

