Using Conceptual Search to Discover the Needle in the Haystack, By Engenium Corporation

Statistics indicate that more than 90% of new corporate data is created electronically, and that 40% of that data is never converted to paper.[1] This deluge of corporate data raises serious issues about storage, accessibility and compliance, particularly in litigation, where discovery requests must be answered accurately. These issues concern not only the volume of electronic data and where it is stored; they also concern determining which information will be valuable for a client’s defense, and in particular how to handle “hot potatoes” in the discovery process.

Numerous examples exist of cases won or lost on the discovery of a single word or phrase that resided in the e-mail system. Commonly used electronic data discovery (EDD) tools help manage the discovery process quickly and cost-effectively by locating the obvious matches in electronic data, but few help identify documents that are relevant yet not so obvious. EDD systems that cannot recognize ambiguity in data leave their client companies exposed to failure in the electronic discovery process. Consequently, finding a technology that helps law firms and legal departments deal with the ambiguity and sheer volume of electronic data in litigation is critical.

Most of these tools provide a search capability based on keyword searching. As many of us know, keyword searching is limited by the user’s knowledge: it finds only the terms the user thinks to enter. Until now, computers have not been able to extend the user’s querying capacity beyond keyword search to capture the meaning and context created when words are used in particular combinations with other words.

Electronic discovery is about much more than finding a series of expected keywords in documents. True electronic discovery is about uncovering the hidden relationships and connections that give meaning to words and phrases. Current attempts to capture these contextual relationships between words rely on manually augmenting keyword searches with query helpers such as lexicons, synonym lists and thesauri. With the rapid growth of information, and consequently of knowledge, it is simply impossible to update these systems by hand as quickly as new information arrives.
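The limits described above can be made concrete with a small sketch. It is illustrative only: the e-mail snippets, the synonym list and the function names are all invented for this example and do not come from any particular EDD product. A plain keyword search misses a message that uses a different word, a hand-maintained synonym list recovers it, and a third message still slips through because nobody thought to add its wording to the list.

```python
import re

def tokenize(text):
    """Lowercase a string and return the set of words it contains."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_search(query_terms, documents):
    """Return ids of documents that contain at least one query term."""
    terms = {t.lower() for t in query_terms}
    return [doc_id for doc_id, text in documents.items()
            if terms & tokenize(text)]

def expand_with_synonyms(query_terms, synonyms):
    """Add the hand-maintained synonyms of each query term to the query."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= synonyms.get(term, set())
    return expanded

# Hypothetical e-mail snippets, invented for illustration.
documents = {
    "mail-1": "Please shred the audit files before Friday.",
    "mail-2": "Lunch is moved to noon.",
    "mail-3": "Make sure those spreadsheets disappear this week.",
}

# Hypothetical synonym list that someone must curate and update by hand.
synonyms = {"destroy": {"shred", "delete", "erase"}}

plain_hits = keyword_search(["destroy"], documents)
expanded_hits = keyword_search(
    expand_with_synonyms(["destroy"], synonyms), documents)

print(plain_hits)     # [] : no message contains the word "destroy"
print(expanded_hits)  # ['mail-1'] : "shred" is on the list, "disappear" is not
```

The message about spreadsheets that should "disappear" is arguably the most interesting of the three, yet it is found only if someone anticipates that phrasing and adds it to the list.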

A new generation of search technology, conceptual search, is able to query various document types (e.g., MS Word documents, e-mails and PDFs) based on concepts rather than keywords. Combined with keyword and structured search, this technology will likely reduce a legal team’s exposure while significantly shortening the time it takes to discover the needle in the haystack.

Conceptual Search

The introduction of conceptual search technology to electronic discovery has expanded the view of a document from the sum total of its keywords to a virtual map of the words and phrases that make up concepts, ideas and patterns. The result is an associative, almost “human-like” understanding of the documents, which is then leveraged to discover the needle in the haystack. Conceptual search technologies are typically based on high-level mathematical algorithms designed to deliver fast, accurate search results. Natural derivatives of these technologies that also assist in electronic discovery are categorization and duplicate detection, both of which are greatly enhanced by superior search technologies.
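Duplicate detection, mentioned above as a derivative capability, can be sketched in a few lines. The article does not say how any particular product detects duplicates, so the following uses a common stand-in technique, word shingles compared by Jaccard similarity, rather than a conceptual model. The sample messages, the shingle size and the 0.5 threshold are assumptions chosen for illustration.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Shared shingles divided by all distinct shingles in both sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(documents, threshold=0.5):
    """Return pairs of document ids whose shingle overlap meets the threshold."""
    ids = sorted(documents)
    sets = {doc_id: shingles(documents[doc_id]) for doc_id in ids}
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if jaccard(sets[first], sets[second]) >= threshold:
                pairs.append((first, second))
    return pairs

# Hypothetical messages, invented for illustration.
documents = {
    "a": "Please review the attached contract before the meeting on Monday.",
    "b": "Please review the attached contract before the meeting on Tuesday.",
    "c": "The quarterly numbers look fine to me.",
}

print(near_duplicates(documents))  # [('a', 'b')]
```

Messages "a" and "b" differ by one word and share 7 of their 9 distinct shingles, so they are flagged as a pair; reviewing one of them is usually enough, which is where the savings in review time come from.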

The benefits of conceptual search for legal teams and EDD providers are:

  • Reduced Risk: Because conceptual search can identify documents that would have been overlooked by traditional keyword search, legal teams can be more confident in the thoroughness of the discovery process.

  • No Query Language or Syntax: Because conceptual search operates on concepts, not keywords, searches can retrieve relevant documents even when those documents do not share any words with the query. There is no special query language required; simply “copy and paste” a sentence, paragraph or entire document to discover conceptually related documents.

  • Time/Cost Savings: Conceptual search can dramatically reduce the time needed to winnow large volumes of data; reduced cycle times imply reduced costs.
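The claim that a search can retrieve a document sharing no words with the query can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Semetric’s algorithm, which the article does not describe. It scores documents by how often their words co-occur with the query’s words elsewhere in the collection. The messages, the stopword list and the function names are hypothetical.

```python
import math
import re
from collections import defaultdict

# Small hypothetical stopword list; real systems use far larger ones.
STOPWORDS = {"the", "is", "to", "we", "must", "a", "of", "and"}

def tokenize(text):
    """Lowercase a string and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS]

def build_term_profiles(documents):
    """Map each term to the set of document ids in which it occurs."""
    profiles = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            profiles[term].add(doc_id)
    return profiles

def association(term_a, term_b, profiles):
    """Cosine similarity between the document sets of two terms."""
    docs_a = profiles.get(term_a, set())
    docs_b = profiles.get(term_b, set())
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / math.sqrt(len(docs_a) * len(docs_b))

def concept_search(query, documents, profiles):
    """Rank documents by average association between query and document terms."""
    query_terms = set(tokenize(query))
    results = []
    for doc_id, text in documents.items():
        doc_terms = set(tokenize(text))
        pairs = len(query_terms) * len(doc_terms)
        total = sum(association(q, t, profiles)
                    for q in query_terms for t in doc_terms)
        results.append((doc_id, total / pairs if pairs else 0.0))
    return sorted(results, key=lambda item: item[1], reverse=True)

# Hypothetical messages, invented for illustration.
documents = {
    "mail-1": "Shred the audit files tonight.",
    "mail-2": "We must destroy the audit files.",
    "mail-3": "Lunch is moved to noon.",
}

profiles = build_term_profiles(documents)
query = "destroy evidence"
for doc_id, score in concept_search(query, documents, profiles):
    print(doc_id, round(score, 3))
```

Here "mail-1" receives a positive score even though it contains neither "destroy" nor "evidence": the collection shows "destroy" appearing alongside "audit" and "files", and "mail-1" is about those same things. The unrelated "mail-3" scores zero. The same function accepts a pasted paragraph or a whole document as the query, since it needs no query syntax.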

Law firms and corporate legal departments must be prepared to handle the electronic data explosion. Federal Rule of Civil Procedure 37, the Sarbanes-Oxley Act and other laws require it. Their EDD providers and consultants should be equipped to ensure that discovery deadlines are met without compromise. For all participants, the power of conceptual search combined with keyword and structured search will be essential in fulfilling these duties.


Engenium Corporation, creator of the award-winning Semetric™ next-generation conceptual search technology, enables organizations to rapidly search and retrieve structured and unstructured information. Engenium Semetric’s simple, embedded design powers new and existing applications and is currently deployed in numerous defense, intelligence, and commercial applications. Organizations such as Fios, Inc., Northrop Grumman, Korn/Ferry, the U.S. Department of Defense, and many others have recognized and selected Engenium Semetric for its accuracy, high performance, and ease of deployment.

[1] “Locate Smoking Guns Electronically,” Brian Ingram, Law Technology News, September 29, 200
