The universe of search
Navigating through the complex enterprise search marketplace
By Andy Moore, KMWorld publisher
The recent and profound explosion in the enterprise search marketplace can be attributed to one word: "expectation."
Nearly every observer, analyst and vendor credits the ubiquity and success of the commercial Web search engines--and Google in particular--with driving the demand for enterprise search.
"It's easier to find box scores from the 1957 World Series than it is to find last quarter's sales presentation in the enterprise. While Web search has gotten really good, enterprise search has stagnated," says Dave Girouard, general manager of Google's new Enterprise division.
In many practical ways, the enterprise search vendors are victims of success … someone else's success. Matching the expectations of the public Web experience to the demands of the enterprise has created a mini-industry of search vendors who are approaching the problem from every direction imaginable.
[By the way, just in case you're thinking that public search engines have dominated Web usage, think again. A study by HitWise (hitwise.com), a Web-traffic tracking firm, showed that visits to the top three public search sites--Google, Yahoo and MSN Search--accounted for only 5.5% of all Web traffic measured during the last week in May 2004. "Adult" sites accounted for 18.8% of total traffic during the same period.]
"Once Internet search became available, people expected that same level of availability" in their business lives, says Whit Andrews, research director for Gartner Research (gartner.com), and the author of Gartner's 2004 "Magic Quadrant" for Enterprise Search, released in May 2004. Andrews traces the lineage of the current enterprise landscape to the first era of the public search engines--circa 1997--when the AltaVistas and Inktomis quickly gave rise to the second tier of public engines such as Yahoo, Excite, Lycos, Open Text and ultimately Google, in about 2000.
"The problem then was perception--people thought that search engines just didn't work," explains Andrews. And they were right. "There used to be a nagging feeling that what you were looking for just wasn't there," he continues. "If you looked for something, you were lucky if it was there to be found." Now that's all different, and some of the public search engines' success can be attributed to the simple existence of greater amounts of content being made available.
The new problem is one of trust and certainty. Now, we know it's there … we just can't find it. And that uncertainty is driving the business demand for reliable search to this day.
Enterprise vs. public: the difference
No doubt about it: Enterprise search must meet the performance levels that users have come to expect on the Web. But the two kinds of search are not the same. For one thing, a search on the Web is expected to yield the "best" answer out of many possible answers. A search within your enterprise is supposed to provide the correct answer. Big difference.
Techniques for scaling and performance developed on the Web can be adapted to the enterprise, but many techniques for searching, organizing and mining information on the Web are simply not applicable to the enterprise.
Here's why: The overwhelming majority of information in an enterprise is unstructured--not existing in relational databases. That unstructured information exists in the form of HTML pages, documents in proprietary formats, forms, paper, e-mail and other media objects.
The structured stuff is where the value is perceived to be (because it drives revenue-creating business processes), and that's why Oracle, Microsoft and the ERP vendors such as SAP have become the behemoths they are. So the natural tendency is to treat unstructured content in a similar manner, which means imposing some "structure" onto the chaos. That's called classification and taxonomy, and in one way or another, those text-structuring concepts underlie all the efforts of the enterprise search vendors. (It also represents an expensive front-end investment during the capture, or acquisition, phase of information management and thus has its own set of drawbacks.)
Despite some commonalities, most of today's vendors have developed exotic variations, based on linguistic and semantic black magic, or have developed very specific vertical specialties. All those varieties and nuances among the vendor offerings result in a very messy shopping list for the business manager who simply wants to do his job better.
The analysts at Ovum call it "next-generation search," which they define this way: "The technologies and products that are bringing new levels of intelligence, order and personalization to the search process. Next-generation search technology is about providing more intelligent and more proactive tools to allow users to take control of the vast amounts of information that the networked world makes available. It overlaps with terms such as information retrieval, knowledge discovery and intelligent search, and it encompasses a wide variety of technologies and products."
It sure as heck does.
Decision time … good luck
The Gartner/Whit Andrews Magic Quadrant includes 20 vendors … with only three (Autonomy, Verity and FAST) in the "leader" square. And FAST is a new addition this year. Does that mean that buyers should consider only those vendors on their short list? Ha. You wish it were that easy.
"People call us up and ask 'What's the best search engine?'" relates Andrews. "And we ask them 'What are you doing with it?' And they say, 'Why does that matter?'"
Of course, it DOES matter. Just as other software developers have identified certain vertical markets or functional specialties to focus on, the search engine vendors have carved themselves small niches within the overall "enterprise search" marketplace. Take e-commerce for example: Endeca and EasyAsk are two vendors identified with providing search tools specifically tuned for catalog-retail applications. An e-commerce short list should probably include them (even though a number of others including Mercado and InQuira and the "biggies" like FAST, Verity and Autonomy can lay claim to an e-commerce application, too).
A vendor assessment for, say, intelligence gathering for security applications would lean toward Convera or InXight, just as a call center/marketing department deployment would veer toward InQuira or Entopia or iPhrase. If you're looking to run search as a lower-cost desktop utility, ISYS will happily provide you a shrink-wrapped box that will serve your needs just fine. And then there's the Google Appliance, as it's called--a bright yellow box that plugs in and does it all, if you accept the manufacturer's claims. And many do.
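The use-case matching described above can be pictured as a simple lookup table. The following Python sketch is only an illustration: the category labels and the function name `starting_short_list` are assumptions made for this example, and the vendor groupings simply restate the ones named in the preceding paragraphs. Appending the three "biggies" to every list is likewise an assumption for the sketch, not a ranking or an endorsement.

```python
# Use-case labels are hypothetical; vendor groupings restate the article's examples.
SPECIALISTS = {
    "e-commerce": ["Endeca", "EasyAsk"],
    "intelligence gathering": ["Convera", "InXight"],
    "call center/marketing": ["InQuira", "Entopia", "iPhrase"],
    "desktop utility": ["ISYS"],
}

# Assumption for this sketch: the generalists are appended to every list.
GENERALISTS = ["FAST", "Verity", "Autonomy"]


def starting_short_list(use_case: str) -> list[str]:
    """Return the specialists for a use case first, then the generalists."""
    specialists = SPECIALISTS.get(use_case.strip().lower(), [])
    return specialists + [v for v in GENERALISTS if v not in specialists]


if __name__ == "__main__":
    print(starting_short_list("e-commerce"))
```

An unrecognized use case falls back to the generalists alone, which mirrors the point that the question "What are you doing with it?" has to be answered before any specialist belongs on the list.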
Even when a vendor calls certain functionality a "specialty," very few want to limit themselves (and thus withdraw from a marketplace) by claiming exclusive domain expertise. "Nobody wants to say 'That's all I do,'" explains Andrews, "but a lot of people want to say 'That's what I do best.'"
To make matters more complicated, other players in the content management and knowledge management spaces regard their search functionality as a significant enough portion of their product that they insist on being referred to in search-vendor assessments. Some of those, when asked to separate the sales revenues attributed to search functionality, will confess that search alone isn't really a viable part of the business. But they have search, darnit, and have as much right to a market position as anyone else.
Well, this is my chance to dodge responsibility. Gartner's Magic Quadrant assessment is a good starting place for those in the market for search technology, and below is a brief synopsis of Andrews' market analysis. See the "Universe of search" sidebar for a more complete vendor sourcing.
Gartner's Enterprise Search MQ
Here's a synopsis (deeply truncated) of the new Gartner Magic Quadrant market analysis. If you are evaluating enterprise search vendors, I recommend you contact Gartner to read the report in its entirety:
Autonomy has a wide range of strong relevancy determination models and develops new ways to collect and display search results. It is aggressively pursuing vertical markets, particularly for contact center support and compliance needs.
Fast Search & Transfer (FAST), now in the Leaders quadrant, has experienced explosive growth, providing better-than-average means and an expanding list of approaches to determining relevancy.
Verity's product range, in terms of variety of price and sophistication, surpasses the competition. Its purchase of forms vendor Cardiff Software and natural language processing and response vendor NativeMinds reveals Verity's ambition to burst free of the search label and emerge as a significant player in search-founded business process fusion applications.
Convera is the strongest vendor in the government sector and has improved in its commercial pursuits. No other vendor incorporates document and data classification and categorization into the search process more effectively; only Autonomy matches it in video and audio storage and search. Convera's scale and security capabilities also are beyond reproach.
EasyAsk moved to the Visionaries quadrant from the Niche Players quadrant on the strength of dramatic demonstration of architectural facility and ambition to expand well beyond its e-commerce base.
Endeca has expanded its ambition from that of "guided navigation" (its term for classification and category browsing, which it has masterfully turned into an "RFP requirement") to a wider field of generalist search and knowledge management. It has dramatically improved its relevancy capabilities and increased its commitment to general research-oriented search technologies.
Entopia's architecture is sophisticated, and its relevancy is differentiated mainly around user behavior and modeling. Incremental development related to security, document classification and plain language analysis should be next on its to-do list.
InQuira's focus on the acceptance of plain-language queries and examination of document contents and structure at a semantic level sets it apart from other self-service specialist vendors. InQuira is particularly effective in interactions among self-service projects and CRM applications.
iPhrase originated the notion of modeling responses beyond results lists for ordinary users. It has expended significant effort on providing connectors among enterprise applications and projects it originates, and provides particularly good support for managing numeric data in fields.
Kanisa acquired the Ask Jeeves enterprise search product and subsequently increased its efforts in the market for search-powered CRM and self-service solutions. Kanisa has invested in relevance oriented toward external novice users, intended to support e-commerce and product support.
Recommind provides significant flexibility in its relevancy determination models and depth in the use of user behavior to examine the models. Its greatest appeal is for KM-oriented installations where expert users' behavior may be leveraged to improve results relevancy.
Google's brand drives its search appliance to great popularity. Enterprises often include it on short lists (sometimes inappropriately) due to its brand strength.
Open Text (new to the Magic Quadrant) has reintroduced its search product as a standalone server. It is a credible alternative for risk-averse enterprises.
Hummingbird retains a substantial customer base from its acquisitions of early-generation search vendors. It focuses on selling a broader set of products that satisfies enterprises seeking a smart enterprise suite.
Intelliseek specializes in market and competitive intelligence.
ISYS has significantly extended its presence through sales in the law enforcement field. Consider it for internal installations where users may train themselves.
Mercado Software, formerly an e-commerce specialist, has introduced an enterprise search product that demonstrates better-than-credible architecture. Its results output capabilities and interoperability with other applications are strong.
Mondosoft is limited in its ability to index RDBMS data, but its relationship with Microsoft to address the Microsoft Content Management System has helped it grow. Appropriate for tactical installations, particularly where Microsoft products are exclusively used.
Thunderstone has a simple architecture, is priced reasonably and supports a broad variety of operating systems.
ZyLAB, another new addition to the Magic Quadrant, orients its efforts to law enforcement, compliance and litigation. One of its strengths is the ability to address scanned documents.
All right. So you have the "Universe of Search" list, and you have the Gartner Magic Quadrant. Do you feel any closer to making a final decision? Probably not.
Here's a decision tree that will help you narrow the field, and perhaps begin to feel comfortable with your short list:
Step one: Decide if you're interested in using an application service provider (ASP). That's an IT strategy decision that your IT guys have discussed ad nauseam and for which they have by now developed a plan.
Step two: Decide on the operating system and/or application server, which is also an IT and architectural decision. Open? Microsoft?
The next step is a business decision, and will be determined by your understanding of the user experience that you want to achieve in your business. Do you prefer "guided navigation" or "natural language processing"? That basically means, do you want your users to be prompted through a point-and-click "treed" menu to arrive at the final answer? Or are you comfortable with your users simply typing a question into a "box"? And if so, do you want the analysis to occur on the corpus of information in your repository? Or do you mean natural-language analysis of the query, in which case you're asking the search engine to interpret what the user really meant when she typed in the question? Or are you looking at something in between?
These are not trivial, or easy, questions to answer. They go to the core of your business philosophy regarding the level of automation vs. the level of human interaction you think is appropriate. But thinking hard about these questions, then grilling the vendor reps, will help you arrive at the short list you desire and trust.
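The steps above amount to a sequence of filters applied to a candidate list. Here is a minimal Python sketch, assuming each candidate has already been tagged with answers to the three questions. The `Candidate` fields, the option strings and the names "Vendor A" through "Vendor C" are all hypothetical placeholders, not data about any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    offers_asp: bool      # step one: available as a hosted (ASP) service?
    platforms: frozenset  # step two: supported operating systems / app servers
    experience: str       # step three: "guided", "natural-language" or "both"


def narrow(candidates, want_asp, platform, experience):
    """Apply the three questions in order and return the surviving names."""
    # Step one: if an ASP is wanted, drop vendors that do not offer one.
    survivors = [c for c in candidates if c.offers_asp or not want_asp]
    # Step two: keep only vendors that run on the chosen platform.
    survivors = [c for c in survivors if platform in c.platforms]
    # Step three: keep vendors matching the desired user experience.
    survivors = [c for c in survivors if c.experience in (experience, "both")]
    return [c.name for c in survivors]


# Hypothetical candidates, invented purely for illustration.
CANDIDATES = [
    Candidate("Vendor A", True, frozenset({"windows"}), "guided"),
    Candidate("Vendor B", False, frozenset({"windows", "linux"}), "natural-language"),
    Candidate("Vendor C", True, frozenset({"linux"}), "both"),
]

if __name__ == "__main__":
    print(narrow(CANDIDATES, want_asp=False, platform="linux", experience="guided"))
```

The filters run in the same order as the steps, so the two IT decisions prune the list before the harder business question about user experience is applied. A real evaluation would weigh far more than three attributes; the sketch only shows how each answer shrinks the field.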
A directory of enterprise search vendors
Microsoft Site Server
Open Road Technologies
Thomson Scientific
Thunderstone