Search evolves to solve business problems
Concern about security is sometimes an obstacle to the implementation of search technology. “We know of one company that turned off its search engine because it was getting to unsecure content,” Monarko says. “Without the proper controls, search can violate privacy regulations. Velocity works with the enterprise’s security system, including having control at the document or even at the sub-document level if a given document has different levels of access within it.”
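The document-level and sub-document-level control described above can be illustrated with a minimal sketch. This is not Velocity's implementation; the `Document`, `Section` and `visible_hits` names, the group labels and the sample content are all hypothetical, and a real deployment would take permissions from the enterprise's security system rather than hard-coding them.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    text: str
    allowed_groups: frozenset  # groups permitted to read this section


@dataclass
class Document:
    doc_id: str
    allowed_groups: frozenset  # groups permitted to open the document at all
    sections: list = field(default_factory=list)


def visible_hits(documents, query, user_groups):
    """Return (doc_id, section text) pairs that match the query
    and that the user is allowed to read."""
    user_groups = frozenset(user_groups)
    hits = []
    for doc in documents:
        # Document-level check: skip documents the user cannot open.
        if not (doc.allowed_groups & user_groups):
            continue
        for section in doc.sections:
            # Sub-document check: skip sections with stricter access.
            if not (section.allowed_groups & user_groups):
                continue
            if query.lower() in section.text.lower():
                hits.append((doc.doc_id, section.text))
    return hits


# Hypothetical sample content for illustration only.
docs = [
    Document("hr-001", frozenset({"staff", "hr"}), [
        Section("Vacation policy for all employees", frozenset({"staff", "hr"})),
        Section("Salary bands and policy exceptions", frozenset({"hr"})),
    ]),
    Document("legal-007", frozenset({"legal"}), [
        Section("Pending litigation policy memo", frozenset({"legal"})),
    ]),
]
```

In this sketch, a user in the "staff" group who searches for "policy" sees only the vacation section of the HR document, while a user in the "hr" group also sees the salary section.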
Open source option
Open source software (OSS) is predicted to grow at more than 20 percent per year, according to IDC (idc.com), and has made inroads into many knowledge management markets, including enterprise content management, business process management and business intelligence. LucidWorks Enterprise from Lucid Imagination is an enterprise search platform built on open source Lucene/Solr search technology from Apache. LucidWorks Enterprise is free for development and test; production deployments require a subscription through which a variety of support options are available.
“LucidWorks Enterprise is an open source search platform that can be used out of the box,” says Marc Krellenstein, founder and CTO of Lucid Imagination, “and we provide ongoing support and customization for those organizations that want that service.” The cost of support is comparable to the cost of supporting proprietary products. “The advantage of using open source products comes strongly into play for large applications where licensing fees might become prohibitive for proprietary products,” Krellenstein says. “Considering that LucidWorks measures up to the proprietary products in terms of scalability and accuracy, the cost-effectiveness is a strong benefit.”
Organizations get the most from search technology when they first consider their needs and goals carefully. Although many organizations have search software in place, not all are satisfied with its performance, and that is sometimes the result of not having carried out a requirements analysis.
“Even though search is to some degree commoditized,” Krellenstein says, “there is not usually a single right answer to a search. It depends on the data and the user. For any given application, you can segment the data to get a better answer, but the downside is then you have more silos. Finding the right balance can be a challenge.”
Planning for search
One of the biggest obstacles to a successful search implementation is the failure to consider how all the parts of the enterprise work together. “It’s very important to have a knowledgeable project administrator who sees the big picture when you are searching across many repositories,” says Lynda Moulton, senior analyst and consultant at Outsell’s Gilbane Group. “These should be people who know not only the content, but also know the organization’s staff and business processes.”
An early stage in deploying a search application should include an overall assessment of content. “Text analytics can be valuable not just for business problem solving but also for obtaining an overview of information in the enterprise,” Moulton says. “A comprehensive linguistic statistical analysis shows immediately what words and concepts appear frequently in the documents, which can help start a thesaurus and a taxonomy.”
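As a rough illustration of the frequency analysis Moulton describes, the sketch below counts the most common terms across a set of documents. It is a simplified stand-in for a full linguistic analysis; the `top_terms` function, the small stopword list and the sample sentences are assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately small, illustrative stopword list (an assumption).
STOPWORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "in",
    "for", "is", "are", "on", "with", "that", "this",
})


def top_terms(documents, n=10):
    """Return the n most frequent non-stopword terms across all documents."""
    counts = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


# Hypothetical sample content for illustration only.
sample_docs = [
    "The warranty claim for the turbine was approved.",
    "Turbine maintenance schedule and warranty terms.",
    "Invoice for turbine parts.",
]
```

Terms that surface at the top of such a list are candidates for a first thesaurus or taxonomy, to be reviewed by people who know the content.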
The most successful organizations, Moulton reports, were the ones that also went on to slice and dice information across functional areas to identify specific content and learn how it was being used. That process should be ongoing, to keep pace with the organization as it changes over time. To work well, a semantic search engine with auto-categorization requires a current thesaurus of terminology and concept relationships that are meaningful to the enterprise in which it functions.
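To show why auto-categorization depends on a current thesaurus, here is a minimal sketch in which each concept is tied to a set of terms and a document is tagged with every concept whose terms it contains. The `THESAURUS` entries and the `categorize` function are hypothetical; commercial engines use far richer linguistic matching.

```python
import re

# Hypothetical thesaurus: concept -> terms the enterprise uses for it.
THESAURUS = {
    "warranty": {"warranty", "guarantee", "coverage"},
    "maintenance": {"maintenance", "repair", "servicing"},
}


def categorize(text, thesaurus=THESAURUS):
    """Return the sorted list of concepts whose terms appear in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(concept for concept, terms in thesaurus.items() if tokens & terms)
```

If the organization starts using a new term for an existing concept and the thesaurus is not updated, documents using that term simply go uncategorized, which is why the process needs to be ongoing.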
The explosion in the volume of enterprise content has increased organizations’ need for search, and also their need to analyze the content more thoroughly to make sense of it. Longtime search software vendor ISYS developed many document filters over the years, and decided to offer them as an OEM product to other software companies.
“Document filters identify the file type, identify and extract metadata, and then extract the text itself for deep inspection and indexing,” says Dave Haucke, VP of marketing at ISYS.
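The three steps Haucke lists (identify the file type, extract metadata, extract text) can be sketched as a small pipeline. This is not the ISYS Document Filters API; the `identify_type` and `filter_document` functions and the `FilterResult` type are hypothetical, only three well-known file signatures are checked, and text is extracted only from plain UTF-8 files.

```python
from dataclasses import dataclass

# Well-known leading byte signatures ("magic numbers") for a few formats.
MAGIC_SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip-container"),  # also the container for .docx/.xlsx
    (b"\xd0\xcf\x11\xe0", "ole2"),     # legacy Microsoft Office files
]


@dataclass
class FilterResult:
    file_type: str
    metadata: dict
    text: str


def identify_type(data: bytes) -> str:
    """Step 1: identify the file type from its leading bytes."""
    for signature, name in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return name
    try:
        data.decode("utf-8")
        return "text"
    except UnicodeDecodeError:
        return "unknown"


def filter_document(name: str, data: bytes) -> FilterResult:
    file_type = identify_type(data)
    # Step 2: collect basic metadata (a real filter reads embedded properties).
    metadata = {"name": name, "size_bytes": len(data)}
    # Step 3: extract text; this sketch handles plain text only.
    text = data.decode("utf-8") if file_type == "text" else ""
    return FilterResult(file_type, metadata, text)
```

A production filter library would add a format-specific extractor for each recognized type; the value of the sketch is the order of operations, since the type decision determines how metadata and text are pulled out.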
Equivio provides analytic solutions for e-discovery. It incorporated ISYS Document Filters into its product suite to enable high-performance text extraction. “The cost of document review is so high that law firms and corporate law departments want to include just the relevant ones,” Haucke says, “but they also need to be sure they don’t miss anything critical.” ISYS Document Filters allows extraction of text from several hundred file formats and types.
Sybase also includes ISYS Document Filters as an available component for its Sybase IQ business intelligence solution. Sybase uses the filters to ingest text from unstructured documents. From there, the customer performs analysis using Sybase’s tools. The extracted text can then be used in Sybase’s applications for e-discovery, fraud detection and forensic analysis.