To cope with the glut of information bogging down their knowledge workers, organizations have turned to large-scale search platforms. But knowledge workers, who rely on information to perform their jobs, continue to struggle to find relevant information at the point of need. They often waste a great deal of time and energy searching inefficiently through the vast amount of information available to them, both inside and outside the organization. IDC estimates the cost of this time at more than $14,000 per worker per year and the cost of futile searching by these same workers at more than $5,000 per worker per year. [1]
While many search solutions are available that make finding information easier, these solutions can also introduce security issues. For example, individuals searching corporate file shares may find themselves with access to information they were never intended to have, such as drug formularies, business plans, personal identity information, health records, financial data, contracts and legal agreements. The consequences of this inappropriate access can range from the harmless to the disastrous, and can include theft, extortion, stolen intellectual property, lawsuits, regulatory sanctions, lost revenue or even damage to the corporate brand.
The challenge of access versus security is complex—organizations are struggling with the need to keep information accessible to those who need it while protecting it from being compromised. To keep pace, organizations need to provide their knowledge workers with a simple and scalable way to find and access content across multiple information repositories, while preserving critical authentication and security mechanisms to manage that access.
Approaches to Search
The search market has responded to this challenge with several very different solutions to the access dilemma—some better than others. In the absence of industry-wide standards, these solutions offer different linguistic standards, access protocols, terminology, and query structures. The security that each of these solutions provides also varies.
Traditional enterprise search. Traditional standalone search solutions crawl information sources to create a static master index composed of the information sources’ individual indexes. This approach creates problems and limitations. First, index information from the application layer must be duplicated in order to create the master index. Because it consists of duplicate data, the master index usually grows large very quickly, and managing it requires significant IT resources and increases costs. Also, the master index does not instantly reflect changes made to the application index, resulting in incomplete or outdated search results. Most importantly, to minimize latency and optimize performance, these solutions bypass security permissions residing at the application layer, creating substantial security risks.
Federated search. Federated search solutions leverage the indexes and security mechanisms that reside at the source level, providing complete, up-to-the-minute search results without compromising security. Compared to the traditional approach, a federated approach is more secure, more effective, much quicker to implement and far less expensive to maintain, delivering a faster return on investment.
The security advantages of the federated approach over traditional enterprise search approaches are many. The federated approach can work with multiple security schemas and reuses application indexes. Federation relies on application-specific adapters to support a wide variety of security and authentication mechanisms. If the adapter framework is extensible, which requires that it be standards-based, then the framework’s software development kit (SDK) will enable the rapid development of adapters to fit new applications. With federated secure search, there is no security protocol emulation, no reindexing of information, and therefore no duplication of indexed content. Finally, federation enables IT departments to deploy a secure search solution without modifying or upgrading applications.
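The adapter idea described above can be illustrated with a minimal sketch. It assumes a hypothetical adapter interface (SourceAdapter, InMemoryAdapter and federated_search are illustrative names, not part of any product's SDK) and uses in-memory documents in place of real applications. The point it tries to show is that each query is passed to the source together with the user's identity, so the source applies its own permissions and nothing is copied into a master index.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    source: str
    title: str


class SourceAdapter(ABC):
    """Hypothetical adapter interface: one subclass per application."""

    name: str

    @abstractmethod
    def search(self, query: str, user: str) -> list[Result]:
        """Run the query through the application's own search and security."""


class InMemoryAdapter(SourceAdapter):
    """Stand-in for a real application; each document carries its own readers."""

    def __init__(self, name: str, docs: dict[str, set[str]]):
        self.name = name
        self._docs = docs  # title -> users the application lets read it

    def search(self, query: str, user: str) -> list[Result]:
        # The permission check happens here, at the source, at query time.
        return [
            Result(self.name, title)
            for title, readers in self._docs.items()
            if query.lower() in title.lower() and user in readers
        ]


def federated_search(adapters, query, user):
    """Fan the query out to every adapter and concatenate what each returns."""
    results = []
    for adapter in adapters:
        results.extend(adapter.search(query, user))
    return results


hr = InMemoryAdapter("hr", {
    "Salary bands 2024": {"alice"},
    "Holiday policy": {"alice", "bob"},
})
wiki = InMemoryAdapter("wiki", {"Policy handbook": {"alice", "bob"}})

for hit in federated_search([hr, wiki], "policy", "bob"):
    print(hit.source, hit.title)
```

In this sketch a new application would be supported by writing one more SourceAdapter subclass; the federation loop itself would not change.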
By contrast, a search platform that takes a crawling approach "breaks" the security model of each application it indexes in order to generate its master full-text index. The solution recreates—or tries to match as best it can—the security model for each object in the master index in order to ensure that the search solution is not utilized as a mechanism to bypass security controls within individual business applications. However, such solutions do not name application-level users along with their associated security permissions, allowing users to access information that they would not have permission to access at the application level. This is a significant security gap.
In addition, the index of a traditional search platform does not instantly reflect changes made to application-level security permissions. Such security updates—regardless of their importance—usually are adopted by the search platform on a scheduled basis, providing additional opportunities for inappropriate access.
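The window of inappropriate access described above can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not a model of any particular product: permissions are a plain dictionary, "crawling" is a copy taken at one moment, and all names (live_acl, crawl, search_master_index, search_at_source) are hypothetical.

```python
import copy

# Permissions as the application currently holds them: document -> readers.
live_acl = {"merger-plan.doc": {"alice", "bob"}}


def crawl(acl):
    """Snapshot permissions into the master index, as a scheduled crawl might."""
    return copy.deepcopy(acl)


def search_master_index(index_acl, user):
    """Answer from the snapshot taken at the last crawl."""
    return sorted(doc for doc, readers in index_acl.items() if user in readers)


def search_at_source(user):
    """Answer from the application's live permissions at query time."""
    return sorted(doc for doc, readers in live_acl.items() if user in readers)


master_index = crawl(live_acl)

# Bob's access is revoked in the application after the crawl has run.
live_acl["merger-plan.doc"].discard("bob")

stale = search_master_index(master_index, "bob")  # still lists the document
fresh = search_at_source("bob")                   # no longer lists it
print(stale, fresh)
```

Until the next scheduled crawl refreshes the snapshot, the master index keeps returning the document to a user whose access has already been withdrawn at the source.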
Secure Search—What to Look For
Within the market of search solutions there is much variance on the issues of security, access and cost. Adopting the right solution is critical. When choosing a secure search solution, both the overall architecture of the search technology and specific features should be considered.
Reuse of existing assets. Choosing a solution that reuses existing assets is important to the solution’s success in the enterprise. The solution should reuse the native indexes of each source being searched rather than duplicating security permissions into a master index, which unnecessarily increases regulatory and legal risk. The solution also should leverage the built-in search feature of every application being queried so search results are always current and reflect even the most recent changes made at the source.
From a security standpoint, the solution should respect the application-based security permissions and return results that are appropriate for a user based on his or her role and associated security permissions at the time of the query. In other words, the solution should return results that are identical to those the user would have generated by connecting directly to and searching the information source.
From an efficiency perspective, make sure the solution provides a single sign-on (SSO) API that integrates seamlessly into the enterprise security infrastructure. The solution should pass authentication information to appropriate sites as part of the query, avoiding repetitive login steps. Beware of search tools that claim to have SSO, but that accomplish the same result by passing cookies, as this method raises security threats.
Source and content adaptation. Comprehensive federated search solutions will include a large number of pre-packaged adapters for various information sources and the ability to build new ones. Customizing these core adapters, or creating additional adapters for packaged or customized applications, should be easy for an organization’s IT staff to accomplish and should not require significant add-on services from a vendor or third party.
It is also important that search solutions have the ability to adapt to content, context and structural changes because information sources often change the way they present their results pages. The solution should be able to identify and classify all changes without human intervention, and repair itself in order to perform the correct extraction of metadata.
Content enablement. A search solution is only as good as the results it provides. But just as important as the results is the way they are presented to end users. A comprehensive solution will de-duplicate results, rank them by relevance and cluster them automatically and dynamically, based on metadata. Solutions also should have the ability to perform cross-lingual queries. In other words, the user formulates a query in his or her preferred language, and the search solution translates it into other languages on the fly, based on the source requirements, before performing the search.
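The de-duplicate, rank and cluster steps can be sketched as follows. This is a minimal illustration under stated assumptions: each hit is a dictionary with a "url", a "score" and a "type" metadata field (all hypothetical field names), duplicates are detected by identical URL, and scores from different sources are assumed to be directly comparable, which a real deployment would have to normalize.

```python
from collections import defaultdict


def dedupe_rank_cluster(hits, cluster_key="type"):
    """Keep the best-scoring copy of each URL, rank, then group by metadata."""
    best = {}
    for hit in hits:
        url = hit["url"]
        if url not in best or hit["score"] > best[url]["score"]:
            best[url] = hit

    # Highest score first.
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)

    # Cluster on a metadata field; hits inside a cluster stay in rank order.
    clusters = defaultdict(list)
    for hit in ranked:
        clusters[hit.get(cluster_key, "unknown")].append(hit)
    return dict(clusters)


hits = [
    {"url": "a", "score": 0.90, "type": "contract"},
    {"url": "a", "score": 0.50, "type": "contract"},  # duplicate from a second source
    {"url": "b", "score": 0.70, "type": "memo"},
    {"url": "c", "score": 0.95, "type": "contract"},
]

clusters = dedupe_rank_cluster(hits)
for label, members in clusters.items():
    print(label, [h["url"] for h in members])
```

Clustering after ranking keeps each group internally ordered by relevance, so the strongest result in every cluster appears first.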