Roundtable discussion—Enterprise search, Part 2
Part 2 of the roundtable continues a discussion of the Minnesota North Star Portal (see KMWorld February 2005 for Part 1) and looks ahead to the future of enterprise search. Led by KMWorld senior writer Judith Lamont, the roundtable included Eileen Quam, information architect at the Minnesota Office of Technology; Hadley Reynolds, VP and director of research at Delphi Group; and Andy Feit, senior VP of marketing at Verity.
Lamont: The North Star portal has over 500 Web sites within it, many of which are not hosted on the portal but are independent. Are you moving toward a standard "look and feel" in the portal to provide a more consistent user experience?
Quam: Yes, we are going in that direction. The agencies are allowed to migrate to the portal solution for free, so a few of them have taken that step. We have created templates that look like the portal, for agencies to download and use. We also may have upcoming legislation that will require a more common look and feel to the various agencies' Web sites.
Lamont: In what other ways are you tailoring the portal to enhance the user experience?
Quam: Many sites have a service called Ask a Librarian, and I'm instituting it for the state because when people search for things and can't find them, they get angry and send angry messages to the Webmaster about the site. So I am offering them subject matter librarian experts who know how to research and have tools outside of what is available from the search engine spider. People who are not familiar with a topic may not know what terms to use. If they get to a person, they can communicate and clarify.
Feit: We see that trend in the corporate environment. Through the '90s, the idea was to put all resources online and let users search by themselves; companies assumed they no longer needed librarians and could close down the corporate libraries. What people eventually realized was that users are not always at their most productive when they are doing searches. If you put them in touch with an expert in finding information, who knows where to look inside the company as well as outside, it can help. The corporate library is almost being revived in a new form, which consists of smart, online-literate people who know how to research a topic no matter where the information might be. Those are the same people who can help you build that taxonomy but also can be there for the research chemist at Dow or DuPont who is looking for something and doesn't know where to start.
Lamont: So the best combination is well-organized information plus expertise where it's needed?
Reynolds: The primary argument for looking at an enterprise search architecture is to organize metadata in a way that can be spread throughout the organization and be consistent. But other factors should be considered. Looking at the total user experience is important, because the risk here is basically a bad user experience if citizens can't get the information they need. Since providing citizens with information is one of the major missions of government, the risk can be high.
Lamont: What kind of feedback do you get from users? And how do you connect that to modifications in the search technology or the portal in general?
Quam: When our brainstorming group met to discuss the improvements we could make to the North Star themes (our major categories that are listed across the top of the portal), there was an interest in focus groups, surveys and usability studies that I had not seen before. I think over time, people are starting to "get it" with respect to portal design, information architecture and usability.
Lamont: Do you track online movement of users via clickstream analysis?
Quam: We have WebTrends for analytics, and all the theme managers responsible for topics on the site receive a report for their theme, so they can see what kind of traffic it gets and where people are going. Like log file analysis, the WebTrends analysis is limited in its usefulness because you don't know the context that people are working in when they are clicking through. One hazard is that you can make assumptions about popularity that aren't necessarily accurate. Also, I don't follow a policy of automatically eliminating content just because it's rarely accessed. Sometimes we need the content to be available even if it's not used frequently.
Feit: Tools like WebTrends will tell you a lot about how users are interacting with your site, but they aren't necessarily capable of seeing what is happening inside your search engine or your taxonomy. That can be very important information. For example, what are your top queries that get back no results? Those are failed searches. Those searches indicate that you have users looking for information on a topic and nothing is available. If somebody types in "private aviation" and no matches come up, that might tell you that there is a need for a thesaurus entry linking "private aviation" and "general aviation" to make the connection. Verity has done a lot of work with this kind of search analytics over the past year.
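The failed-search report Feit describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Verity's product or API: the log format (pairs of query text and result count), the function names and the sample data are all hypothetical.

```python
from collections import Counter


def top_failed_queries(log, n=10):
    """Return the n most frequent queries that returned zero results.

    `log` is assumed to be an iterable of (query, result_count) pairs;
    a real search engine would expose this in its own format.
    """
    failed = Counter(
        query.strip().lower() for query, hits in log if hits == 0
    )
    return failed.most_common(n)


def expand_query(query, thesaurus):
    """Add thesaurus synonyms to a query so related content can match."""
    key = query.strip().lower()
    return [key] + thesaurus.get(key, [])


# Hypothetical sample data for illustration only.
sample_log = [
    ("private aviation", 0),
    ("Private Aviation", 0),
    ("fishing license", 42),
    ("skateboard parks", 0),
    ("general aviation", 17),
]

# A thesaurus entry an editor might add after reading the report.
thesaurus = {"private aviation": ["general aviation"]}

if __name__ == "__main__":
    print(top_failed_queries(sample_log))
    print(expand_query("Private Aviation", thesaurus))
```

In this sketch the report surfaces "private aviation" as the most common failed query, and the thesaurus entry then lets the same query also search for "general aviation".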
Lamont: What other indicators for action might be revealed by search analytics?
Feit: By using search analytics to monitor what people are searching for, whether they are successful, what your top queries are and what's gaining in popularity, you can go beyond the portal into actual decisions. If you see lots of people searching for skateboard parks, your parks and recreation people can learn that they have five times as many people searching for skateboard parks as before, and maybe the county needs to create more skateboard parks. The analyses can start contributing to areas broader than just content and search, helping you change a decision in both business and government.
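One simple way to spot a query that is "gaining in popularity" is to compare counts between two reporting periods. The sketch below assumes each period is just a list of query strings; the function name, thresholds and sample data are hypothetical and are not taken from any product discussed here.

```python
from collections import Counter


def rising_queries(previous, current, min_ratio=2.0, min_count=5):
    """Return (query, growth_ratio) pairs for queries that grew sharply.

    `previous` and `current` are assumed to be lists of raw query strings
    from two comparable periods. Queries absent from the earlier period
    use a baseline of 1 to avoid dividing by zero.
    """
    prev = Counter(q.strip().lower() for q in previous)
    curr = Counter(q.strip().lower() for q in current)
    rising = []
    for query, count in curr.items():
        if count < min_count:
            continue  # ignore queries too rare to be meaningful
        baseline = max(prev.get(query, 0), 1)
        ratio = count / baseline
        if ratio >= min_ratio:
            rising.append((query, ratio))
    return sorted(rising, key=lambda item: item[1], reverse=True)


# Hypothetical sample data for illustration only.
last_month = ["skateboard parks"] * 2 + ["fishing license"] * 10
this_month = ["skateboard parks"] * 10 + ["fishing license"] * 11

if __name__ == "__main__":
    print(rising_queries(last_month, this_month))
```

With this sample data, only "skateboard parks" is flagged, at five times its earlier volume, while the steady "fishing license" traffic is ignored.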
Lamont: What new capabilities are being incorporated into search?
Feit: Another interesting thing we've begun to see is the idea of entity extraction. That is, being able to pick up more than just the strings and words in a document, and actually operate on concepts. For example, extracting from content the cities, counties and parts of a state that a document mentions, even if it doesn't mention them in exactly the same way. Parks in Northern California could be picked up by zip code even if the words "Northern California" do not show up in the text. Verity Extractor can pick up patterns in data that were not part of the original taxonomy. It lets users see information more in the form of answers, as opposed to just content that they have to search through on their own. Nobody wants to go back and tag thousands of documents that are sitting on North Star's 500-plus Web sites. Yet we still want to provide users a rich experience that is responsive to their needs.
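The zip code example can be illustrated with a toy extractor. This is a minimal sketch and not how Verity Extractor works: it assumes a simple regular expression for five-digit zip codes and a small, hypothetical lookup table from zip code to region.

```python
import re

# Matches standalone five-digit numbers; a real extractor would need to
# rule out other five-digit values such as invoice numbers.
ZIP_PATTERN = re.compile(r"\b\d{5}\b")

# Hypothetical lookup table for illustration; a production system would
# load a complete zip-to-region mapping from a maintained source.
ZIP_TO_REGION = {
    "95616": "Northern California",
    "94103": "Northern California",
    "90210": "Southern California",
}


def extract_regions(text):
    """Return the set of regions implied by zip codes found in the text."""
    regions = set()
    for zip_code in ZIP_PATTERN.findall(text):
        region = ZIP_TO_REGION.get(zip_code)
        if region is not None:
            regions.add(region)
    return regions


if __name__ == "__main__":
    page = "The park at 123 Main St, Davis, CA 95616 is open daily."
    print(extract_regions(page))
```

The page text never contains the words "Northern California", yet the extractor can tag it with that region, which is the kind of automatic tagging that avoids going back to label thousands of documents by hand.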
Lamont: To what extent has entity extraction been applied in practice?
Reynolds: I think we have a very long way to go. The practice of entity extraction that Andy is talking about is really only beginning in a lot of firms, and its use, particularly relative to linking up with structured data searches, is going to be a major new practice for many firms. In the financial services industry, there are many potential applications, particularly in the area of compliance. Some of these new features will fit into the mix of applications that have not really been dreamed up yet. Businesspeople and IT people are just now developing a solid base of information so that they can begin to imagine these things, because a lot of people just don't know what the software can do yet.
Lamont: How is workflow being int