

Why Geography Matters in Enterprise Search

While browsing through various historical maps featured in the "American Treasures" collection at the Library of Congress the other day, one thing became very evident: For hundreds of years, people have been making maps to pass along wide ranges of information critical to businesses, such as the location of international telegraph communication lines or representations of oceanic currents for international trade. Many of the world's greatest thinkers and innovators used maps to not only visualize their ideas and inventions, but to also provide information about key business concepts.

Considering the importance of information flow in today's business environment, I started thinking about the recent increase in the number of enterprise solutions that leverage digital maps. There are so many applications: identifying a disease outbreak, researching a real estate development, looking for the next great oil reserve—the list is enormous.

So, why does geography matter in the enterprise? For public and private entities, enterprise search represents an important tool for knowledge workers to make better, more efficient decisions. Knowledge workers need to know who, what, why, when, how, and where. They need to visualize pertinent information on a map in seconds, with great accuracy, from any work platform.

There's a problem: approximately 30% of most organizations' data and documents are not found in a typical search.

Efforts to find important information in the enterprise run into this problem because unstructured content such as notes, articles and reports may not be accessible to search. Content stored in structured databases may be, but it represents a smaller fraction of the information that exists in the enterprise.

According to the National Academy of Sciences, as much as 85% of content is unstructured. Furthermore, 80% of decisions made by knowledge workers in an enterprise are derived from information stored in unstructured format.

Many of the "silver bullets" of business intelligence reside in unstructured content. For example:
• A string of reports about walk-in patients at a free clinic in Staten Island who have all been diagnosed with a rare strain of influenza;
• An investigative reporter's story for a small Arizona newspaper that cites specific areas where migrants have crossed the border trying to enter the United States; and
• A thick digital file from a retired geoscientist that contains notes and reports on Alaskan North Slope crude and other key corporate knowledge spanning his 30-year career.

These examples show that unstructured information is pertinent, but not always widely available and accessible with a search tool. They also demonstrate how location paints a better picture of a problem.

Location + Intelligence = Comprehensive View
Location, location, location, with information, information, information. Some 70% of all unstructured content contains a geographic reference, and representing that content visually on a map helps tell its story. In a knowledge-worker economy, location and information must fuse, but linking them requires enterprise technology that can accurately and logically search structured and unstructured data. It also requires a way to geospatially link unstructured content, such as documents, to a map. This need is driving the emergence of a new focus in enterprise search: location-based search technology.

With geography becoming more and more important to mission-critical decisions—from national security and intelligence to finding where to drill for oil—this fusion of geographic search for unstructured content provides unmatched context and intelligence about any location.

Geographic text search (GTS) enables users to rapidly locate high-relevance unstructured and structured documents using both keywords and a map as a filter. Documents in various types and formats can be located based on their geographic references contained within the text. Knowledge workers can now answer the question "what do I have in my document collection that talks about this place?"

First, this requires a process called "geoparsing." Technically speaking, geoparsing is the process of applying natural language processing (NLP) to an unstructured text document to identify geographic references, whether explicit or implied. This goes beyond simple string matching. By considering all the text in a document, a geographic text search system can use contextual clues to determine more accurately which place a reference means and where that place is. For example, the word "London" in a document might refer to London, England. But it could also be part of "London broil," a beef dish. Traditional search technology is not always able to resolve such ambiguities.
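To make the London example concrete, here is a minimal Python sketch of the idea. It is only an illustration: the tiny gazetteer, the list of follower words and the function name geoparse are all hypothetical, and a simple hand-written rule stands in for the natural language processing a production geoparser would use.

```python
import re

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
GAZETTEER = {
    "london": (51.5074, -0.1278),
    "houston": (29.7604, -95.3698),
}

# Hypothetical context rule: if one of these words follows a place name,
# treat the name as non-geographic (for example, "London broil").
NON_PLACE_FOLLOWERS = {"broil"}


def geoparse(text):
    """Return (name, lat, lon) for each gazetteer name used as a place."""
    tokens = re.findall(r"[A-Za-z]+", text)
    hits = []
    for i, token in enumerate(tokens):
        key = token.lower()
        if key not in GAZETTEER:
            continue
        next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
        if next_word in NON_PLACE_FOLLOWERS:
            continue  # a dish, not a city
        lat, lon = GAZETTEER[key]
        hits.append((token, lat, lon))
    return hits


print(geoparse("The team met in London before flying to Houston."))
print(geoparse("The cafeteria served London broil."))
```

The first call finds two place references; the second finds none, because the word after "London" signals the dish rather than the city. Plain string matching would have reported London in both sentences.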

For each geographic reference identified in a document, a set of latitude/longitude coordinates is assigned to the document. Where a name could refer to more than one place, confidence values are used to rank the candidate locations defined by a gazetteer (a directory of place names and their coordinates). The result is a comprehensive index of documents, each marked with one or more coordinates. Geographic text search relies on this index to supply accurate search results to users.
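The indexing step described above might be sketched as follows. The candidate list, the confidence numbers and the helper name index_document are assumptions made for illustration; they are not values or names from any real gazetteer or product.

```python
from collections import defaultdict

# Hypothetical gazetteer: name -> candidate places as
# (label, latitude, longitude, confidence). The confidence values are made up.
CANDIDATES = {
    "london": [
        ("London, Ontario", 42.9849, -81.2453, 0.1),
        ("London, England", 51.5074, -0.1278, 0.9),
    ],
}


def index_document(index, doc_id, place_names):
    """Attach ranked candidate coordinates to a document in the index."""
    for name in place_names:
        candidates = CANDIDATES.get(name.lower(), [])
        # Rank candidates so the most likely location comes first.
        ranked = sorted(candidates, key=lambda c: c[3], reverse=True)
        for label, lat, lon, confidence in ranked:
            index[doc_id].append(
                {"place": label, "lat": lat, "lon": lon, "confidence": confidence}
            )
    return index


index = index_document(defaultdict(list), "report-17", ["London"])
best = index["report-17"][0]
print(best["place"], best["lat"], best["lon"])
```

Keeping the lower-ranked candidates, rather than only the top one, is a design choice in this sketch: it lets a search tool show a less likely match when the user's map is centered on Ontario instead of England.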

GTS systems, such as MetaCarta's, feature a full-text geographic metadata index, specially optimized for geographic queries. Solutions that supply geographic text search by integrating two separate indexes (a dual-index strategy) produce very slow queries. The better approach is a highly specialized, efficient search index that delivers extremely fast searches on simultaneous keyword and geographic queries.
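One way to picture a single index that answers keyword and map queries together is to key each entry on both a term and a map grid cell. The Python sketch below is an assumption-laden illustration of that general idea, not a description of how MetaCarta or any other product is built; the class name GeoTextIndex, the one-degree cell size and the sample documents are all hypothetical, and the bounding box is assumed not to cross the 180-degree meridian.

```python
import math
from collections import defaultdict

CELL_DEGREES = 1.0  # hypothetical grid resolution


def cell_of(lat, lon):
    """Map a coordinate to the (row, column) of its grid cell."""
    return (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES))


class GeoTextIndex:
    """Toy combined index: (term, grid cell) -> documents located there."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text, lat, lon):
        cell = cell_of(lat, lon)
        for term in set(text.lower().split()):
            self.postings[(term, cell)].add((doc_id, lat, lon))

    def search(self, term, south, west, north, east):
        """Return ids of documents containing term inside the bounding box."""
        results = set()
        lo_row, lo_col = cell_of(south, west)
        hi_row, hi_col = cell_of(north, east)
        for row in range(lo_row, hi_row + 1):
            for col in range(lo_col, hi_col + 1):
                entries = self.postings.get((term.lower(), (row, col)), ())
                for doc_id, lat, lon in entries:
                    # Edge cells can extend past the box, so check exactly.
                    if south <= lat <= north and west <= lon <= east:
                        results.add(doc_id)
        return sorted(results)


idx = GeoTextIndex()
idx.add("slope-notes", "north slope crude report", 70.2, -148.5)
idx.add("gulf-memo", "crude price memo", 29.76, -95.37)
print(idx.search("crude", 68, -152, 72, -145))
```

Because each lookup names the term and the cell at once, the query touches only entries that already satisfy both conditions, instead of fetching every keyword match and every geographic match from separate indexes and intersecting them afterward.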

Geographic text search solutions have emerged that enable knowledge workers to find accurate, relevant content in seconds using a map. The ability of any organization to use geography to better manage unstructured content creates intelligence that speeds work processes and supports organizational success. Location combined with intelligent geographic search technology is a harbinger of the knowledge-worker economy.


MetaCarta (www.metacarta.com) provides users with map-driven geographic search, geographic referencing and data visualization capabilities. MetaCarta products make data and unstructured content "location-aware," making that information geographically relevant. Founded by a team of MIT researchers in 1999, MetaCarta is privately held, with US headquarters in Cambridge, MA, and offices in Vienna, VA and Houston, TX. For more information, please visit www.metacarta.com.
