Text analytics: versatile and growing

This article appears in the July/August 2013 issue [Vol 22, Issue 7]
Page 1 of 3 next >>


Text analytics is a versatile technology that is still being discovered by organizations that need to make sense of large volumes of unstructured text. It is being used in a wide variety of applications across numerous industries.

UnitedHealthcare provides healthcare insurance to 70 million subscribers and has a staff of about 80,000. As is typical of the information-rich healthcare industry, the company has a plethora of unstructured text that ranges from notes about customer interactions to medical documentation. To gain as much insight as possible from its information storehouse, UnitedHealthcare has launched several text analytics initiatives over the past few years.

The company selected SAS TextMiner as its text analytics platform. "We are a SAS shop and use its BI solution," says Mark Pitts, analytics director at UnitedHealthcare, "so TextMiner was a good fit." TextMiner has the functional capabilities that UnitedHealthcare wanted, such as entity extraction, parsing and stemming, and also has a high-performance analytics component. "This capability allows us to analyze billions of rows of text in a short time, model and analyze it," Pitts explains. "Given the increasing importance of big data in analytics, we felt this was a significant asset."

Notes taken by customer service representatives (CSRs) during conversations with customers provide one source of information. UnitedHealthcare analyzes the notes to determine what issues customers are concerned about and how well those issues are being addressed. "Given the large amount of information we collect, it would be impossible to analyze it manually. With our software solution, we can look at every call and make sure it's being handled the way we expect it to be," Pitts says.

The software is trained to look for patterns that indicate potential opportunities such as the customer not understanding plan coverage. "Knowing the patterns that are indicators of these opportunities, we can get out in front of them and proactively intervene to ensure the customer is receiving the best possible service," Pitts says.

Four areas of activity

Another application of TextMiner that UnitedHealthcare is developing will allow the company to examine healthcare records data, which includes an extensive amount of medical information. "We can analyze the text in doctors' and nurses' notes," Pitts says, "which will allow us to explore how the treatments affect outcomes."

Rather than group the analyses by procedure code or other structured category, UnitedHealthcare uses text as a common denominator. "This approach is a form of fuzzy search," he explains. "Individuals may have treatment patterns that are not identical in terms of diagnosis or treatment codes, but are very similar." Grouping those similar treatment patterns together can help determine how the best outcomes for patients were achieved.
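To make the idea of text as a common denominator concrete, here is a minimal sketch in Python. It is not UnitedHealthcare's method or anything from SAS TextMiner; it assumes a simple word-overlap (Jaccard) score, and the function names, the 0.5 threshold and the sample notes are all hypothetical.

```python
import re

def tokens(note):
    """Lowercase a note and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", note.lower()))

def jaccard(a, b):
    """Fraction of distinct words that two token sets share (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def group_similar(notes, threshold=0.5):
    """Greedily add each note to the first group whose first note is similar enough.

    The threshold is an illustrative assumption, not a recommended value.
    """
    groups = []
    for note in notes:
        for group in groups:
            if jaccard(tokens(note), tokens(group[0])) >= threshold:
                group.append(note)
                break
        else:
            # No existing group matched, so start a new one.
            groups.append([note])
    return groups

# Hypothetical treatment notes, invented for illustration only.
notes = [
    "physical therapy twice weekly after knee surgery",
    "knee surgery followed by physical therapy twice weekly",
    "insulin dosage adjusted after blood glucose review",
]
groups = group_similar(notes)
for group in groups:
    print(group)
```

The first two notes are worded differently but land in the same group, while the third stands alone. A production system would likely rely on stemming, entity extraction and a more robust similarity measure than raw word overlap.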

Fiona McNeill, head of global product marketing for SAS, says, "We see four leading areas of activity in text analytics: emerging issues and adverse events, root cause analysis, predictive analytics and content intelligence." A typical application for emerging issues is illustrated by a company that provides advisories to organizations in the food chain. "This company brings together scientific and medical literature, information from media outlets and other sources," she says, "and extracts from this huge body of information the 10 percent that needs to be looked at more carefully."

Root cause analysis and predictive analytics often go hand in hand. "Warranty information is a good indicator of product problems," McNeill says, "and text analytics can determine the cause. By detecting the problems quickly, companies can also start anticipating, and thereby avoiding, the emergence of this problem for other customers." In other cases, the results of the analysis can help pinpoint the most effective solutions that other support service staff can use.

Dealing with volume

The great volume of unstructured information now being produced is both an asset and a liability. "If we can understand the content, it will give us valuable information, but the analytics engine needs to search through the noise in millions of pages to find the signals that can be mined for their value," says Seth Earley, CEO of Earley Associates, a consulting firm that provides services related to content management.

The ability of text analytics to classify information is at the heart of its value proposition. "Search tools that auto-categorize information are using text analytics," Earley says. "Processes such as entity extraction, clustering and pattern detection reveal higher-order information that people were not using because they were not aware of it."

Text analytics is a vital tool in coping with big data. Earley explains, "Data is being generated from so many sources now, both in structured form such as sales figures and from sources such as social media channels in which consumers are communicating their reactions to products." Although quantitative data tells what is happening, understanding why it happens generally requires insights from unstructured information.

If a website visitor cannot find a product or an answer to a question, for example, the root cause could relate to a number of different factors. "Text analytics can determine whether the issue is a navigation problem, a content problem, a support problem or a product problem," Earley says. "Organizations now have an unprecedented opportunity to make sense of information almost in real time."

Audio analytics

Customer phone conversations are unstructured content that can be analyzed either directly or after conversion to text. "More than 50 percent of customer interactions are carried out via telephone," says Daniel Ziv, VP of voice of the customer analytics for Verint, which provides enterprise workforce optimization and voice of the customer analytics solutions. "The content is very rich and clean compared to social media, and relevant because nearly everyone is either a customer or prospective customer."

As with text analytics, speech analytics for audio has certain target words to look for, but presents some additional complexities. "Audio does not have punctuation, and has a lot more repetition," says Ziv. "But companies can be selective in what they look for and get significant value." A few key words like "ridiculous" or "you people" are red flags that allow a company to build a predictive category of customers at risk. "One of the best predictors of customer churn," Ziv says, "are calls from customers who say they are terminating their service."
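The red-flag approach Ziv describes can be sketched in a few lines of Python once calls have been converted to text. This is only an illustration, not Verint's implementation: the two phrases come from the examples above, while the function name, call IDs and transcripts are hypothetical.

```python
import re

# Phrases taken from the examples quoted in the article.
RED_FLAGS = ["ridiculous", "you people"]

def find_red_flags(transcript, phrases=RED_FLAGS):
    """Return the red-flag phrases that occur in a transcript, ignoring case."""
    # Collapse runs of whitespace so multi-word phrases still match.
    text = re.sub(r"\s+", " ", transcript.lower())
    return [p for p in phrases
            if re.search(r"\b" + re.escape(p) + r"\b", text)]

# Hypothetical transcripts, invented for illustration only.
calls = {
    "call-001": "This is ridiculous, I have called three times already",
    "call-002": "Thanks, that answers my question",
    "call-003": "Why can't you  people fix my bill",
}

# Keep only the calls that contain at least one red-flag phrase.
at_risk = {}
for call_id, transcript in calls.items():
    hits = find_red_flags(transcript)
    if hits:
        at_risk[call_id] = hits

print(at_risk)
```

Matching on whole words avoids flagging unrelated text that merely contains a phrase as a substring. Real speech analytics would also have to cope with transcription errors and the repetition Ziv mentions.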
