Business Intelligence: The text analysis strategy
Business intelligence (BI) solutions typically offer the ability to analyze quantitative data and produce information that monitors business performance. The analyses may be summaries or drill-downs that present details on subsets of data. More broadly, business intelligence can include any information, such as articles and reports, that offers insights into an industry or company (see sidebar). Quantitative data and text information are usually considered separately, but quantitative analysis is now being paired with information in text form to achieve a deeper understanding than either can provide alone.
EDF Energy, a large U.K. energy company, provides power to a quarter of the U.K.'s population via its electricity distribution networks in London, the South East and the East of England. EDF Energy supplies gas and electricity to more than 5 million customers through its U.K. retail brand. The company offers about 60 different energy-related products and services, including a "Green Tariff" that lets customers choose renewable energy sources. Other services include installation and repair, as well as special programs for elderly and disabled customers. The company was using a business intelligence tool to carry out basic analyses but wanted to move into data mining and modeling of customer behavior.
"We wanted to be able to interrogate our databases to get more value out of the information we had," says Clifford Budge, customer insight manager at EDF Energy.
After evaluating several solutions, Budge selected Clementine, a predictive modeling tool from SPSS. Its flexibility was appealing: it supports a wide range of analysis techniques, including regression, neural networks and decision trees. In addition, it has clustering tools that group customers according to multiple behavioral variables. "One feature that interested us quite a bit was a text mining module that can work in conjunction with the quantitative analyses," Budge adds.
The intended users of Clementine were not statistical analysts, but business sales and marketing analysts who have a good understanding of the business and the associated data.
"They needed an application that was reasonably easy to use," continues Budge, "to allow us to make decisions based on the data while incorporating the expertise of the staff." The modeling component lets employees define analyses that predict which customer clusters are likely to buy which kinds of products. The marketing department can then target its campaigns in a more focused way, improving ROI. Predicting customer behavior is a key goal of those analyses. One model developed using Clementine identified a group of customers that was three times more likely than average to develop bad debts early in their customer life cycle. That analysis would enable the company's debt teams to manage those customers differently, for example by offering products that limit debt risk, such as prepayment.
The text module is being used initially for reviewing and reclassifying customers into appropriate categories. For example, some accounts that were initially established as residential can later become business accounts in practice, without the customer notifying EDF Energy. By using the text analysis tool to locate words associated with businesses, such as "Ltd.," EDF Energy can find those customers and then offer them products and services that match their business needs. In the future, the company plans to mine data in its customer relationship management (CRM) systems for indications of attitudinal reactions to products and to identify gaps in services.
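The reclassification idea can be illustrated with a simple keyword match. The Python sketch below is a minimal stand-in, not the SPSS text mining module: the record layout, the function name and every term other than "Ltd." are assumptions made for illustration.

```python
import re

# Hypothetical indicator terms; the article itself names only "Ltd."
BUSINESS_TERMS = ["ltd", "limited", "plc", "llp"]

# Word boundaries keep "limited" from matching inside "Unlimited"
_PATTERN = re.compile(r"\b(?:" + "|".join(BUSINESS_TERMS) + r")\b", re.IGNORECASE)


def flag_possible_businesses(accounts):
    """Return ids of residential accounts whose text contains a business term."""
    flagged = []
    for acct in accounts:
        if acct["type"] == "residential" and _PATTERN.search(acct["text"]):
            flagged.append(acct["id"])
    return flagged


# Hypothetical account records, for illustration only
accounts = [
    {"id": 1, "type": "residential", "text": "J. Smith, 12 High Street"},
    {"id": 2, "type": "residential", "text": "Smith & Sons Ltd., 12 High Street"},
    {"id": 3, "type": "business", "text": "Acme PLC"},
    {"id": 4, "type": "residential", "text": "Unlimited Close, flat 3"},
]
print(flag_possible_businesses(accounts))
```

Flagged accounts would still need review by someone who knows the business, since a keyword hit is only a hint that an account may have changed use.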
"We can take data from any source," says Budge, "including our Siebel system, and compare our findings with insight from the Customer Research Team." If an indication is found that customers would like a certain product, a model can be created that profiles those customers. An offering can then be made to a wider range of similar customers.
"Traditional BI is about reporting the facts," says Olivier Jouve, VP of market strategy at SPSS, "but text mining explains more about why things are happening." Jouve developed the technology that allows analyses of structured and text data to be combined.
"So much of the available data is unstructured," Jouve adds, "that a lot is lost if it is not included in analyses." Conversely, text analysis conducted alone, without being related to quantitative analysis, cannot demonstrate an ROI.
Text mining is a bottom-up approach that starts with the data, to see what it shows. Search is a top-down approach, most useful when the researcher has a direction for the inquiry. Finding the right keywords to detect customer sentiment can be difficult. Searching call center notes, for example, may not be particularly revealing.
"This information typically consists of very short sentences, without a lot of context," says Jouve. "We've seen 'customer' abbreviated in 27 ways." Text mining of customer data is geared toward extracting opinions and sentiments that may not be known in advance.
Medical data is another environment in which combined analyses of structured and unstructured data can prove fruitful. At the University of Louisville, a team of researchers headed by Dr. Patricia Cerrito is using SAS Text Miner from SAS to analyze data from area hospitals. Analyses of text records gathered from medication orders and chart notes are helping to explain the relationship between physician practices and patient outcomes. The ability of Text Miner to find patterns in clinical