TEXT ANALYTICS gains clout to capture insights from the data maze
The potential of analytics
Part of the power of text analytics lies in its ability to integrate and relate diverse sets of unstructured information. “Text analytics is often used for sentiment analysis, and it is very effective for that,” says Tom Sabo, principal solutions architect at SAS, “but those analyses can be further enriched by relating them to information in public documents or events or to geolocational data.” For example, some solutions are designed to monitor geopolitical conflicts, and they can bring in news items from different geographical regions that may trigger an alert based on set rules or models developed from machine learning.
SAS University Edition is being used in academic and non-profit institutions to analyze studies on the problem of human trafficking. Researchers used data from the National Human Trafficking Resource Center (NHTRC) hotline call center, established as part of the Polaris Project, to identify socio-economic risk factors for human trafficking. Then they identified metropolitan areas where, based on those factors, the likelihood of human trafficking was high. Being able to focus resources on high-risk areas is expected to help mitigate the problem.
Another study evolved from reports on the topic published by the U.S. government. “The State Department has been producing reports on human trafficking for approximately 200 countries annually for years,” Sabo says. “Using visual text analytics, it was possible to extract patterns of recent and historical trafficking-related activity. Through those visualizations, we were able to see which countries were interconnected, as well as whether the countries involved were invested in addressing the problem. That ultimately provides decision-makers and humanitarian agencies with big-picture information about how to best combat human trafficking internationally.”
Text analytics extends value of customer reviews
Sentiment analysis is a relatively mature application for text analytics, but there is still room for more effective use. “In some cases, companies are just pulling out positive and negative words,” Reamy comments. “What really distinguishes companies is not just the sentiments they identify, but what they do with it afterward.” A number in isolation is less meaningful than a comparison of positive and negative ratings with those of a competitor; ideally, customer feedback should be proactively used to gain insight as to why a particular product receives certain ratings. Social media is the growth area for sentiment analysis, although it presents some unique challenges for analysis because of its structure and use of colloquial language.
Many organizations would like to have reviews of their products, especially positive ones, but getting customers to write them can be difficult. Yotpo was launched in 2011 to help companies generate reviews from their customers. It was able to help increase the number of reviews by a factor of nine, and the reviews also became more positive. The increase in responses was attributed to the fact that users could write reviews from any platform and to personalizing the requests for reviews.
Over time, Yotpo expanded into a user-generated content platform that also included photographs and other content. “We realized that there was more value in this content than our clients were accessing,” says Ophir Reshef, VP of product marketing and strategy at Yotpo. “We had started to get requests from our customers to extract information from the reviews, but it was difficult to do manually and to aggregate the information into summary form.”
The company decided to develop a text analytics engine to accompany its review-generating application. “We had 30 million reviews to work with,” Reshef says. “Our team of experts in data science, big data and business intelligence worked for a year to build the infrastructure for the product.” Initially, Yotpo tried using generic data models, but they were not effective because unlike many other text documents, reviews are short and have very condensed content.
One feature of the product that Reshef considers to be unique is the ability to extract multiple topics from short amounts of text. “It is common that a customer may say one good thing and one bad thing in a review,” Reshef observes. “For example, they might say that they like the product but that delivery took too long.” The output of the analyses falls into three categories: topics of interest, sentiment analysis and clusters of topics.
“The result is a business dashboard of insights,” says Reshef, “which lets the users of our software understand aggregated feedback from customers. It tells them what topics are most important to customers (Is it the product’s quality, color, shipping issues and so forth?) and whether the customer was happy or unhappy about each of them. Users can then drill down into the data and determine the root cause of happiness or unhappiness.” In addition, users can break down data by geographical location and detect key opinions. “Each department can access the subset of information that is relevant to them,” Reshef explains. Customer care can focus on service, product development can look at content design and quality, and marketing can assess the effectiveness of its campaigns.
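The idea of pulling more than one topic, each with its own sentiment, out of a single short review can be illustrated with a toy example. The Python sketch below is not Yotpo's engine, whose internals the article does not describe; it is a minimal keyword-based illustration, and the word lists, topic names and function names are all hypothetical. It splits a review into clauses, assigns each clause a topic and a sentiment, and then tallies the results per topic, roughly the kind of aggregate a dashboard might display.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexicons, invented for this illustration only.
TOPIC_KEYWORDS = {
    "product": {"product", "quality", "color"},
    "shipping": {"delivery", "shipping", "arrived"},
}
POSITIVE = {"like", "love", "great", "good", "fast"}
NEGATIVE = {"long", "slow", "bad", "broken", "late"}


def split_clauses(review):
    """Split a review on sentence punctuation and contrastive words."""
    parts = re.split(r"[.;!?]|\bbut\b|\bhowever\b", review.lower())
    return [p.strip() for p in parts if p.strip()]


def analyze_review(review):
    """Return a list of (topic, sentiment) pairs, one per matching clause."""
    results = []
    for clause in split_clauses(review):
        words = set(re.findall(r"[a-z']+", clause))
        # Crude score: positive word count minus negative word count.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                results.append((topic, label))
    return results


def aggregate(reviews):
    """Tally sentiment labels per topic across many reviews."""
    summary = defaultdict(Counter)
    for review in reviews:
        for topic, label in analyze_review(review):
            summary[topic][label] += 1
    return summary


if __name__ == "__main__":
    sample = "I like the product but delivery took too long."
    print(analyze_review(sample))
```

For the sample review, the sketch reports the product clause as positive and the delivery clause as negative. A real system would need far richer language models than fixed word lists, particularly for the short, condensed text that the article says defeated generic data models.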