The expanding ease and utility of text analytics and natural language processing

The strategic gains of text analytics are myriad and, quite possibly, greater today than they've ever been before. The influx of advanced machine learning approaches impacting natural language technologies makes textual analysis more accessible to the enterprise than it was even 5 years ago.

Statistical modeling techniques also accelerate traditional natural language processing (NLP) methods, reducing their time-to-value. Conversely, pairing these conventional methods with their newer statistical counterparts heightens the accuracy of text analytics, which, in turn, increases the use cases for the full spectrum of natural language technologies.

Time-honored applications of sentiment analysis and contractual reviews are as prevalent as they ever were. There’s also an array of more modern deployments, including the automation of regulatory reports, spoken interfaces with front- and back-end IT systems, and generative text summaries of documents and visualizations.

Consequently, natural language technologies—including natural language understanding (NLU), natural language generation (NLG), and conversational AI—are embedded in everything from business intelligence (BI) solutions to the now ubiquitous remote conferencing options.

The expanding number of choices in this space means there’s also a burgeoning assortment of technological approaches to account for when selecting the right one for the enterprise. According to Franz CEO Jans Aasman, “Any technology having to do with text and unstructured text, that’s text analytics. Parsing a text is text analytics. But, doing entity extraction or computing a word embedding is text analytics, too.”

Understanding the implications of these different methods is critical for obtaining the accuracy, traceability, explainability, and time-to-insight needed by organizations to achieve any desired objective from text analytics.

Structured data

The applicability of natural language technologies to text analytics spans structured data, semi-structured data, and unstructured data. Nonetheless, depending on how it's configured, not every form of text analytics is suitable for each of these data types. The most immediately available form of NLP is often for structured data. Numerous BI vendors augment their relational analytics with natural language interfaces, enabling users to have conversational interactions with them. In these cases, the systems understand questions (via natural language querying) about relational data to deliver answers and "embellish them with more information that prompts a new question," explained Josh Good, Qlik VP of global product marketing.

The resulting text analytics applies to the questions asked (a form of unstructured data), not to the underlying source data, which requires additional analytics. Those analytics may be facilitated by BI vendors or natural language technology vendors partnering with them. When you see your data and note that sales are moving up, you’re likely to wonder about the cause. According to Emmanuel Walckenaer, Yseop CEO, “We can automatically analyze all the data and actually take the contributors and get some intelligence.” BI solutions imbued with NLG also produce textual explanations of visualizations.
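As a rough illustration of the idea of finding "the contributors" behind a change and turning them into a textual explanation, here is a minimal Python sketch. It is not how Yseop or any BI vendor implements NLG; the function name `explain_change`, the segment names, and the figures are all hypothetical, and the "generation" step is a simple sentence template.

```python
def explain_change(metric, previous, current):
    """Describe how a metric changed between two periods and name the
    segment that contributed most to that change.

    `previous` and `current` map segment names to integer values.
    All names and numbers used with this function are illustrative.
    """
    total_prev = sum(previous.values())
    total_curr = sum(current.values())
    delta = total_curr - total_prev

    if delta == 0:
        return f"{metric} was flat at {total_curr}."

    # Per-segment change; a segment missing from one period counts as 0.
    segments = sorted(set(previous) | set(current))
    contributions = {s: current.get(s, 0) - previous.get(s, 0) for s in segments}

    # Pick the segment that moved furthest in the direction of the overall change.
    sign = 1 if delta > 0 else -1
    top = max(segments, key=lambda s: sign * contributions[s])

    direction = "rose" if delta > 0 else "fell"
    return (
        f"{metric} {direction} by {abs(delta)} to {total_curr}; "
        f"the largest contributor was {top} ({contributions[top]:+d})."
    )


# Hypothetical quarterly sales by region.
last_quarter = {"North": 100, "South": 80, "West": 60}
this_quarter = {"North": 130, "South": 85, "West": 55}
print(explain_change("Sales", last_quarter, this_quarter))
```

A production system would draw on far richer data and language models, but the shape is the same: compute the contributors first, then render them as a sentence a user can read next to the chart.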

Unstructured text

The more mature form of text analytics is performed on unstructured data sources such as documents, emails, and webpages. Advanced machine learning techniques have become popular for the rapidity with which they not only can understand this information, but also produce pertinent analyses and summaries of its import to business objectives. Methods involving transformers, Bidirectional Encoder Representations from Transformers (BERT) and, more recently, Generative Pre-trained Transformer 3 (GPT-3), are lauded for these capabilities—especially for their NLG propensities. They can provide “excellent exercises of summarization of conversations, sentiment analysis, keyword extraction, and things of that nature,” maintained Ignacio Segovia, Altimetrik head of product engineering.
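To make one of the tasks named above concrete, the sketch below shows keyword extraction in its simplest form. It deliberately uses a plain word-frequency baseline rather than a transformer model such as BERT or GPT-3, so it only illustrates what the task produces, not the techniques the article describes. The function name `extract_keywords`, the stopword list, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# A small, illustrative stopword list; real systems use much larger ones.
STOPWORDS = {
    "a", "an", "and", "about", "are", "as", "at", "for", "in", "is",
    "it", "of", "on", "or", "that", "the", "to", "was", "were", "with",
}


def extract_keywords(text, top_n=3):
    """Return the `top_n` most frequent non-stopword terms in `text`.

    Ties are broken alphabetically so the result is deterministic.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical snippet of an email.
email = (
    "The contract renewal was delayed. The customer asked about "
    "the contract terms and the renewal date."
)
print(extract_keywords(email, top_n=2))
```

Frequency counting knows nothing about meaning or context, which is the gap the transformer-based methods are credited with closing.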
