
The expanding ease and utility of text analytics and natural language processing


The cardinal advantage of these purely statistical deep learning methods is that, via techniques such as representation learning and sophisticated word embeddings, organizations can implement them quickly without significant upfront work. This boon is redoubled by transfer learning approaches that minimize the amount of training data required (a brief sketch of that pattern follows the three points below). Nevertheless, pure deep learning approaches to text analytics have three shortcomings:

Dearth of knowledge: Although GPT-3 can sequence words together in the context of a particular domain, “it doesn’t know anything,” Aasman pointed out. “It doesn’t have a mental model, or memory, or sense of meaning.” Specifically, it lacks domain knowledge.

Omissions: According to Walckenaer, the analysis of text or the generation of natural language isn't as thorough with this approach as it is with others. “If you feed these models that Q1 revenue is one billion, it may create a sentence saying the revenue is one billion, without stating it's Q1,” Walckenaer cautioned.

Unsubstantiated results: Because they're limited to what they've been trained on, purely statistical methods are prone to “create some sentences that invent stuff,” Walckenaer revealed. This issue can be acute for NLG deployments. “While working with banks, while working with pharma companies, while automating the writing of regulatory reports, you cannot afford these mistakes,” Walckenaer added. “It has to be perfect, auditable, and traceable.”
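To make the earlier point about word embeddings and transfer learning concrete, here is a minimal sketch in which a pre-trained language model supplies the text representations, so a downstream classifier can get by with only a handful of labeled examples. It assumes the Hugging Face transformers and scikit-learn packages; the model name, example texts, and labels are illustrative and are not drawn from any of the vendors quoted here.

```python
# Minimal sketch: a pre-trained model supplies the representations (transfer
# learning), so the downstream classifier needs only a small labeled set.
# Assumes the Hugging Face "transformers" and "scikit-learn" packages are
# installed; the model name, texts, and labels are illustrative.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Return one pooled embedding vector per input text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state   # shape: (batch, tokens, dim)
    return hidden.mean(dim=1).numpy()               # mean-pool over tokens

# A tiny labeled set can be enough once the embeddings do the heavy lifting.
texts  = ["Revenue grew strongly this quarter.", "The outage frustrated customers."]
labels = [1, 0]  # 1 = positive tone, 0 = negative tone (illustrative)

classifier = LogisticRegression().fit(embed(texts), labels)
print(classifier.predict(embed(["Customers praised the new release."])))
```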

Rules-based systems

Granted, there are numerous use cases in which the imperfections of purely statistical techniques don't compromise business value. Segovia referenced their utility for “translation mechanisms in English that deliver that to another language.” Customer-facing chatbots are another example, along with forms of intelligent document processing (IDP) that render what Automation Anywhere CTO Prince Kohli called a “layout vocabulary that says for a tax form, it has these fields, this is what they look like, and with a high degree of confidence, we can extract what the person who wrote in it is trying to say.”
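Kohli's “layout vocabulary” idea can be sketched roughly as a declared set of fields a given form is expected to contain, each extracted with a confidence score and routed to a person when that confidence falls short. The field names, patterns, and the 0.9 threshold below are assumptions for illustration, not Automation Anywhere's actual implementation.

```python
# Rough sketch of a "layout vocabulary": the fields a tax form is expected to
# contain, each extracted with a confidence score and flagged for human review
# below a threshold. Field names, patterns, and the 0.9 cutoff are illustrative.
import re

LAYOUT_VOCABULARY = {
    "ssn":          r"\b\d{3}-\d{2}-\d{4}\b",
    "wages":        r"Wages[^\d]*([\d,]+\.\d{2})",
    "employer_ein": r"\b\d{2}-\d{7}\b",
}

def extract_fields(ocr_text, ocr_confidence):
    """Match each expected field; keep it only if extraction confidence is high enough."""
    results = {}
    for field, pattern in LAYOUT_VOCABULARY.items():
        match = re.search(pattern, ocr_text)
        if match and ocr_confidence.get(field, 0.0) >= 0.9:
            results[field] = match.group(match.lastindex or 0)
        else:
            results[field] = None   # flag for human review
    return results

doc = "Wages, tips: 52,300.00  Employer EIN 12-3456789  SSN 123-45-6789"
print(extract_fields(doc, {"ssn": 0.97, "wages": 0.95, "employer_ein": 0.88}))
```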

However, for mission-critical applications involving cognitive search, healthcare billing, and regulatory compliance, “You need a rule-based approach if you want to have the correct answer,” Aasman commented. “You cannot have 80% right in a contract. That would be a legal fest for lawyers.” Because it's predicated on rules, this non-statistical form of AI is inherently explainable. Consequently, multiple NLG vendors use this approach for crafting narratives “so that we can be absolutely sure that what we write is traceable and can be used with confidence in highly regulated industries,” Walckenaer remarked.
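A minimal sketch suggests what such a rules-based, auditable pipeline might look like: every value returned carries the exact rule that produced it, so a reviewer can trace how each answer was reached. The rule patterns and the sample contract clause are illustrative assumptions, not any vendor's production rules.

```python
# Minimal sketch of rules-based, traceable extraction: each finding records the
# rule that fired, so the output is explainable and auditable by construction.
# Rule names, patterns, and the sample clause are illustrative.
import re

RULES = [
    ("termination_notice_days", r"terminated .*? upon (\d+) days' written notice"),
    ("governing_law",           r"governed by the laws of ([A-Z][A-Za-z ]+?)[\.,]"),
]

def apply_rules(text):
    """Apply each rule and record which rule produced each value; no statistical guessing."""
    findings = []
    for name, pattern in RULES:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            findings.append({"field": name, "value": match.group(1), "rule": pattern})
    return findings

clause = ("This Agreement may be terminated by either party upon 30 days' "
          "written notice and shall be governed by the laws of Delaware.")
for finding in apply_rules(clause):
    print(finding)
```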
