Pairing text analytics with GPT and LLMs at KMWorld 2023
What is smarter than the new GPT/LLM-based AI? The combination of GPT-AI and text analytics. For all the success that GPT has achieved in the public sphere, it has some limitations in the enterprise that text analytics can overcome, if done correctly.
Tom Reamy, chief knowledge architect and founder, KAPS Group, LLC, USA, and author of Deep Text, discussed marrying text analytics with ChatGPT and AI during his KMWorld 2023 workshop, “Text Analytics & GPT/LLM in the Enterprise.”
GPT/LLM weaknesses include a tendency to hallucinate, meaning the models make up false facts; training on public information, whereas the content and vocabularies behind enterprise firewalls are quite different; and a lack of transparency, which makes it hard to understand why a model says what it does.
Every vendor has its own language, he explained. There is no overarching rule for text analytics; however, there are a few ways to implement it.
The text analytics foundation consists of orthogonal taxonomies with associated autocategorization and data extraction rules, sentiment analysis, and more.
“The way to think of this is, it’s a platform for building a lot of different applications,” Reamy said.
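As a rough illustration of that foundation, the sketch below pairs a small two-facet taxonomy with term-based autocategorization and regex extraction rules. The facets, category terms, rule names, and the `categorize` and `extract` functions are all hypothetical and are not taken from the workshop; commercial text analytics platforms use far richer rule languages.

```python
import re

# Hypothetical taxonomy with two independent ("orthogonal") facets.
# Each category lists the terms that count as evidence for it.
TAXONOMY = {
    "topic": {
        "billing": ["invoice", "refund", "payment"],
        "security": ["password", "breach", "phishing"],
    },
    "doc_type": {
        "policy": ["policy", "must", "compliance"],
        "support_ticket": ["ticket", "customer reported"],
    },
}

# Hypothetical data extraction rules: one named regex per entity type.
EXTRACTION_RULES = {
    "invoice_id": re.compile(r"\bINV-\d{4,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def categorize(text):
    """Pick, per facet, the category with the most term hits (None if no hits)."""
    lowered = text.lower()
    result = {}
    for facet, categories in TAXONOMY.items():
        scores = {
            name: sum(lowered.count(term) for term in terms)
            for name, terms in categories.items()
        }
        best = max(scores, key=scores.get)
        result[facet] = best if scores[best] > 0 else None
    return result


def extract(text):
    """Apply every extraction rule and return the matches found for each."""
    return {name: rule.findall(text) for name, rule in EXTRACTION_RULES.items()}


if __name__ == "__main__":
    sample = (
        "Customer reported a missing refund for invoice INV-20231 "
        "and emailed jane.doe@example.com about the payment."
    )
    print(categorize(sample))
    print(extract(sample))
```

Because each facet is scored separately, a document can be tagged along several dimensions at once, which is what lets one set of rules feed many different applications.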
AI has gone through multiple hype cycles before, he explained. There is something different about these new approaches, though. Generative AI is based on large language models with billions of parameters.
“It’s autocomplete on steroids,” Reamy said. “There’s been a revolution in search, customer support, and an overall evolutionary change.”
GPT can solve problems in a way that seems similar to the human brain, Reamy noted, but that’s something that needs to be taken with a grain of salt. It can only mimic human language; it cannot understand it. There’s no notion of causality, only correlation.
“There’s a difference between patterns and concepts. Patterns are what GPT builds on,” Reamy said. “Humans build concepts.”
The way to work effectively with GPT is through prompt engineering. You can ask it to perform tasks, embed documents, add details or new programming languages, and more. The key is to provide as much context as possible with examples, multiple answers, lists of actions, and more. Taxonomies and ontologies can be incorporated into GPT as well.
“The more complex the taxonomy or ontology, the better answer you’ll get,” Reamy said.
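One way to picture this kind of prompt engineering is a helper that assembles context, worked examples, and a taxonomy into a single prompt string. The sketch below is only an assumption about how that might look: `build_prompt`, its parameters, and the sample taxonomy are hypothetical, and sending the prompt to a model is left out because that step depends on the vendor.

```python
import json


def build_prompt(question, taxonomy, examples, documents):
    """Assemble a prompt that carries a taxonomy, examples, and source documents."""
    sections = [
        "You are answering questions for enterprise staff.",
        "Use only the documents provided. If the answer is not in them, say so.",
        "Taxonomy (JSON):",
        # Serializing the taxonomy lets the model see the enterprise vocabulary.
        json.dumps(taxonomy, indent=2, sort_keys=True),
        "Examples:",
    ]
    # Worked question/answer pairs show the model the expected answer style.
    for example_question, example_answer in examples:
        sections.append(f"Q: {example_question}\nA: {example_answer}")
    sections.append("Documents:")
    # Numbering the documents makes it possible to ask for cited answers.
    for index, document in enumerate(documents, start=1):
        sections.append(f"[{index}] {document}")
    sections.append(f"Question: {question}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    prompt = build_prompt(
        question="How long do customers have to request a refund?",
        taxonomy={"topic": {"billing": ["invoice", "refund", "payment"]}},
        examples=[("Who approves invoices?", "The finance team, per document [1].")],
        documents=["Refund requests must be filed within 30 days of payment."],
    )
    print(prompt)
```

The resulting string can be passed to whichever model interface an organization uses; the point of the sketch is only that the taxonomy and examples travel with the question.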
Machine learning/neural networks/deep learning is one approach to AI. The other is semantic AI, which comprises taxonomies and knowledge graphs, rules, and humans.
“With GenAI you need huge scales,” Reamy said. “That scale turns out to be what works so well.”
Instead of building your own language model, you can add to an existing LLM or prompt. Semi-automatic content curation with auto-categorization is the best way to build on existing LLMs.
“GenAI plus semantic AI is how you get better results,” Reamy said.
Text analytics has developed multiple fraud detection techniques, and they can be applied to GPT systems. Text analytics provides transparency into how LLMs answer the way they do, he said.
“GPT provides a general answer and then text analytics is the piece that provides precision to that,” Reamy said. “By adding text analytics, you get accuracy.”
Applications for this include:
- Customer support – automated chat, email responses
- Productivity tool for employees – idea generation, brainstorming
- Hiring/job descriptions
- Content creation – first draft, summaries, outlines
- SEO
- Search – tagging, summarization, answers, not just lists of links
- And more
“Text analytics and GPT are both best thought of as infrastructure, not a single application,” Reamy said. “It’s a foundation for building applications. You get more applications with more value.”
KMWorld returned to the J.W. Marriott in Washington, D.C., on November 6-9, with pre-conference workshops held on November 6.
KMWorld 2023 is a part of a unique program of five co-located conferences, which also includes Enterprise Search & Discovery, Enterprise AI World, Taxonomy Boot Camp, and Text Analytics Forum.