The Next Generation of Knowledge Management
Natural Language Question-Answering
In many instances, the information retrieval that’s synonymous with enterprise search is morphing into natural language question-answering. Via the retrieval-augmented generation (RAG) approach, in which language models search vector embeddings of any modality, organizations can consolidate disparate sources of knowledge “and have a complete, coherent, centralized knowledge management system that we can deploy toward new goals, like universal question-answering, or knowledge question-and-answer systems that live in Slack or Teams,” said Matt Lake, Pega’s senior director of the customer service product. Several dimensions of RAG are relevant to question-answering, including these:
• Feedback Mechanisms: One of the emergent applications of RAG is to improve the data curation, metadata descriptions, and even contents of the enterprise knowledge housed within vector stores. “What we’re hearing is that RAG-based systems are a great tool for exposing issues in the underlying knowledge management systems,” Lake explained. “When you get a wrong answer, one that doesn’t make sense, or isn’t complete, the first instinct is, ‘This isn’t very smart,’ or, ‘This isn’t getting it.’ But what’s extremely common is that the actual reason for the bad answer lies in the underlying document, which isn’t correct or complete.” By providing this information to organizations in real time, RAG systems can spur them to correct documents and improve their knowledge bases.
• Automation Opportunities: By auditing RAG systems, what users are prompting them with, and the quality of the model’s responses, organizations can identify areas for automation to further democratize access to enterprise knowledge. “If you’re doing Q&A and returning answers to a customer service representative, we can look to see how often users are using that answer and how often it’s been successful for them,” Lake commented. “That’s a nice feedback mechanism to determine this is something we can start to answer automatically in a chatbot in the future.”
• Context Engineering: Although it’s typically equated with GenAI deployments, RAG is just one of many forms of prompt augmentation and requires prompt engineering to get optimal results. The notion of prompt engineering is rapidly giving way to context engineering, which expands the context sent to language models. Context engineering partly arose because of perceived limitations of traditional RAG. “If you take all your legal documents and put them in a vector store, you can ask questions, but how can you be sure it picks out the right contracts?” Aasman asked. “You’ve got 1,000 contracts; how does it know which contract to pick?”
• Shortfalls: The finite amount of information that can be included in a prompt, or user question, has also dampened recent perceptions of RAG. According to Michael Allen, Laserfiche CEO, “The downside of RAG is you’re only getting snippets of your corpus and putting that in the prompt. You’re potentially omitting important information that the LLM might use to improve the output’s quality.”
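The retrieval loop described above can be sketched in a few lines. This is a minimal illustration only: it uses a toy bag-of-words similarity in place of a real embedding model, and all names (`retrieve`, `build_prompt`, the sample chunks and sources) are illustrative, not any vendor’s API. Note how each retrieved snippet carries its source ID, which is what makes the feedback mechanism Lake describes possible: a bad answer can be traced back to the underlying document.

```python
from collections import Counter
import math

# Toy corpus: each chunk keeps a source ID so a wrong or incomplete answer
# can be traced back to the underlying document (the feedback loop above).
CHUNKS = [
    {"source": "contract-001.pdf", "text": "payment due within 30 days of invoice"},
    {"source": "contract-002.pdf", "text": "either party may terminate with 60 days notice"},
    {"source": "manual-cs.pdf", "text": "reset the device by holding the power button"},
]

def embed(text):
    """Stand-in embedding: a bag-of-words vector. Real systems use an
    embedding model and a vector store instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, k=1):
    """Rank all chunks by similarity to the question; return the top k."""
    q = embed(question)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question):
    """Augment the prompt with retrieved snippets, citing their sources."""
    hits = retrieve(question)
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("When is payment due?")
```

The same cited-source structure also supports the auditing Lake mentions: logging which retrieved snippet backed each answer, and whether the user accepted it, identifies both broken documents and questions ripe for full automation. The shortfall Allen raises is also visible here: only the top-k snippets reach the prompt, so anything outside them is invisible to the model.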
Increasing Context
A relatively recent advancement that potentially rectifies some of RAG’s shortcomings is that both larger frontier models and smaller language models have enlarged their input windows for prompts. For example, Allen said, when asking a model a question about information contained in a product or customer service manual, “Instead of doing RAG to pull out chunks or sections of it, you can just take the whole manual and feed it in.”
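Allen’s approach can be sketched as follows. The function name, the `max_tokens` budget, and the rough 4-characters-per-token estimate are all assumptions for illustration, not any particular model’s documented limits; the point is simply that when a document fits the input window, retrieval can be skipped entirely.

```python
def build_long_context_prompt(question, document, max_tokens=200_000):
    """If the whole document fits the model's input window, include it
    verbatim instead of retrieving chunks; otherwise signal that a
    chunked RAG fallback is needed."""
    # Rough token estimate: ~4 characters per token (a common heuristic;
    # a real system would use the model's tokenizer).
    estimated_tokens = len(document) // 4
    if estimated_tokens > max_tokens:
        raise ValueError("Document exceeds the context window; fall back to RAG.")
    return f"Document:\n{document}\n\nQuestion: {question}"

manual = "To reset the device, hold the power button for ten seconds."
prompt = build_long_context_prompt("How do I reset the device?", manual)
```

The trade-off is cost and latency: feeding an entire manual into every prompt consumes far more tokens than a handful of retrieved snippets, so in practice the two approaches are often combined rather than one replacing the other.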