
The future of search: Conversational, semantic, and vectorized models


Question answering

The pinnacle of enterprise search is the capability to answer a question predicated on the information in particular datasets. The large language models (LLMs) behind OpenAI’s ChatGPT typify this capacity by readily drawing on content from across the internet to answer questions. The caveats for such approaches include:

♦ Bias: As Potter observed, “Large language models are biased on the nature of the training data that was used in the model itself.” This concern applies broadly to any sole reliance on statistical models for natural language technology applications.

♦ 2021: The training data for ChatGPT’s LLM for question-answering ended in 2021. According to Franz CEO Jans Aasman, “You can ask questions about almost anything, and you’ll get answers that were true in 2021.”

♦ Inaccuracies: Such models also have a tendency to simply go along with whatever question or prompt they’re given, producing what many have termed “hallucinations.” “GPT might fantasize a lot,” Aasman admitted. “There’s a lot of stuff we get back that we don’t know if it’s real.”

♦ Broad searches: ChatGPT, as a front end to its LLM, has less applicability to corpuses of enterprise data. Even if organizations train it on those documents, “If you want to search over your own documents and you ask a question, it will look at every other document that it has in all its data,” Aasman pointed out. “That doesn’t work so well.”

That creates opportunities for companies to use other GPT versions and LLMs for enterprise search. Databricks, for example, has created a similar version that runs locally on organizations’ computers. 

Still, there’s no denying the merit of these models. It’s possible for them to generate an ontology that’s useful for semantic search and certain vector search approaches. LLMs can also generate rules for semantic search applications. When properly implemented, they come close to excelling at question-answering.
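To make the ontology idea concrete, here is a minimal, hypothetical sketch: the triples below stand in for facts an LLM might generate, and the matching logic shows how a semantic search could expand a query term to its subclasses before matching. None of this reflects a specific vendor’s API; the triple set and function names are invented for illustration.

```python
# Hypothetical LLM-generated (subject, predicate, object) ontology facts.
triples = [
    ("laptop", "subClassOf", "computer"),
    ("desktop", "subClassOf", "computer"),
    ("computer", "subClassOf", "electronics"),
]

def subclasses(term, triples):
    """All terms that are (transitively) subclasses of `term`."""
    direct = {s for s, p, o in triples if p == "subClassOf" and o == term}
    return direct | {d for t in direct for d in subclasses(t, triples)}

def semantic_match(query_term, doc_terms, triples):
    """A document matches if it mentions the query term or any subclass of it."""
    expansion = {query_term} | subclasses(query_term, triples)
    return bool(expansion & set(doc_terms))

# A document mentioning "laptop" satisfies a query for "computer".
print(semantic_match("computer", ["laptop", "charger"], triples))  # → True
```

The payoff over plain keyword search is the expansion step: a keyword match on “computer” would miss this document entirely, while the ontology lets the query reach it through the subclass relation.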

Certain applications of ChatGPT enable users to “ask a query; it looks at the query; it finds the important concepts in the query and does a Google search; it comes back with 10 results,” Aasman said. “Then ChatGPT or the plugin says, ‘Look at these 10 results. Read each of these, and then answer the question I originally asked.’” This process validates answers from LLMs by grounding them in a consensus of current information from the web (which circumvents ChatGPT’s 2-year training data gap). “This will be the new search,” Aasman predicted.
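The loop Aasman describes can be sketched in a few lines. Everything here is a stand-in: in a real system, `extract_concepts` and `answer_from_results` would be LLM calls and `search` would be a web-search API. The sketch only shows the shape of the retrieve-then-answer pattern.

```python
STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "when", "was", "who"}

def extract_concepts(query: str) -> list[str]:
    """Naive concept extraction: drop stopwords, keep content terms.
    (Stand-in for the LLM's 'find the important concepts' step.)"""
    return [w for w in query.lower().rstrip("?").split() if w not in STOPWORDS]

def search(concepts, corpus, k=10):
    """Rank documents by how many query concepts they contain
    (stand-in for the Google search step)."""
    scored = sorted(corpus, key=lambda doc: -sum(c in doc.lower() for c in concepts))
    return scored[:k]

def answer_from_results(query, results):
    """Build the grounding prompt: 'Read each of these results, then
    answer the original question.' A real implementation would send
    this to the model instead of returning it."""
    return f"Question: {query}\n" + "\n".join(f"- {r}" for r in results)

corpus = [
    "The KMWorld conference covers knowledge management.",
    "Vector search ranks documents by embedding similarity.",
    "Semantic search uses ontologies and rules.",
]
query = "What is vector search?"
prompt = answer_from_results(query, search(extract_concepts(query), corpus, k=2))
print(prompt)
```

Because the model answers from the retrieved results rather than from its frozen training data, this pattern is what lets the response reflect information newer than the 2021 cutoff.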
