
Databricks creates suite of RAG tools for LLM app production

Databricks is debuting a suite of Retrieval-Augmented Generation (RAG) tools to help users build high-quality, production LLM apps using their enterprise data.

RAG has quickly emerged as a powerful way to incorporate proprietary, real-time data into Large Language Model (LLM) applications, according to the company.

To achieve high quality with RAG applications, developers need rich tools for understanding the quality of their data and model outputs, along with an underlying platform that lets them combine and optimize all aspects of the RAG process.

RAG involves many components such as data preparation, retrieval models, language models (either SaaS or open source), ranking and post-processing pipelines, prompt engineering, and training models on custom enterprise data.

This release includes a Public Preview of:

  • A vector search service to power semantic search on existing tables in your lakehouse.
  • Online feature and function serving to make structured context available to RAG apps.
  • Fully managed foundation models providing pay-per-token base LLMs.
  • A flexible quality monitoring interface to observe production performance of RAG apps.
  • A set of LLM development tools to compare and evaluate various LLMs.

With this release, Databricks natively supports serving and indexing data for online retrieval. For unstructured data (text, images, and video), Vector Search will automatically index and serve data from Delta tables, making them accessible via semantic similarity search for RAG applications.

Under the hood, Vector Search manages failures, handles retries, and optimizes batch sizes to provide the best performance, throughput, and cost.
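For a sense of what this looks like in code, here is a minimal sketch using the databricks-vectorsearch Python SDK; the endpoint, table, column, and embedding model names are illustrative assumptions, not taken from the announcement.

    # A minimal sketch using the databricks-vectorsearch SDK; the names
    # below (endpoint, catalog/schema/table, columns) are hypothetical.
    from databricks.vector_search.client import VectorSearchClient

    client = VectorSearchClient()

    # Create a Delta Sync index that stays in sync with the source Delta
    # table as new rows arrive.
    index = client.create_delta_sync_index(
        endpoint_name="rag_endpoint",
        index_name="main.docs.support_articles_index",
        source_table_name="main.docs.support_articles",
        pipeline_type="TRIGGERED",
        primary_key="article_id",
        embedding_source_column="body_text",  # column to embed
        embedding_model_endpoint_name="databricks-bge-large-en",
    )

    # Retrieve context for a RAG prompt via semantic similarity search.
    results = index.similarity_search(
        query_text="How do I rotate my API keys?",
        columns=["article_id", "body_text"],
        num_results=3,
    )

The retrieved rows would then be concatenated into the prompt that is sent to the LLM, which is the core of the RAG pattern the article describes.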

For structured data, Feature and Function Serving provides millisecond-scale queries of contextual data, such as user or account data, that enterprises often want to inject into prompts to customize them based on user information.
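As a rough illustration, a RAG app might fetch that per-user context with a REST call against a serving endpoint; this sketch assumes the standard Databricks serving-endpoint invocation route, and the workspace URL, endpoint name, and lookup key are hypothetical.

    import os
    import requests

    # Hypothetical workspace URL and feature serving endpoint name.
    WORKSPACE_URL = "https://my-workspace.cloud.databricks.com"
    ENDPOINT = "user-context-features"
    TOKEN = os.environ["DATABRICKS_TOKEN"]

    # Look up precomputed features for one user by primary key.
    response = requests.post(
        f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT}/invocations",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"dataframe_records": [{"user_id": 12345}]},
        timeout=5,
    )
    user_context = response.json()

    # The returned fields (e.g., account tier, recent activity) can be
    # interpolated into the RAG prompt to personalize the response.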

Unity Catalog automatically tracks lineage between the offline and online copies of served datasets, making debugging data quality issues much easier. It also consistently enforces access control settings between online and offline datasets, meaning enterprises can better audit and control who sees sensitive proprietary information.

With this release, Databricks now offers a unified environment for LLM development and evaluation, providing a consistent set of tools across model families on a cloud-agnostic platform. Databricks users can access leading models from Azure OpenAI Service, AWS Bedrock, and Anthropic, open source models such as Llama 2 and MPT, or customers' fine-tuned, fully custom models.

Databricks is also releasing Foundation Model APIs, a fully managed set of LLMs that includes the popular Llama and MPT model families. Foundation Model APIs can be used on a pay-per-token basis, drastically reducing cost and increasing flexibility, according to the company.
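A pay-per-token endpoint can be queried through the MLflow deployments client, as in the minimal sketch below; the endpoint name follows the naming used for the managed Llama 2 chat model, and the prompt is purely illustrative.

    import mlflow.deployments

    # Get a client for Databricks model serving endpoints.
    client = mlflow.deployments.get_deploy_client("databricks")

    # Query a pay-per-token chat endpoint.
    completion = client.predict(
        endpoint="databricks-llama-2-70b-chat",
        inputs={
            "messages": [
                {"role": "user", "content": "Summarize this support ticket..."}
            ],
            "max_tokens": 256,
        },
    )
    print(completion)

Because billing is per token rather than per provisioned endpoint, teams can prototype against a base model without standing up dedicated serving infrastructure.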

Also included in this release, Lakehouse Monitoring provides a fully managed quality monitoring solution for RAG applications. It can automatically scan application outputs for toxic, hallucinated, or otherwise unsafe content.

This data can then feed dashboards, alerts, or other downstream data pipelines for subsequent action. Because monitoring is integrated with the lineage of datasets and models, developers can quickly diagnose errors related to, for example, stale data pipelines or models whose behavior has unexpectedly changed.
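A minimal sketch of attaching a monitor, assuming the preview-era lakehouse_monitoring Python API; the table, column, and schema names are hypothetical, and the unsafe-content scanning described above would be layered on top of the base metrics.

    from databricks import lakehouse_monitoring as lm

    # Attach a monitor to a Delta table of logged RAG app responses.
    monitor = lm.create_monitor(
        table_name="main.rag_app.response_logs",
        profile_type=lm.TimeSeries(
            timestamp_col="request_time",
            granularities=["1 day"],
        ),
        output_schema_name="main.rag_app_monitoring",
    )

    # Computed metrics land in Delta tables under output_schema_name,
    # which can back dashboards, alerts, or downstream pipelines.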

For more information about these updates, visit www.databricks.com.
