NVIDIA achieves language understanding for real-time conversational AI
NVIDIA has made a breakthrough in language understanding that allows businesses to engage more naturally with customers using real-time conversational AI.
NVIDIA's AI platform is the first to train one of the most advanced AI language models -- BERT -- in less than an hour and complete AI inference in just over 2 milliseconds.
This level of performance makes it possible for developers to use state-of-the-art language understanding for large-scale applications they can make available to hundreds of millions of consumers worldwide.
Early adopters of NVIDIA's performance advances include Microsoft and some of the world's most innovative startups, which are harnessing NVIDIA's platform to develop highly intuitive, immediately responsive language-based services for their customers.
Limited conversational AI services have existed for several years. Until now, however, it has been extremely difficult for chatbots, intelligent personal assistants, and search engines to operate with human-level comprehension, because extremely large AI models could not be deployed in real time.
NVIDIA has addressed this problem by adding key optimizations to its AI platform - achieving speed records in AI training and inference and building the largest language model of its kind to date.
These optimizations have produced three new performance records in natural language understanding:
- Fastest training: An NVIDIA DGX SuperPOD using 92 NVIDIA DGX-2H systems with 1,472 NVIDIA V100 GPUs ran the large version of one of the world's most advanced AI language models -- Bidirectional Encoder Representations from Transformers (BERT) -- and slashed the typical training time for BERT-Large from several days to just 53 minutes. Additionally, NVIDIA trained BERT-Large on just one NVIDIA DGX-2 system in 2.8 days - demonstrating the scalability of NVIDIA GPUs for conversational AI.
- Fastest inference: Using NVIDIA T4 GPUs running NVIDIA TensorRT™, NVIDIA performed BERT-Base inference on the SQuAD dataset in only 2.2 milliseconds - well under the 10-millisecond processing threshold for many real-time applications, and a sharp improvement over the more than 40 milliseconds measured with highly optimized CPU code.
- Largest model: With a focus on developers' ever-increasing need for larger models, NVIDIA Research built and trained the world's largest language model based on Transformers, the technology building block used for BERT and a growing number of other natural language AI models. NVIDIA's custom model, with 8.3 billion parameters, is 24 times the size of BERT-Large.
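The figures in the three records above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and the summarize function are hypothetical, the CPU latency is taken as exactly 40 milliseconds although the text says "over 40", and the training ratio compares a DGX-2 with a SuperPOD of DGX-2H systems, so it should not be read as a strict scaling-efficiency measurement.

```python
# Figures quoted in the three records above.
DGX_SYSTEMS = 92                 # DGX-2H systems in the SuperPOD
TOTAL_GPUS = 1472                # V100 GPUs across those systems
SUPERPOD_TRAIN_MINUTES = 53      # BERT-Large training time on the SuperPOD
SINGLE_SYSTEM_TRAIN_DAYS = 2.8   # BERT-Large training time on one DGX-2
GPU_LATENCY_MS = 2.2             # BERT-Base inference on T4 with TensorRT
CPU_LATENCY_MS = 40.0            # assumption: source says "over 40", so a lower bound
REALTIME_BUDGET_MS = 10.0        # threshold cited for many real-time applications
LARGEST_MODEL_PARAMS = 8.3e9     # NVIDIA's custom Transformer model
SIZE_RATIO_VS_BERT_LARGE = 24    # "24 times the size of BERT-Large"


def summarize():
    """Derive the ratios implied by the quoted figures (hypothetical helper)."""
    single_system_minutes = SINGLE_SYSTEM_TRAIN_DAYS * 24 * 60
    return {
        # GPUs per DGX-2H system implied by the two totals
        "gpus_per_system": TOTAL_GPUS / DGX_SYSTEMS,
        # How much faster the SuperPOD run was than the single-system run
        "training_speedup": single_system_minutes / SUPERPOD_TRAIN_MINUTES,
        # Lower bound on the GPU-vs-CPU inference speedup
        "inference_speedup": CPU_LATENCY_MS / GPU_LATENCY_MS,
        # Time left in the 10 ms budget after inference
        "latency_headroom_ms": REALTIME_BUDGET_MS - GPU_LATENCY_MS,
        # BERT-Large parameter count implied by the 24x claim
        "implied_bert_large_params": LARGEST_MODEL_PARAMS / SIZE_RATIO_VS_BERT_LARGE,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the numbers are self-consistent: 16 GPUs per system, a training speedup of roughly 76x from 92 times as many systems, an inference speedup of at least 18x, and an implied BERT-Large size of roughly 346 million parameters.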
For more information about this news, visit www.nvidia.com.