NVIDIA Blackwell GPU Sets New Standard in AI Performance

29-08-2024 | By Jack Pollard

NVIDIA’s Blackwell GPU has made a powerful debut in the latest MLPerf Inference v4.1 benchmarks, setting a new standard in generative AI performance. The Blackwell platform delivered up to 4x more performance on the Llama 2 70B large language model (LLM) compared to its predecessor, the NVIDIA H100 Tensor Core GPU. This breakthrough underscores NVIDIA’s dominance in AI technology, marking a significant milestone for data centre innovation.

Unmatched Generative AI Performance

As enterprises rapidly adopt generative AI, the demand for robust data centre infrastructure is soaring. Training large language models is a complex task, but delivering real-time, LLM-powered services is even more challenging. NVIDIA’s results in the MLPerf Inference v4.1 benchmarks highlight its leadership in overcoming these challenges.

The Blackwell GPU’s debut showcased exceptional performance, particularly on the Llama 2 70B workload. Its success is driven by a second-generation Transformer Engine and FP4 Tensor Cores, which significantly boost the platform’s ability to handle complex AI tasks swiftly and efficiently.
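To see why lower-precision formats such as FP4 matter for a model of this size, a rough back-of-the-envelope calculation helps. The sketch below is an illustration only: it assumes a nominal 70 billion parameters for Llama 2 70B, counts weights alone (no KV cache, activations, or quantisation scaling factors), and uses the hypothetical helper name `weight_memory_gb`. It is not how the MLPerf submissions were configured.

```python
# Approximate weight-only memory footprint at different numeric precisions.
# Assumption: 70e9 parameters (nominal figure for Llama 2 70B).
BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}

LLAMA2_70B_PARAMS = 70e9


def weight_memory_gb(num_params: float, bits: int) -> float:
    """Return the storage needed for the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits / 8 / 1e9


for name, bits in BITS_PER_PARAM.items():
    print(f"{name}: {weight_memory_gb(LLAMA2_70B_PARAMS, bits):.0f} GB")
```

Halving the bits per weight halves the data that has to be stored and moved, which is one reason a 4-bit format can raise inference throughput, provided accuracy targets are still met.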

NVIDIA H200 GPU: A Continued Leader

While Blackwell stole the spotlight, the NVIDIA H200 Tensor Core GPU also achieved remarkable results across all data centre benchmarks. It excelled in the new benchmark based on Mixtral 8x7B, a mixture of experts (MoE) LLM with 46.7 billion total parameters. MoE models are increasingly popular for their efficiency and versatility, and strong results on this workload reinforce NVIDIA’s position in AI inference.

The H200 GPU’s performance, coupled with MoE models, provides enterprises with powerful tools to deploy diverse AI applications quickly and effectively.
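The efficiency of MoE models comes from running only a few experts for each token. The sketch below illustrates top-k routing in plain Python. It assumes 8 experts with 2 selected per token, which matches the publicly described Mixtral 8x7B design but is not stated in the benchmark results above; the function name `top_k_route` and the example logits are hypothetical.

```python
import math


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the indices of the k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Made-up router scores for one token across 8 experts.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
routes = top_k_route(logits)
print(routes)  # only 2 of the 8 experts would be evaluated for this token
```

Because the remaining experts are skipped, the compute per token is much lower than the total parameter count suggests, although all experts must still be held in memory.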

Expanding AI Capabilities with NVLink and NVSwitch

The rise of large language models necessitates powerful multi-GPU solutions. NVIDIA’s NVLink and NVSwitch technologies provide high-bandwidth communication between GPUs based on the NVIDIA Hopper architecture. This supports real-time, cost-effective inference on large models, which is crucial for today’s AI applications.

The upcoming Blackwell platform is set to extend these capabilities, supporting larger NVLink domains with up to 72 GPUs, further advancing AI innovation.

Industry Support and Continuous Innovation

NVIDIA’s success in the latest MLPerf benchmarks was bolstered by strong submissions from partners like Cisco, Dell Technologies, and Hewlett Packard Enterprise. This highlights the widespread adoption and availability of NVIDIA’s cutting-edge AI platforms.

NVIDIA’s relentless commitment to software innovation is evident in the performance improvements across its platforms. The NVIDIA Triton Inference Server, part of the NVIDIA AI platform, matched the performance of bare-metal submissions in this round, showing that enterprises no longer have to choose between feature-rich AI servers and peak throughput performance.

Bringing AI to the Edge with NVIDIA Jetson

NVIDIA’s advancements extend beyond the data centre to the edge, where real-time insights are crucial. The NVIDIA Jetson platform achieved significant improvements in the latest MLPerf benchmarks, with Jetson AGX Orin system-on-modules delivering a throughput improvement of more than 6.2x on the GPT-J LLM workload compared with the previous round. This allows developers to deploy versatile AI models that transform sensor data into actionable insights more efficiently.

Conclusion: NVIDIA Leads the AI Revolution

The MLPerf Inference v4.1 benchmarks underscore NVIDIA’s unmatched leadership in AI performance, from data centres to the edge. The debut of the Blackwell GPU, along with the continued success of the H200 GPU, signals a new era of AI capabilities, empowering enterprises to innovate and scale like never before.

To learn more about NVIDIA’s latest AI advancements, visit the NVIDIA technical blog.

By Jack Pollard

Jack has spent over a decade in media within the electronics industry and is extremely passionate about working with companies to create interesting and educational content, from podcasts and video to written articles for engineers and buyers.