Nvidia submits Blackwell Ultra performance results for MLPerf Inference v6.0 benchmarks

Wednesday, April 1, 2026 at 09:35 PM

Nvidia has reportedly submitted benchmark results for its Blackwell Ultra hardware to the MLPerf Inference v6.0 suite, claiming a performance lead over competing hardware in AI inference tasks.

Context

On April 1, 2026, Nvidia announced record-setting results in the MLPerf Inference v6.0 benchmarks, led by its Blackwell Ultra architecture. Using GB300 NVL72 rack-scale systems totaling 288 Blackwell Ultra GPUs, the company set new system-level throughput records on frontier models such as DeepSeek-R1 and Llama 3.1 405B. A combination of hardware enhancements and TensorRT-LLM software updates delivered up to a 2.7x performance gain on DeepSeek-R1 over previous rounds, reaching more than 2.5 million tokens per second at scale.

The results indicate that Nvidia retains a substantial lead over competitors such as AMD and Intel, which also released v6.0 results the same day. While AMD passed the 1-million-tokens-per-second threshold with its Instinct MI355X, Nvidia's scale-out results underscore its vertical integration of silicon, software, and networking. The benchmarks are a key signal for AI factory operators and cloud providers such as Nebius and Lambda, which are deploying this hardware to improve the unit economics of large language model inference.
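
To make the "unit economics" point concrete, the headline figures can be turned into a rough per-GPU throughput and a cost per million tokens. The sketch below is illustrative only: it assumes the 2.5 million tokens per second figure applies to the full 288-GPU system, and the hourly GPU price is a hypothetical placeholder, not a figure from the submission or from any provider.

```python
def per_gpu_throughput(system_tokens_per_s: float, gpu_count: int) -> float:
    """Average tokens per second per GPU for a system-level throughput figure."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return system_tokens_per_s / gpu_count


def cost_per_million_tokens(tokens_per_s_per_gpu: float,
                            gpu_hourly_cost_usd: float) -> float:
    """USD cost to generate one million tokens on a single GPU."""
    if tokens_per_s_per_gpu <= 0:
        raise ValueError("tokens_per_s_per_gpu must be positive")
    tokens_per_hour = tokens_per_s_per_gpu * 3600
    return gpu_hourly_cost_usd / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Figures quoted in the article; pairing them is an assumption.
    system_tokens_per_s = 2_500_000
    gpu_count = 288

    # Hypothetical hourly price per GPU, chosen only for illustration.
    assumed_gpu_hourly_cost_usd = 3.00

    per_gpu = per_gpu_throughput(system_tokens_per_s, gpu_count)
    cost = cost_per_million_tokens(per_gpu, assumed_gpu_hourly_cost_usd)
    print(f"Per-GPU throughput: {per_gpu:,.0f} tokens/s")
    print(f"Cost per million tokens: ${cost:.3f}")
```

Under these assumptions, the cost per token scales inversely with throughput, which is why a benchmark gain at fixed hardware cost translates directly into cheaper inference.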

Related Companies

Nvidia (NVDA, US)