News

Nvidia announces Nemotron 3 Super optimized for Blackwell architecture

Wednesday, March 11, 2026 at 11:59 PM

Nvidia has announced Nemotron 3 Super, a 120B-parameter hybrid SSM latent Mixture-of-Experts (MoE) model optimized specifically for the Blackwell architecture, offering improved performance and speed over its predecessors.

Context

Nvidia has officially launched Nemotron 3 Super, a 120-billion-parameter model designed specifically to exploit the hardware advantages of the Blackwell architecture. As a hybrid Mixture-of-Experts (MoE) and Mamba-Transformer model, it activates only 12 billion parameters per inference step, maintaining high reasoning accuracy while drastically reducing compute costs. The release marks the first use of NVFP4 pretraining, a low-precision format that enables up to 4x faster inference on NVIDIA B200 systems compared to previous-generation FP8 performance on Hopper GPUs. The model is positioned for agentic AI workloads, featuring a 1-million-token context window and Multi-Token Prediction (MTP) for accelerated speculative decoding. By open-sourcing the weights and training recipes, Nvidia is providing the software layer needed to drive immediate enterprise adoption of its Blackwell hardware. Early integrations are already live on platforms such as Perplexity, Together AI, and CoreWeave, targeting high-volume tasks such as automated code review and complex research orchestration.
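
The sparse activation described above (roughly 12B active out of 120B total parameters) is the defining property of MoE inference: a router sends each token to only a few experts, so per-token compute scales with the active parameter count rather than the full model size. Below is a minimal toy sketch of top-k expert routing in Python; the expert count, top-k value, and layer sizes are illustrative assumptions and do not reflect Nemotron 3 Super's actual architecture or Nvidia's implementation.

```python
# Toy sketch of sparse Mixture-of-Experts routing (illustrative only, not Nvidia's code).
# With 10 experts and top-1 routing, each token uses ~1/10 of the expert parameters,
# loosely mirroring the 12B-active / 120B-total ratio described above.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 64    # toy hidden size (assumption)
N_EXPERTS = 10  # total experts (assumption)
TOP_K = 1       # experts activated per token (assumption)

# Each expert is a simple linear map; the router scores experts per token.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router                            # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]  # chosen expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = np.exp(logits[t, top[t]])
        weights /= weights.sum()                   # softmax over the selected experts
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ experts[e])      # only the routed experts execute
    return out

tokens = rng.standard_normal((4, D_MODEL))
print(moe_layer(tokens).shape)  # (4, 64): full-width output from a fraction of the experts
```

Because only the routed experts run for a given token, the FLOPs per token track the active parameter count rather than the full 120B, which is where the compute savings cited above come from.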

Related Companies

Nvidia (NVDA, US)