Nvidia reportedly cancels Rubin CPX to transition inference decoding to LPUs

Thursday, February 26, 2026 at 08:17 PM

Rumors suggest that Nvidia has canceled the Rubin CPX project and is shifting to a strategy that offloads inference decoding to LPUs. The transition would likely require significant CUDA modifications and pre-compilation adjustments, and would position Rubin as a testbed for the new architecture.

Context

Recent reports suggest Nvidia has canceled its Rubin CPX GPU, a specialized processor unveiled in September 2025 for massive-context inference. Instead, the company is reportedly shifting inference decoding to LPUs (Language Processing Units). The pivot follows Nvidia's December 2025 licensing agreement with Groq, under which key members of the Groq team joined Nvidia to scale their high-performance, low-cost inference technology.

The move marks a significant architectural shift as Nvidia prepares the broader Rubin platform for its second-half 2026 release. By offloading decoding to dedicated LPUs, Nvidia aims to optimize the Rubin stack for the growing "reasoning token" market. The strategy likely replaces the hardware-heavy CPX approach with a more flexible, software-integrated one built on Nvidia Dynamo and licensed Groq IP. Although the change is likely to require major CUDA modifications and pre-compilation, it is expected to support the platform's target of a 10x reduction in inference token cost relative to the previous Blackwell generation.

Sources (7)

NVIDIA Kicks Off the Next Generation of AI With Rubin — Six New Chips, One Incredible AI Supercomputer | NVIDIA Newsroom
Inside the NVIDIA Vera Rubin Platform: Six New Chips, One AI Supercomputer | NVIDIA Technical Blog
Groq and Nvidia Enter Non-Exclusive Inference Technology Licensing Agreement to Accelerate AI Inference at Global Scale | Groq is fast, low cost inference.
Discover AI Inference Solutions - NVIDIA
First look at Nvidia's AI system Vera Rubin and how it beats Blackwell
NVIDIA GTC 2025 - Built For Reasoning, Vera Rubin, Kyber, CPO, Dynamo Inference, Jensen Math, Feynman
NVIDIA GTC 2026: Rubin GPU Specs, Keynote, and AI Chip Analysis

Related Companies

Nvidia

NVDA