Nvidia reportedly cancels Rubin CPX to transition inference decoding to LPUs
Thursday, February 26, 2026 at 08:17 PM

Rumors suggest that Nvidia has canceled the Rubin CPX project, shifting to a strategy that offloads inference decoding to LPUs. This transition likely requires significant CUDA modifications and pre-compilation adjustments, positioning Rubin as a testbed for this new architecture.

Context

Recent reports suggest Nvidia has canceled its Rubin CPX GPU, a specialized processor unveiled in September 2025 for massive-context inference. Instead, the company is reportedly shifting inference decoding to LPUs (Language Processing Units). The pivot follows Nvidia's December 2025 licensing agreement with Groq, under which key members of the Groq team joined Nvidia to scale their high-performance, low-cost inference technology.

The move marks a significant architectural shift as Nvidia prepares the broader Rubin platform for its second-half 2026 release. By offloading decoding to dedicated LPUs, Nvidia aims to optimize the Rubin stack for the burgeoning "reasoning token" market. The strategy likely replaces the hardware-heavy CPX approach with a more flexible, software-integrated solution built on Nvidia Dynamo and licensed Groq IP.

While the change would likely require major CUDA modifications and pre-compilation, it is expected to advance the platform's target of a 10x reduction in inference token costs compared with the previous Blackwell generation.

Related Companies

Nvidia
NVDA
US