News

Groq 3 LPU featuring Samsung memory and optimized Ethernet latency expected to ship in late 2026

Monday, March 16, 2026 at 07:40 PM

Groq is preparing to ship its Groq 3 LPU in the second half of 2026, with Q3 as the target. The hardware uses an 8-way system design with 4 GB of SRAM and a specialized Ethernet mode intended to cut latency by 50% for FP8 operations. The system is designed for decoding tasks and reportedly incorporates Samsung LP4X memory components.

Context

The upcoming Groq 3 LPU, scheduled to ship in late 2026, represents a significant expansion of the partnership between Groq and Samsung Electronics. The new inference accelerator will use Samsung LP4X memory and feature an architecture built around 4 GB of SRAM per 8-way system. This iteration is designed specifically for high-speed decoding and employs a special FP8 mode over Ethernet to reduce network latency by 50%. By shifting more control of data flow to the compiler so that it is deterministic, the Groq 3 aims to eliminate the non-deterministic bottlenecks common in traditional GPU clusters. The development follows the high-profile integration of Groq technology into NVIDIA's Rubin platform, signaling a shift toward hybrid architectures that prioritize agentic AI and multi-agent systems. For investors, the late 2026 launch timeline positions the Groq 3 as a bridge between current-generation inference and the trillion-parameter models expected to dominate the market by 2027.

Related Companies

Nvidia (NVDA, US)