News

Groq 3 LPU featuring Samsung memory and optimized Ethernet latency expected to ship in late 2026

Monday, March 16, 2026 at 07:40 PM

Groq is preparing to ship its Groq 3 LPU in the second half of 2026, with Q3 as the target. The hardware uses an 8-way system design with 4GB of SRAM and a specialized Ethernet mode that reduces latency by 50% for FP8 operations. The system is designed for decoding tasks and reportedly incorporates Samsung LP4X memory components.

Context

The upcoming Groq 3 LPU, scheduled to ship in late 2026, represents a significant expansion of the partnership between Groq and Samsung Electronics. The new inference accelerator will reportedly use Samsung LP4X memory and is built around 4GB of SRAM per 8-way system. This iteration is designed specifically for high-speed decoding and employs a special FP8 mode over Ethernet to cut network latency by 50%. By shifting more control to the compiler for deterministic data flow, the Groq 3 aims to eliminate the non-deterministic bottlenecks common in traditional GPU clusters. This development follows the high-profile integration of Groq technology into NVIDIA's Rubin platform, signaling a shift toward hybrid architectures that prioritize agentic AI and multi-agent systems. For investors, the late 2026 launch timeline marks a critical bridge between current-generation inference and the trillion-parameter models expected to dominate the market by 2027.

Sources (7)

NVIDIA Groq 3 LPX - AI Inference Accelerator

NVIDIA Vera Rubin Opens Agentic AI Frontier | NVIDIA Newsroom

Nvidia Groq 3 LPU and Groq LPX racks join Rubin platform at GTC — SRAM-packed accelerator boosts 'every layer of the AI model on every token' | Tom's Hardware

What is a Language Processing Unit? | Groq is fast, low cost inference.

GitHub - sreenathmmenon/llmswap: Universal AI CLI & Python SDK for 8+ providers (OpenAI, Claude, Gemini, Cohere, Perplexity, IBM watsonx, Groq, Together AI). Multi-provider chat, code generation, cost optimization, age-appropriate AI. Claude Code/Gemini CLI alternative with zero vendor lock-in.

The Architecture of Groq's LPU - by Abhinav Upadhyay

Groq's Deterministic Architecture is Rewriting the Physics of AI Inference | HackerNoon

Related Companies

Nvidia

NVDA