News
OpenAI taps Cerebras infrastructure for Codex model inference
Friday, February 13, 2026 at 05:38 PM
OpenAI is running its Codex model on Cerebras Systems infrastructure, positioning the wafer-scale hardware as a viable alternative to Nvidia GPUs for AI inference.
Context
OpenAI has officially launched GPT-5.3-Codex-Spark, its first major model powered by Cerebras Systems hardware rather than Nvidia GPUs. Released on February 12, 2026, the new coding model runs on the Wafer Scale Engine 3 to achieve inference speeds exceeding 1,000 tokens per second, a 15x performance leap over traditional GPU clusters. The gain specifically targets real-time developer workflows, where latency has previously been a significant bottleneck for the Codex product line.
The rollout is the first milestone of a multi-year partnership valued at over $10 billion, under which OpenAI will integrate 750 megawatts of Cerebras compute capacity through 2028. While Nvidia remains foundational for model training, OpenAI is aggressively diversifying its infrastructure to reduce vendor dependency and improve response times. For investors, the move validates Cerebras as a serious second option for AI inference, signaling a strategic shift toward specialized silicon for high-speed, real-time AI applications.
Related Companies
Nvidia
NVDA