News

Nvidia Chief Scientist Bill Dally says 90% of data center power is dedicated to AI inference

Friday, March 27, 2026 at 03:29 AM

Nvidia Chief Scientist Bill Dally stated that inference now accounts for 90% of power consumption in AI data centers, highlighting a significant shift in infrastructure demand toward operational deployment rather than model training alone.

Context

During a conversation at GTC 2025, NVIDIA Chief Scientist Bill Dally revealed that 90% of the power consumed in data centers is now dedicated to AI inference. This marks a major transition in the semiconductor landscape: the energy demand for running deployed AI models has far surpassed the power required for initial model training. The trend is driving NVIDIA to prioritize energy efficiency and specialized hardware architectures to sustain the rapid scaling of global AI infrastructure.

The energy intensity of inference underscores the urgency of the next-generation Rubin platform, which is expected to arrive in the second half of 2026. As NVIDIA and partners like OpenAI plan massive 10-gigawatt data center deployments, managing this power consumption becomes a critical bottleneck. Dally's remarks suggest that future GPU designs, including the upcoming Vera Rubin NVL144, will be fundamentally shaped by the need to lower total cost of ownership through radical improvements in inference-per-watt performance.

Related Companies

Nvidia
NVDA
US