News

Nvidia Chief Scientist Bill Dally says 90% of data center power is dedicated to AI inference

Friday, March 27, 2026 at 03:29 AM

Nvidia Chief Scientist Bill Dally stated that inference now accounts for 90% of power consumption in AI data centers, highlighting a significant shift in infrastructure demand toward operational deployment rather than model training alone.

Context

During a conversation at GTC 2025, NVIDIA Chief Scientist Bill Dally revealed that 90% of the power consumed in data centers is now dedicated to AI inference. This marks a major transition in the semiconductor landscape: the energy demand for running deployed AI models has far surpassed the power required for initial model training. The trend is driving NVIDIA to prioritize energy efficiency and specialized hardware architectures to sustain the rapid scaling of global AI infrastructure.

The energy intensity of inference underscores the urgency of the next-generation Rubin platform, which is expected to arrive in the second half of 2026. As NVIDIA and partners like OpenAI plan massive 10-gigawatt data center deployments, managing this power consumption becomes a critical bottleneck. Dally's remarks suggest that future GPU designs, including the upcoming Vera Rubin NVL144, will be fundamentally shaped by the need to lower total cost of ownership through radical improvements in inference-per-watt performance.

Related Companies

Nvidia
NVDA
US