Enterprise GPU fleets idle at 5% utilization as infrastructure costs surge to 41% of budgets
Corporate AI deployments are running GPU clusters at just 5 percent of capacity on average, while infrastructure's share of AI budgets rose seven percentage points year over year, according to new enterprise survey data.

Companies that rushed to build GPU capacity after ChatGPT's debut are now sitting on hardware that is idle roughly 95 percent of the time, even as their total infrastructure costs have climbed from 34 percent to 41 percent of AI budgets.
The utilization figure comes from enterprise deployment data released this week, tracking organizations that bought large GPU fleets during the 2023–2024 compute crunch. The gap between capacity and actual workload has turned what was supposed to be a strategic advantage, owning the hardware outright, into a fixed-cost liability. Inference expenses and operational overhead now consume about two-fifths of enterprise AI spending, up seven percentage points from the prior year.
The mismatch stems from uneven demand patterns that early buyers didn't anticipate. Enterprises provisioned for peak loads that rarely materialize, leaving clusters waiting for work most hours of the day. Scheduling, routing, and governance tooling hasn't caught up: most organizations still allocate GPUs manually or through coarse job queues that can't pack workloads efficiently. Energy access and cooling requirements add to the operational burden, especially for on-premises deployments that can't scale down when idle.
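To illustrate what packing workloads more tightly can mean in practice, here is a minimal Python sketch of first-fit-decreasing bin packing, a standard heuristic. It is not drawn from the survey: the function name and the job demands are hypothetical, and it assumes each job's need can be expressed as a percentage of one GPU and that jobs can share a device.

```python
def pack_first_fit_decreasing(demands, capacity=100):
    """Place jobs on as few GPUs as the heuristic can manage.

    demands:  per-job need, as a percentage of one GPU (hypothetical figures).
    capacity: percentage available on each GPU (assumed identical devices).
    Returns (gpus_used, placement), where placement is a list of
    (demand, gpu_index) pairs in the order the jobs were placed.
    """
    free = []        # remaining capacity on each GPU already in use
    placement = []
    # Placing the largest jobs first tends to leave fewer unusable gaps.
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"job needing {demand} exceeds one GPU")
        for gpu, room in enumerate(free):
            if demand <= room:
                free[gpu] = room - demand
                placement.append((demand, gpu))
                break
        else:
            # No GPU in use has room, so bring another one online.
            free.append(capacity - demand)
            placement.append((demand, len(free) - 1))
    return len(free), placement


# Six hypothetical jobs that a one-job-per-GPU queue would spread over six GPUs.
jobs = [50, 30, 20, 70, 10, 20]
gpus_used, placement = pack_first_fit_decreasing(jobs)
print(gpus_used)  # 2
```

In this made-up case the six jobs fit on two fully loaded GPUs instead of six mostly idle ones. Real schedulers also have to account for memory limits, job duration, and isolation between tenants, which this sketch ignores.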
The trend suggests that the initial land-grab for compute capacity has given way to a harder optimization problem: how to actually use what you bought. Organizations that locked in hardware during the shortage now face the choice of improving orchestration software, consolidating workloads, or accepting that a large fraction of their capital sits dark.