A potential shift in the nature of workloads will filter down to the wider data center industry, affecting how facilities are built and where they are located.
Bigger data centers, hotter racks
Digital Realty’s CEO Andy Power believes that generative AI will lead to “a monumental wave of demand.
“It's still new as to how it plays out in the data center industry, but it's definitely going to be large-scale demand. Just do the math on these quotes of spend and A100 chips and think about the gigawatts of power required for them.”
When he joined the business nearly eight years ago “we were moving from one to three megawatt IT suites, and we quickly went to six to eight, then tens,” he recalled. “I think the biggest building we built was 100MW over several years. And the biggest deals we'd sign were 50MW-type things. Now you're hearing some more deals in the hundreds of megawatts, and I've had preliminary conversations in the last handful of months where customers are saying ‘talk to me about a gigawatt.’”
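To make Power's "do the math" remark concrete, here is a rough back-of-envelope sketch of how a GPU count translates into facility power. The 400W figure is Nvidia's published TDP for the A100 SXM module; the server overhead multiplier (CPUs, memory, networking, fans) and the PUE value are illustrative assumptions, not figures from Digital Realty or any operator quoted here, and the function names are hypothetical.

```python
# Back-of-envelope estimate of data center power for a GPU fleet.
# All default multipliers below are illustrative assumptions.

A100_TDP_WATTS = 400  # published TDP for the Nvidia A100 SXM module


def facility_power_mw(gpu_count, gpu_watts=A100_TDP_WATTS,
                      server_overhead=1.5, pue=1.3):
    """Estimate total facility power in megawatts.

    server_overhead: assumed multiplier for the rest of the server
                     (CPUs, memory, NICs, fans) on top of GPU draw.
    pue:             assumed power usage effectiveness, i.e. total
                     facility power divided by IT power.
    """
    it_watts = gpu_count * gpu_watts * server_overhead
    return it_watts * pue / 1e6


def gpus_for_power(facility_mw, gpu_watts=A100_TDP_WATTS,
                   server_overhead=1.5, pue=1.3):
    """Invert the estimate: how many GPUs fit in a given power budget."""
    watts_per_gpu = gpu_watts * server_overhead * pue
    return int(facility_mw * 1e6 / watts_per_gpu)


if __name__ == "__main__":
    print(f"100,000 GPUs: {facility_power_mw(100_000):.0f} MW")
    print(f"1 GW budget: {gpus_for_power(1000):,} GPUs")
```

Under these assumed multipliers, 100,000 A100s come to about 78MW, and a one-gigawatt site would correspond to roughly 1.28 million GPUs. Newer, hotter chips or different overhead assumptions would shift the numbers considerably, but the order of magnitude shows why conversations have moved from tens of megawatts to a gigawatt.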
For training AI models, Power believes that we’ll see a shift away from the traditional cloud approach, which focuses on splitting up workloads across multiple regions while keeping them close to the end user.
“Given the intensity of compute, you can’t just break these up and patchwork them across many geographies or cities,” he said. At the same time, “you're not going to put this out in the middle of nowhere, because of the infrastructure and the data exchange.”
These facilities will still need close proximity to other data centers with more traditional data and workloads, but “the proximity and how close that AI workload needs to sit relative to cloud and data is still an unknown.”
He believes that it “will still be very major metro focused,” which will prove a challenge because “you’re going to need large swaths of contiguous land and power, but it’s harder and harder to find a contiguous gigawatt of power,” he said, pointing to transmission challenges in Virginia and elsewhere.
As for the data centers themselves, “plain and simple, it's gonna be a hotter environment, you're just going to put a lot more power-dense servers in and you're gonna need to innovate your existing footprints, and your design for new footprints,” he said.
“We've been innovating for our enterprise customers in terms of looking at liquid cooling. It's been quite niche and trial, to be honest with you,” he said. “We've also been doing co-design with our hyperscale customers, but those have been exceptions, not the norms. I think you're gonna see a preponderance of more norms.”
Specialized buildings
Moving forward, he believes that “you'll have two buildings that will be right next to each other and one will be supporting hybrid cloud. And then you have another one next to it that is double or triple the size, with a different design, and a different cooling infrastructure, and a different power density.”
Amazon agrees that large AI models will need specialized facilities. “Training needs to be clustered, and you need to have really, really large and deep pools of a particular capacity,” AWS’ Chetan Kapoor said.
“The strategy that we have been executing over the last few years, and we're going to double down on, is that we're going to pick a few data centers that are tied to our main regions, like Northern Virginia (US-East-1) or Oregon (US-West-2) as an example, and build really large clusters with dedicated data centers. Not just with the raw compute, but also couple it with storage racks to actually support high-speed file systems.”
On the training side, the company will have specialized cluster deployments. “And you can imagine that we're going to rinse and repeat across GPUs and Trainium,” Kapoor said. “So there'll be dedicated data centers for H100 GPUs. And there'll be dedicated data centers for Trainium.”
Things will be different on the inference side, where it will be closer to the traditional cloud model. “The requests that we're seeing is that customers need multiple availability zones, they need support in multiple regions. That's where some of our core capability around scale and infrastructure for AWS really shines. A lot of these applications tend to be real-time in nature, so having the compute as close as possible to the user becomes super, super important.”
However, the company does not plan to follow the same dense server rack approach of its cloud competitors.
“Instead of packing in a lot of compute into a single rack, what we're trying to do is to build infrastructure that is scalable and deployable across multiple regions, and is as power-efficient as possible,” Kapoor said. “If you're trying to densely pack a lot of these servers, the cost is going to go up, because you'll have to come up with really expensive solutions to actually cool it.”
Google’s Vahdat agreed that we will see specific clusters for large-scale training, but noted that over the longer term it may not be as segmented. “The interesting question here is, what happens in a world where you're going to want to incrementally refine your models? I think that the line between training and serving will become somewhat more blurred than the way we do things right now.”
Comparing it to the early days of the Internet, where search indexing was handled by a few high-compute centers but is now spread across the world, he noted: “We blurred the line between training and serving. You're gonna see some of that moving forward with this.”
Where and how to build
While this new wave of workload risks leaving some businesses in its wake, Digital Realty’s CEO sees this moment as a “rising tide to raise all ships, coming as a third wave when the second and first still haven't really reached the shore.”
The first two waves were customers moving from on-prem to colocation, and then to cloud services delivered from hyperscale wholesale deployments.
That’s great news for the industry, but it comes after years of the sector struggling to keep up. “Demand keeps out-running supply, [the industry] is bending over coughing at its knees because it's out of gas,” Power said. “The third wave of demand is not coming at a time that is fortuitous for it to be easy streets for growth.”
For all its hopes of solving or transcending the challenges of today, the growth of generative AI will be held back by the wider difficulties that have plagued the data center market: the problems of scale.
How can data center operators rapidly build out capacity at a faster and larger scale, consuming more power, land, and potentially water, ideally all while using renewable resources and not causing emissions to balloon?
“Power constraints in Northern Virginia, environmental concerns, moratoriums, nimbyism, supply chain problems, worker talent shortages, and so on,” Power said, listing the external problems.
“And that ignores the stuff that goes into the data centers that the customer owns and operates. A lot of these things are long lead times,” with GPUs currently hard for even hyperscalers to acquire, causing rationing.
“The economy has been running hot for many years now,” Power said. “And it's gonna take a while to replenish a lot of this infrastructure, bringing transmission lines into different areas. And it is a massive interwoven, governmental, local community effort.”
While AI researchers and chip designers face the scale challenges of parameter counts and memory allocation, data center builders and operators will have to overcome their own scaling bottlenecks to meet the demands of generative AI.
“We'll continue to see bigger milestones that will require us to have compute not become the deterrent for AI progress and more of an accelerant for it,” Microsoft’s Nidhi Chappell said. “Even just looking at the roadmap that I am working on right now, it's amazing, the scale is unprecedented. And it's completely required.”