Amazon Web Services will be the first cloud company to offer Nvidia’s GH200 Grace Hopper Superchips on its service.
The company will offer the combined CPU-GPU superchip in Amazon Elastic Compute Cloud (Amazon EC2) instances connected via Amazon's Elastic Fabric Adapter (EFA) networking, supported by AWS Nitro System virtualization and Amazon EC2 UltraClusters hyperscale clustering.
AWS will also offer the GH200 in its NVL32 rack configuration through Nvidia DGX Cloud, Nvidia's own service that it runs on top of other cloud providers.
Under that model, hyperscalers essentially lease Nvidia's servers, deploying them as a cloud within their cloud that Nvidia can market and sell to enterprises looking for large GPU supercomputers.
Google, Microsoft, and Oracle previously announced they would support DGX Cloud, but AWS reportedly held off. The other hyperscalers have not promoted the DGX Cloud service on their own websites.
“What makes this DGX Cloud announcement special is that this will be the first DGX Cloud powered by Nvidia's Grace Hopper,” Ian Buck, Nvidia's VP of hyperscale and HPC, said. “It is a new rack-scale GPU architecture for the era of generative AI.”
The GH200 in the NVL32 rack architecture provides the largest shared memory available in a single instance on a cloud service, Nvidia said, enough to support large language models exceeding 1 trillion parameters.
At the same AWS re:Invent conference, the two companies announced “Project Ceiba,” a plan to build the world's largest cloud AI supercomputer on AWS for use by Nvidia's internal teams.