Nvidia has announced an updated version of its GH200 'superchip,' which will be the first GPU in the world to include HBM3e memory.
The dual configuration delivers 'up to' 3.5× more memory capacity and 3× more bandwidth than the current HBM3-based generation, which was announced in May and has yet to launch. It comprises a single server with 144 Arm Neoverse cores, eight petaflops of AI performance, and 282GB of HBM3e memory.
“To meet surging demand for generative AI, data centers require accelerated computing platforms with specialized needs,” Jensen Huang, founder and CEO of Nvidia, said.
“The new GH200 Grace Hopper Superchip platform delivers this with exceptional memory technology and bandwidth to improve throughput, the ability to connect GPUs to aggregate performance without compromise, and a server design that can be easily deployed across the entire data center.”
Other than the memory upgrade, the GH200 platform appears unchanged from the previous generation. HBM3e is the latest high-bandwidth memory format, offering greater bandwidth and higher-capacity 24GB stacks. That has allowed Nvidia to boost local GPU memory from 96GB per GPU to 144GB in a single configuration, capacity that large language models and other generative AI workloads demand.
The GH200 GPU with HBM3e provides up to 50 percent faster memory performance than the previous generation, and up to 10TB/s of combined bandwidth (5TB/s per chip).
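The headline figures can be sanity-checked with simple arithmetic. A rough sketch follows; note that the six-stack count per GPU is an assumption inferred from the 24GB stack size, not something Nvidia has stated:

```python
# Back-of-the-envelope check of the announced GH200 HBM3e figures.
# Assumption (not in the announcement): six 24GB HBM3e stacks per GPU.
stack_capacity_gb = 24
stacks_per_gpu = 6           # assumed
gpus_in_dual_config = 2
bandwidth_per_chip_tb_s = 5  # TB/s per chip, per the announcement

physical_capacity_per_gpu = stack_capacity_gb * stacks_per_gpu   # 144 GB
dual_bandwidth = gpus_in_dual_config * bandwidth_per_chip_tb_s   # 10 TB/s

print(f"Per-GPU physical capacity: {physical_capacity_per_gpu} GB")
print(f"Dual-config bandwidth: {dual_bandwidth} TB/s")
```

Note that the announced dual-configuration capacity of 282GB works out to 141GB per GPU, slightly below the 144GB physical figure, which suggests a small amount of capacity is reserved rather than user-visible.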
The GPU designer said it expects the new chip to launch in Q2 2024, while the original HBM3-based GH200 is currently in full production and expected to launch by the end of the year. Both are compatible with Nvidia's MGX server specification.