Server maker Gigabyte Technology has launched two liquid-cooled units designed for HPC and AI training.
The G262-ZL0 and G492-ZL2 use Nvidia HGX A100 accelerators and AMD Epyc 7003 processors, which can be run at their limits thanks to liquid cooling from specialist vendor CoolIT.
HPC and AI training both need high-density servers, which generate a lot of heat. Liquid cooling is increasingly used to remove that heat and prevent server downtime in dense data centers.
CoolIT makes cooling systems that replace CPU and GPU heatsinks with cold plates through which coolant circulates, along with the plumbing to distribute the coolant within the racks.
Gigabyte says the use of the Nvidia HGX A100 platform is significant, as these chips use the new Nvidia Magnum IO GPUDirect technologies, which offload more work from the CPU for a performance boost. The chips support direct data exchange between GPUs and third-party devices such as NICs or storage adapters. The systems also support GPUDirect Storage, which provides a direct data path from storage to GPU memory and relieves the CPU of that work. For high-speed interconnects, the servers incorporate Nvidia's NVLink and use NVSwitch for peer-to-peer communication.
The G262-ZL0 is a 2U GPU-centric server while the G492-ZL2 is a 4U GPU-centric server.
The new servers use two chambers to separate the GPU baseboard from the other components, including CPUs, RAM, and storage.