AMD has launched ROCm 6.2, the latest version of the company’s open source software stack.

First launched in 2016, ROCm consists of drivers, development tools, compilers, libraries, and APIs to support programming for generative AI and HPC applications on AMD GPUs.

Dr. Lisa Su, AMD CEO – Sebastian Moss

In a blog post, AMD said ROCm 6.2 offers five key enhancements over previous versions, including extended support for the vLLM inference engine, enhanced AI training and inference on AMD Instinct accelerators, and broader FP8 support.

The broader FP8 support means that workloads running on AMD's MI300A and MI300X accelerators can take advantage of the lower-precision FP8 format. Launched in early December 2023, the MI300 series is designed to train and run large language models. AMD claims the chips are the highest-performance accelerators in the world for generative AI; the MI300X offers 2.6 petaflops of FP8 performance, which AMD claims exceeds the speed of Nvidia's H100 chips.

The expanded FP8 ecosystem on ROCm 6.2 includes Transformer Engine with FP8 GEMM support in PyTorch and JAX via hipBLASLt; vLLM integration; JAX and Flax support via XLA; RCCL and MIOpen support; and standardized FP8 headers across libraries to simplify development and integration.
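To make the format concrete: FP8 in the common E4M3 layout spends one sign bit, four exponent bits, and three mantissa bits, so each power-of-two interval holds only eight representable steps and finite values top out at 448. The following is a minimal, illustrative Python sketch of an E4M3-style rounder; it is not ROCm's implementation, just a demonstration of how coarsely FP8 quantizes values:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest value representable in an E4M3-style FP8
    format: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits,
    largest finite value 448. Illustrative only."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)        # saturate at the E4M3 maximum
    e = math.floor(math.log2(mag))  # power-of-two interval containing mag
    e = max(e, -6)                  # below 2**-6, values are subnormal
    scale = 2.0 ** (e - 3)          # 3 mantissa bits -> 8 steps per interval
    q = round(mag / scale) * scale
    return sign * min(q, 448.0)
```

For example, 0.3 rounds to 0.3125 and anything above 448 saturates; at scale, accepting that coarseness is what halves memory traffic relative to FP16 and roughly doubles GEMM throughput on hardware with native FP8 units.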

Bitsandbytes quantization library support has also been added to the platform, which the company says “revolutionizes AI development” by “significantly boosting memory efficiency and performance” on AMD Instinct accelerators.
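The memory savings come from 8-bit quantization: bitsandbytes scales a block of weights by its absolute maximum, rounds to int8, and stores one byte per value instead of two or four. A pure-Python sketch of that absmax scheme (illustrative only, not the library's kernel code):

```python
def absmax_quantize(values):
    """Quantize floats to int8 via absmax scaling: map the largest
    magnitude to 127 and round everything else proportionally.
    Returns (int8 codes, scale factor needed to dequantize)."""
    scale = max(abs(v) for v in values) / 127.0
    codes = [round(v / scale) for v in values]
    return codes, scale

def absmax_dequantize(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]

weights = [0.3, -1.0, 0.25]
codes, scale = absmax_quantize(weights)   # codes fit in one byte each
recovered = absmax_dequantize(codes, scale)
```

In practice, bitsandbytes' LLM.int8() method additionally keeps outlier feature dimensions in 16-bit precision, which is what preserves model quality at this compression level.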

The ROCm installation process has also been simplified via the addition of an Offline Installer Creator, reducing the installation tool to a single installer file and automating post-installation tasks.

Finally, new performance monitoring and optimization tools – dubbed Omniperf and Omnitrace – have been rolled out by AMD. Both are still in beta: Omniperf delivers GPU kernel analysis for fine-tuning, while Omnitrace provides a holistic view of system performance across CPUs, GPUs, NICs, and network fabrics.

AMD CPU demand takes chunk out of data center market

Last month, AMD revealed that during the second quarter of 2024, the company had sold more than $1 billion worth of its MI300X AI chips, leading CEO Dr. Lisa Su to upgrade its data center GPU revenue forecast from $4 billion to $4.5 billion for 2024.

Following Su’s comments that sales of the company’s Nvidia-alternative chips had been “higher than expected,” new research from Mercury Research shows that the company’s CPU product segment has also experienced healthy growth over the past five years.

According to the report, demand for AMD’s Epyc CPUs has grown the company’s share of the data center CPU market from 2.9 percent in 2019 to 24.1 percent in 2024. Year-over-year growth hit 6.6 percent, which Mercury attributed to the expansion of cloud deployments and increasing enterprise momentum.