Oracle Cloud Infrastructure (OCI) has made Nvidia L40S GPU bare-metal instances available to its customers.

Announced in an Nvidia blog post, the instances are available to order and have been launched alongside plans for a new virtual machine accelerated by a single Nvidia H100 Tensor Core GPU.

The L40S is a data center GPU designed for generative AI, graphics and video applications. It has fourth-generation tensor cores and can support the FP8 data format. According to Nvidia, a single L40S GPU (FP8) can generate up to 1.4x more tokens per second than a single Nvidia A100 Tensor Core GPU (FP16) for Llama 3 8B with Nvidia TensorRT-LLM at an input and output sequence length of 128.

OCI will offer the L40S GPUs in its BM.GPU.L40S.4 bare-metal compute shape, which has four L40S GPUs, each with 48GB of GDDR6 memory. It also includes local NVMe drives with 7.38TB of capacity, fourth-generation Intel Xeon CPUs with 112 cores, and 1TB of system memory.

The instances are also available in OCI Supercluster, which will provide 800Gbps of internode bandwidth and low latency for up to 3,840 GPUs.
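
To put those figures in context, the short Python sketch below works out the GPU memory per instance and the number of instances behind a full-size cluster. It uses only the numbers quoted above; the function names are illustrative, and the node count assumes the Supercluster is built entirely from four-GPU BM.GPU.L40S.4 instances, which Oracle has not confirmed here.

```python
# Figures quoted in the article for the BM.GPU.L40S.4 instance.
GPUS_PER_NODE = 4
GPU_MEMORY_GB = 48
SUPERCLUSTER_MAX_GPUS = 3840


def node_gpu_memory_gb(gpus: int = GPUS_PER_NODE,
                       per_gpu_gb: int = GPU_MEMORY_GB) -> int:
    """Total GPU memory on one bare-metal instance, in GB."""
    return gpus * per_gpu_gb


def nodes_for_gpus(total_gpus: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Instances needed to host total_gpus, rounded up.

    Assumption: every instance in the cluster has the same GPU count.
    """
    return -(-total_gpus // gpus_per_node)  # ceiling division


if __name__ == "__main__":
    print("GPU memory per instance (GB):", node_gpu_memory_gb())
    print("Instances at maximum scale:", nodes_for_gpus(SUPERCLUSTER_MAX_GPUS))
    print("Aggregate GPU memory (GB):", SUPERCLUSTER_MAX_GPUS * GPU_MEMORY_GB)
```

Under those assumptions, each instance carries 192GB of GPU memory, and a 3,840-GPU cluster would correspond to 960 instances.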

Among its early adopters is Beamr Cloud, a cloud-based video encoding service.

“We chose OCI AI infrastructure with bare-metal instances and Nvidia L40S GPUs for 30 percent more efficient video encoding,” said Sharon Carmel, CEO of Beamr Cloud. “Videos processed with Beamr Cloud on OCI will have up to 50 percent reduced storage and network bandwidth consumption, speeding up file transfers by 2x and increasing productivity for end users. Beamr will provide OCI customers with video AI workflows, preparing them for the future of video.”

The new VM with a single H100 GPU accelerator is set to "come soon", and will provide cost-effective and on-demand access for enterprises with generative AI and HPC workloads.

Oracle's plans to offer the L40S were first revealed in September 2023, with availability then anticipated for early 2024.