Google announced a number of artificial intelligence-focused tools and services for its cloud platform.
The company said that the Cloud TPU v5e is now available in preview, the latest in its in-house Tensor Processing Unit line. Compared to TPU v4, which was released back in 2021, Google says the chip delivers up to two times the training performance per dollar and up to 2.5 times the inference performance per dollar for large language models and generative AI models.
The new TPU will be available in eight different virtual machine configurations, ranging from one TPU chip to more than 250 within a single slice. For those needing more compute, the company is rolling out 'Multislice,' a way to scale models across tens of thousands of TPU chips.
"Until now, training jobs using TPUs were limited to a single slice of TPU chips, capping the size of the largest jobs at a maximum slice size of 3,072 chips for TPU v4," Google's VP of ML, systems, and cloud AI Amin Vahdat and VP of compute and ML infrastructure Mark Lohmeyer said in a joint blog post.
"With Multislice, developers can scale workloads up to tens of thousands of chips over inter-chip interconnect (ICI) within a single pod, or across multiple pods over a data center network (DCN)."
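For readers wondering what scaling "over ICI within a single pod, or across multiple pods over a DCN" looks like from the software side, here is a minimal JAX sketch of the data-parallel collective involved. This is illustrative only, not Google's actual training stack: the toy gradient, axis name, and device counts are assumptions, and on a laptop it simply runs over whatever local devices JAX can see.

```python
# A rough sketch (not Google's actual stack) of the data-parallel pattern that
# Multislice scales up: each chip computes a local gradient, then an all-reduce
# averages the gradients across chips. On a Multislice TPU deployment, the JAX
# runtime carries this collective over ICI within a slice and over the data
# center network between slices; the program itself is unchanged.
import jax
import jax.numpy as jnp

n = jax.local_device_count()  # one CPU device here; many chips on a TPU pod

# Toy per-device "gradient" g = 2x, averaged across devices with an all-reduce.
mean_grad = jax.pmap(lambda x: jax.lax.pmean(2.0 * x, axis_name="i"),
                     axis_name="i")

x = jnp.arange(n, dtype=jnp.float32)  # one scalar shard per device
out = mean_grad(x)                    # every device ends up holding the mean
```

The point of the hardware story is that this collective, not the per-chip math, is what limits job size: Multislice lets the same all-reduce span chips that no single slice's interconnect could reach.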
Alongside the new TPUs, Google said that A3 virtual machines (VMs) will be generally available next month, each featuring eight Nvidia H100 GPUs, dual 4th Gen Intel Xeon Scalable processors, and 2TB of memory. The instances were originally announced in May, and Google says deployments can scale to 26,000 Nvidia H100 'Hopper' GPUs - although it's not clear how many H100s Google will have, given the ongoing GPU shortage.
"We’re excited to be working with Google Cloud, with whom we have been collaborating to efficiently train, deploy, and share our models," Anthropic co-founder Tom Brown said. "Google’s next-generation AI infrastructure powered by A3 and TPU v5e with Multislice will bring price-performance benefits for our workloads as we continue to build the next wave of AI."