A month after detailing the custom-made Tensor Processing Units (TPUs) Google uses for machine learning workloads, the company has launched the second generation of its in-house chips.
While the original chips were limited to the comparatively simpler task of inference, this new family is also designed for training machine learning models (creating them as well as executing them).
Your move, Nvidia
“Each of these new TPU devices delivers up to 180 teraflops of floating-point performance,” explained a blog post written by Jeff Dean, Google senior fellow, and Urs Hölzle, SVP of Google Cloud infrastructure.
“As powerful as these TPUs are on their own, though, we designed them to work even better together. Each TPU includes a custom high-speed network that allows us to build machine learning supercomputers we call ‘TPU pods.’
“A TPU pod contains 64 second-generation TPUs and provides up to 11.5 petaflops to accelerate the training of a single large machine learning model. That’s a lot of computation!”
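The quoted pod figure is consistent with the per-device number. As an illustrative sanity check (this is simple arithmetic on the published specs, not anything Google released):

```python
# Sanity-check the quoted figures: 64 second-generation TPUs
# at up to 180 teraflops each per pod.
per_device_tflops = 180
devices_per_pod = 64

pod_tflops = per_device_tflops * devices_per_pod  # 11,520 teraflops
pod_pflops = pod_tflops / 1_000                   # convert to petaflops

print(f"TPU pod peak: {pod_pflops} petaflops")
```

Rounded to one decimal place, 11.52 petaflops matches the "up to 11.5 petaflops" Google quotes for a full pod.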
The company claims that one of its large-scale translation models used to take a full day to train “on 32 of the best commercially-available GPUs,” but can now train to the same level of accuracy “in an afternoon using just one eighth of a TPU pod.”
Google said it will bring the new TPUs to Google Compute Engine as Cloud TPUs, where they can be used alongside existing hardware such as Skylake CPUs and Nvidia GPUs. Pricing for the TPUs has yet to be announced.
One thousand of the new devices will be made available to machine learning researchers via the TensorFlow Research Cloud initiative. A total of 180 petaflops of raw compute power will be given to those willing to share their research on Google's open source software library with the wider community.
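The 180-petaflop figure for the research cloud likewise follows directly from the per-device spec. A quick check (illustrative arithmetic only, not an official calculation):

```python
# 1,000 second-generation TPUs at up to 180 teraflops each.
devices = 1_000
per_device_tflops = 180

total_pflops = devices * per_device_tflops / 1_000  # teraflops -> petaflops

print(f"Research cloud total: {total_pflops} petaflops")
```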
“They should also be willing to share detailed feedback with Google to help us improve the TFRC program and the underlying Cloud TPU platform over time,” Google said.