Google unveils TPU v5p pods to accelerate AI training

Need a lot of compute? How does 8,960 TPUs sound?


Google has revealed a performance-optimized version of its Tensor Processing Unit (TPU), dubbed the v5p, designed to cut the time it takes to train large language models.

The chip builds on the TPU v5e announced earlier this year. But while that chip was billed as Google's most "cost-efficient" AI accelerator, its TPU v5p is designed to push more FLOPS and scale to even larger clusters.

Google has relied on its custom TPUs, which are essentially just big matrix math accelerators, for several years now to power the growing number of machine learning features baked into its web products like Gmail, Google Maps, and YouTube. More recently, however, Google has started opening its TPUs up to the public to run AI training and inference jobs.
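
To make "big matrix math accelerator" concrete, here's a minimal JAX sketch of the bfloat16 matrix multiply TPUs are built around. Nothing in it is v5p-specific; JAX will run it on whatever backend it finds, TPU or otherwise:

```python
# Minimal sketch: the bfloat16 matrix multiply at the heart of TPU workloads.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)

# Two large matrices in bfloat16, the precision behind the v5p's headline FLOPS figure.
a = jax.random.normal(k1, (4096, 4096), dtype=jnp.bfloat16)
b = jax.random.normal(k2, (4096, 4096), dtype=jnp.bfloat16)

# jit compiles the matmul through XLA, the same compiler stack that targets TPUs.
matmul = jax.jit(lambda x, y: x @ y)
c = matmul(a, b)
print(c.shape, c.dtype, jax.devices()[0].platform)
```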

According to Google, the TPU v5p is its most powerful yet, capable of pushing 459 teraFLOPS of bfloat16 performance or 918 teraOPS of Int8. That's backed by 95GB of high-bandwidth memory capable of moving data at 2.76 TB/s.
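
As a back-of-the-envelope check on those specs, dividing peak compute by memory bandwidth gives the arithmetic intensity a workload needs before the chip stops being memory-bound. A rough sketch using only the figures above:

```python
# Rough roofline arithmetic from the quoted v5p specs.
peak_bf16_flops = 459e12   # 459 teraFLOPS, bfloat16
peak_int8_ops = 918e12     # 918 teraOPS, Int8 (exactly 2x the bf16 rate)
hbm_bandwidth = 2.76e12    # 2.76 TB/s
hbm_capacity = 95e9        # 95 GB

# FLOPs per byte needed before HBM bandwidth stops being the bottleneck.
print(f"bf16 break-even intensity: ~{peak_bf16_flops / hbm_bandwidth:.0f} FLOPs/byte")  # ~166

# Time to stream all of HBM once -- a floor for bandwidth-bound steps.
print(f"full HBM sweep: ~{hbm_capacity / hbm_bandwidth * 1e3:.1f} ms")  # ~34.4 ms
```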

As many as 8,960 v5p accelerators can be coupled together in a single pod using Google's 600 GB/s inter-chip interconnect to train models faster or at greater precision. For reference, that's 35x the maximum pod size of the TPU v5e and more than twice that of the TPU v4.
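
Those ratios are easy to sanity-check, assuming the commonly cited maximums of 256 chips per v5e pod and 4,096 per v4 pod; neither figure appears in the announcement itself:

```python
# Sanity-checking the pod-scaling claims.
# 256 (v5e) and 4,096 (v4) are the commonly cited per-pod maximums,
# not numbers from this announcement.
v5p_pod, v5e_pod, v4_pod = 8960, 256, 4096

print(f"v5p vs v5e: {v5p_pod / v5e_pod:.0f}x")  # 35x, matching Google's claim
print(f"v5p vs v4:  {v5p_pod / v4_pod:.2f}x")   # ~2.19x, i.e. "more than twice"
```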

Google claims the new accelerator can train popular large language models like OpenAI's 175-billion-parameter GPT-3 1.9x faster than its older TPU v4 parts when using BF16, and up to 2.8x faster if you're willing to drop floating point for 8-bit integer calculations.
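
Dropping floating point for 8-bit integers means quantizing: scale tensors into the int8 range, multiply in int8 with a wider accumulator, then rescale. The hand-rolled toy below illustrates the idea only; it's not how Google's stack does it, and production int8 training uses far more careful scaling:

```python
# Toy int8 matmul: quantize, multiply in int8, accumulate in int32, rescale.
import jax
import jax.numpy as jnp

def quantize(x):
    # Per-tensor symmetric quantization into [-127, 127].
    scale = jnp.max(jnp.abs(x)) / 127.0
    return jnp.round(x / scale).astype(jnp.int8), scale

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
a = jax.random.normal(k1, (512, 512))
b = jax.random.normal(k2, (512, 512))

qa, sa = quantize(a)
qb, sb = quantize(b)

# int8 inputs with int32 accumulation -- the kind of op the teraOPS figure counts.
acc = jax.lax.dot(qa, qb, preferred_element_type=jnp.int32)
approx = acc.astype(jnp.float32) * sa * sb

err = jnp.max(jnp.abs(approx - a @ b))
print(f"max abs error vs float matmul: {float(err):.3f}")
```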

This greater performance and scalability does come at a cost. Each TPU v5p accelerator will run you $4.20 an hour, compared to $3.22 an hour for the TPU v4 or $1.20 an hour for the TPU v5e. So if you're not in a rush to train or refine your model, Google's efficiency-focused v5e chips still offer better bang for your buck.
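
Putting list prices against peak throughput makes the trade-off clearer. A rough perf-per-dollar sketch, using Google's published peak bf16 figures of 275 teraFLOPS for the v4 and 197 teraFLOPS for the v5e; neither number is quoted in this piece:

```python
# Dollars per peak petaFLOP-hour of bf16 compute, from on-demand list prices.
# The v4 and v5e peak figures are Google's published specs, not from this article.
chips = {
    "v5p": (4.20, 459e12),
    "v4":  (3.22, 275e12),
    "v5e": (1.20, 197e12),
}
for name, (price, flops) in chips.items():
    print(f"{name}: ${price / (flops / 1e15):.2f} per peak petaFLOP-hour")
# v5e comes out cheapest per unit of peak compute, as the article suggests.
```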

Along with the new hardware, Google has introduced the concept of an "AI hypercomputer." The cloud provider describes it as a supercomputing architecture that employs a close-knit system of hardware, software, ML frameworks, and consumption models.

"Traditional methods often tackle demanding AI workloads through piecemeal, component-level enhancements, which can lead to inefficiencies and bottlenecks," Mark Lohmeyer, VP of Google's compute and ML infrastructure division, explained in a blog post Wednesday. "In contrast, AI hypercomputer employs system-level codesign to boost efficiency and productivity across AI training, tuning, and serving."

In other words, a hypercomputer is a system in which any variable, hardware or software, that could lead to performance inefficiencies is controlled and optimized.

Google's new hardware and AI supercomputing architecture debuted alongside Gemini, a multi-modal large language model capable of handling text, images, video, audio, and code.
