Los Alamos to power up supercomputer using all-Nvidia CPU, GPU Superchips

HPE-built system to be used by Uncle Sam for materials science, renewables, and more


Nvidia will reveal more details about its Venado supercomputer project today at the International Supercomputing Conference in Hamburg, Germany.

Nvidia hopes Venado will be the first in a wave of high-performance computers built on an all-Nvidia architecture, in this case pairing Grace-Hopper Superchips, which combine CPU and GPU dies, with Grace CPU-only Superchips.

This supercomputer "will be the first system deployed not just with Grace-Hopper in terms of the converged Superchip but it’ll also have a cluster of Grace CPU-only Superchip modules,” Dion Harris, Nvidia’s head of datacenter product marketing for HPC, AI, and Magnum IO, said during an Nvidia press conference ahead of ISC.

The system, built in collaboration with Hewlett Packard Enterprise (HPE) for Los Alamos National Laboratory (LANL), will deliver "10 exaflops of peak AI performance," Nvidia claims.

First teased in early 2021, Venado is designed to accelerate LANL's modeling, simulation, and data analysis in materials science, renewable energy, and energy distribution.

The Register has asked Nvidia to clarify the precision behind that "AI performance" figure, whether it is INT8, FP16, or something else. Traditionally, supercomputer performance numbers are quoted at FP64. The system is nonetheless noteworthy because it points to real-world use cases for the Grace-Hopper CPU-GPU Superchips and opens up an HPC chips-and-systems market that has long been dominated by Intel/Nvidia and, more recently, AMD/Nvidia combinations.
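To see why the precision matters, here is a rough, purely illustrative Python sketch using Nvidia's published per-H100 peak throughput figures; the GPU counts it prints are assumptions made for the sake of the arithmetic, not anything Nvidia or LANL has disclosed about Venado.

    # Back-of-envelope only: Nvidia has not said which precision the 10-exaflop
    # figure assumes, nor how many GPUs Venado will contain. Per-GPU numbers
    # below are Nvidia's published H100 SXM peak rates, in TFLOPS.
    peaks = {
        "FP8 sparse": 3958,   # roughly 4 PFLOPS per GPU
        "FP16 sparse": 1979,  # roughly 2 PFLOPS per GPU
        "FP64 tensor": 67,    # the precision HPC rankings traditionally use
    }
    target = 10_000_000  # 10 exaflops, expressed in TFLOPS

    for precision, per_gpu in peaks.items():
        print(f"{precision}: ~{target / per_gpu:,.0f} H100s to hit 10 exaflops")
    # FP8 sparse:  ~2,527 H100s
    # FP16 sparse: ~5,053 H100s
    # FP64 tensor: ~149,254 H100s

In other words, a "10 exaflops of AI" headline figure is plausible for a modestly sized H100 cluster at low precision, but would imply something far larger if it were an FP64 number.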

Announced at GTC this spring, the Grace-Hopper Superchip is a daughterboard that fuses a 72-core Arm-compatible Grace CPU die with an H100 GPU over the company's 900 GB/s NVLink-C2C interconnect. The Superchip carries 512GB of LPDDR5X DRAM and 80GB of HBM3 memory for the GPU.

For workloads that aren't yet GPU accelerated, Nvidia's Grace CPU-only Superchip swaps the H100 for a second CPU die, giving a total of 144 cores and 1TB of DRAM.

With both flavors of silicon, LANL will have access to a "true heterogeneous environment that will be built on our platform, and will allow them to use the same programming model across both and get optimum performance across not just their GPU-accelerated apps, but that long tail of non-GPU-accelerated apps," Harris said.

Nvidia preps for super year of computing

Venado is far from Nvidia’s only supercomputing project in development. The chipmaker’s CPUs and GPUs are at the heart of several upcoming systems, including the Swiss National Supercomputing Centre’s (CSCS) Alps system.

Announced in early 2021 and built in collaboration with HPE, that big beast will replace Piz Daint as a general-purpose research system. Like Venado, it will use Nvidia's Grace CPUs when it comes online next year.

Despite now having a complete ecosystem of CPU, GPU, and networking tech, Nvidia isn't giving up on x86 just yet. The company is also working with the universities of Tsukuba, Japan; Bristol, England; and the Texas Advanced Computing Center (TACC) to develop a wave of x86-based supercomputers using its Hopper H100 GPUs.

What's more, Eos, the successor to Nvidia's in-house Selene supercomputer, will also use x86 processors from Intel. Based on Nvidia's DGX platform, the system will feature 4,608 H100 GPUs to deliver a claimed 18.4 exaflops of AI computing performance. ®
