Nvidia Vera Rubin First Look — 10x Cheaper Inference, 100% Liquid-Cooled
CNBC gets exclusive access to Nvidia's next-gen AI system: two Rubin GPUs, one Vera CPU, 17,000 components, and a $3.5M–$4M price tag per rack.
The Hardware
CNBC received an exclusive first look at Vera Rubin, Nvidia’s next-generation AI system shipping H2 2026.
The specs:
| Spec | Detail |
|---|---|
| GPUs | 2x Rubin GPUs per system |
| CPU | 1x Vera CPU |
| Components | 17,000 total |
| Cooling | 100% liquid-cooled (Nvidia’s first) |
| Price | $3.5M–$4M per rack (~25% premium over Blackwell) |
| Inference cost | Up to 10x reduction vs. Blackwell |
| Training efficiency | 4x fewer GPUs for MoE models |
Why It Matters
10x Cheaper Inference
This is the headline number. If inference cost drops 10x, it fundamentally changes the economics of deploying AI at scale. Applications that were too expensive to run become viable. Real-time AI processing in more contexts becomes practical.
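A back-of-envelope sketch makes the claim concrete. Every number below except the 10x factor itself is an illustrative assumption, not a figure from Nvidia or CNBC:

```python
# Sketch of what a 10x inference cost reduction implies for a serving bill.
# The per-token price and traffic volume are assumed for illustration.

def serving_cost(tokens_per_day: float, cost_per_million_tokens: float) -> float:
    """Daily serving cost in dollars."""
    return tokens_per_day / 1_000_000 * cost_per_million_tokens

blackwell_rate = 2.00               # assumed $/M tokens on current-gen hardware
rubin_rate = blackwell_rate / 10    # the claimed 10x reduction

daily_tokens = 5e9                  # assumed traffic: 5 billion tokens/day

before = serving_cost(daily_tokens, blackwell_rate)
after = serving_cost(daily_tokens, rubin_rate)
print(f"daily cost: ${before:,.0f} -> ${after:,.0f}")  # daily cost: $10,000 -> $1,000
```

At that scale, a workload that cost $3.65M a year to serve would cost $365K, which is the kind of shift that moves applications from "too expensive" to "viable."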
First 100% Liquid-Cooled System
Traditional data center cooling uses massive amounts of water through evaporative systems. Vera Rubin’s closed-loop liquid cooling reduces water consumption — a significant factor as AI data centers face growing environmental scrutiny.
For context: a recent study estimated AI data centers emitted 32–80 million tonnes of CO2 in 2025, roughly equivalent to the annual emissions of a small European country.
4x Training Efficiency for MoE Models
Mixture-of-Experts (MoE) architectures — used by GPT-4, Nemotron 3, and DeepSeek — are becoming the standard for frontier models. Vera Rubin is specifically optimized for this architecture, requiring 4x fewer GPUs to train equivalent models.
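For readers unfamiliar with the architecture, the core idea of MoE is that a gating network routes each token to only a few "expert" sub-networks, so most of the model's parameters sit idle on any given token. This is a toy illustration of top-k routing, not how production MoE layers are implemented (real layers use learned gates over large tensor batches):

```python
# Toy sketch of Mixture-of-Experts (MoE) top-k routing.
# Experts here are scalar functions; gate scores are assumed values for one token.

def top_k_experts(gate_scores: list[float], k: int = 2) -> list[int]:
    """Pick the k experts with the highest gate scores for one token."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    return ranked[:k]

def moe_forward(x: float, experts, gate_scores: list[float], k: int = 2) -> float:
    """Route a token through its top-k experts and mix their outputs."""
    chosen = top_k_experts(gate_scores, k)
    total = sum(gate_scores[i] for i in chosen)
    # Weighted sum over the chosen experts only; the rest never run,
    # which is why MoE models activate just a fraction of their parameters per token.
    return sum(gate_scores[i] / total * experts[i](x) for i in chosen)

# Four tiny "experts" and a gating distribution for one token:
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]
gate_scores = [0.1, 0.6, 0.05, 0.25]

print(top_k_experts(gate_scores))              # [1, 3]
print(moe_forward(10.0, experts, gate_scores))
```

The sparsity is the point: a hardware generation tuned for this routing pattern can cut the GPU count needed for training, which is the 4x claim above.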
The Price Premium
At $3.5M–$4M per rack, Vera Rubin carries a ~25% premium over Grace Blackwell systems. But the economics work out:
- 10x inference cost reduction means each rack does far more useful work
- 4x training efficiency means fewer racks needed for model development
- Liquid cooling reduces ongoing operational costs
The total cost of ownership likely favors Vera Rubin despite the higher sticker price.
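The capital side of that argument can be sketched in a few lines. Only the ~25% premium and the 10x throughput claim come from the article; the Blackwell rack price and the normalization are illustrative assumptions:

```python
# Rough cost-per-unit-of-work comparison between the two generations.
# Rack prices are assumed; throughput is normalized so Blackwell = 1.0.

def cost_per_unit_work(rack_price: float, relative_throughput: float) -> float:
    """Capital cost per normalized unit of inference work."""
    return rack_price / relative_throughput

blackwell = cost_per_unit_work(rack_price=3.0e6, relative_throughput=1.0)
rubin = cost_per_unit_work(rack_price=3.75e6, relative_throughput=10.0)  # ~25% premium, 10x work

print(f"Blackwell: ${blackwell:,.0f} per unit of work")
print(f"Rubin:     ${rubin:,.0f} per unit of work")
print(f"advantage: {blackwell / rubin:.1f}x")  # advantage: 8.0x
```

Under these assumptions the premium is swamped: paying 25% more for 10x the work is roughly an 8x improvement in capital cost per unit of inference, before counting the cooling-related operational savings.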
Who’s Buying
Meta has already committed to deploying Vera Rubin in its data centers by 2027 — alongside its massive $60B AMD deal. Running both Nvidia and AMD infrastructure simultaneously gives Meta maximum flexibility and negotiating leverage.
The Competitive Landscape
| System | Generation | Shipping |
|---|---|---|
| Nvidia Grace Blackwell | Current | Now |
| AMD Instinct MI450 | Next-gen | H2 2026 |
| Nvidia Vera Rubin | Next-gen | H2 2026 |
| Google TPU v6 | Current | Now |
What to Watch
- Production benchmark comparisons vs. Blackwell
- Liquid cooling infrastructure requirements for existing data centers
- Whether the 10x inference claim holds across diverse workloads
- Pricing impact on cloud AI services (AWS, Azure, GCP)
The next generation of AI infrastructure is taking shape.
Sources: CNBC (exclusive), NVIDIA Newsroom