Nvidia Vera Rubin First Look — 10x Cheaper Inference, 100% Liquid-Cooled
CNBC gets exclusive access to Nvidia's next-gen AI system: two Rubin GPUs, one Vera CPU, 17,000 components, and a $3.5M–$4M price tag per rack.
The Hardware
CNBC received an exclusive first look at Vera Rubin, Nvidia’s next-generation AI system shipping H2 2026.
The specs:
| Spec | Detail |
|---|---|
| GPUs | 2x Rubin GPUs per system |
| CPU | 1x Vera CPU |
| Components | 17,000 total |
| Cooling | 100% liquid-cooled (Nvidia’s first) |
| Price | $3.5M–$4M per rack (~25% premium over Blackwell) |
| Inference cost | Up to 10x reduction vs. Blackwell |
| Training efficiency | 4x fewer GPUs for MoE models |
Why It Matters
10x Cheaper Inference
This is the headline number. If inference cost drops 10x, it fundamentally changes the economics of deploying AI at scale. Applications that were too expensive to run become viable. Real-time AI processing in more contexts becomes practical.
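A back-of-envelope sketch makes the claim concrete. Every number below except the 10x factor itself is an illustrative assumption, not a figure from Nvidia or CNBC:

```python
# Sketch of what a 10x inference cost reduction implies for a serving bill.
# The per-token price and traffic volume are assumed for illustration.

def serving_cost(tokens_per_day: float, cost_per_million_tokens: float) -> float:
    """Daily serving cost in dollars."""
    return tokens_per_day / 1_000_000 * cost_per_million_tokens

blackwell_rate = 2.00               # assumed $/M tokens on current-gen hardware
rubin_rate = blackwell_rate / 10    # the claimed 10x reduction

daily_tokens = 5e9                  # assumed traffic: 5 billion tokens/day

before = serving_cost(daily_tokens, blackwell_rate)
after = serving_cost(daily_tokens, rubin_rate)
print(f"daily cost: ${before:,.0f} -> ${after:,.0f}")  # daily cost: $10,000 -> $1,000
```

At that scale, a workload that cost $3.65M a year to serve would cost $365K, which is the kind of shift that moves applications from "too expensive" to "viable."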
First 100% Liquid-Cooled System
Traditional data center cooling uses massive amounts of water through evaporative systems. Vera Rubin’s closed-loop liquid cooling reduces water consumption — a significant factor as AI data centers face growing environmental scrutiny.
For context: a recent study estimated AI data centers emitted 32–80 million tonnes of CO2 in 2025, roughly equivalent to the annual emissions of a small European country.
4x Training Efficiency for MoE Models
Mixture-of-Experts (MoE) architectures — used by GPT-4, Nemotron 3, and DeepSeek — are becoming the standard for frontier models. Vera Rubin is specifically optimized for this architecture, requiring 4x fewer GPUs to train equivalent models.
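For readers unfamiliar with the architecture, the core idea of MoE is that a gating network routes each token to only a few "expert" sub-networks, so most of the model's parameters sit idle on any given token. This is a toy illustration of top-k routing, not how production MoE layers are implemented (real layers use learned gates over large tensor batches):

```python
# Toy sketch of Mixture-of-Experts (MoE) top-k routing.
# Experts here are scalar functions; gate scores are assumed values for one token.

def top_k_experts(gate_scores: list[float], k: int = 2) -> list[int]:
    """Pick the k experts with the highest gate scores for one token."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    return ranked[:k]

def moe_forward(x: float, experts, gate_scores: list[float], k: int = 2) -> float:
    """Route a token through its top-k experts and mix their outputs."""
    chosen = top_k_experts(gate_scores, k)
    total = sum(gate_scores[i] for i in chosen)
    # Weighted sum over the chosen experts only; the rest never run,
    # which is why MoE models activate just a fraction of their parameters per token.
    return sum(gate_scores[i] / total * experts[i](x) for i in chosen)

# Four tiny "experts" and a gating distribution for one token:
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x / 2]
gate_scores = [0.1, 0.6, 0.05, 0.25]

print(top_k_experts(gate_scores))              # [1, 3]
print(moe_forward(10.0, experts, gate_scores))
```

The sparsity is the point: a hardware generation tuned for this routing pattern can cut the GPU count needed for training, which is the 4x claim above.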
The Price Premium
At $3.5M–$4M per rack, Vera Rubin carries a ~25% premium over Grace Blackwell systems. But the economics work out:
- 10x inference cost reduction means each rack does far more useful work
- 4x training efficiency means fewer racks needed for model development
- Liquid cooling reduces ongoing operational costs
The total cost of ownership likely favors Vera Rubin despite the higher sticker price.
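The capital side of that argument can be sketched in a few lines. Only the ~25% premium and the 10x throughput claim come from the article; the Blackwell rack price and the normalization are illustrative assumptions:

```python
# Rough cost-per-unit-of-work comparison between the two generations.
# Rack prices are assumed; throughput is normalized so Blackwell = 1.0.

def cost_per_unit_work(rack_price: float, relative_throughput: float) -> float:
    """Capital cost per normalized unit of inference work."""
    return rack_price / relative_throughput

blackwell = cost_per_unit_work(rack_price=3.0e6, relative_throughput=1.0)
rubin = cost_per_unit_work(rack_price=3.75e6, relative_throughput=10.0)  # ~25% premium, 10x work

print(f"Blackwell: ${blackwell:,.0f} per unit of work")
print(f"Rubin:     ${rubin:,.0f} per unit of work")
print(f"advantage: {blackwell / rubin:.1f}x")  # advantage: 8.0x
```

Under these assumptions the premium is swamped: paying 25% more for 10x the work is roughly an 8x improvement in capital cost per unit of inference, before counting the cooling-related operational savings.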
Who’s Buying
Meta has already committed to deploying Vera Rubin in its data centers by 2027 — alongside its massive $60B AMD deal. Running both Nvidia and AMD infrastructure simultaneously gives Meta maximum flexibility and negotiating leverage.
The Competitive Landscape
| System | Generation | Shipping |
|---|---|---|
| Nvidia Grace Blackwell | Current | Now |
| AMD Instinct MI450 | Next-gen | H2 2026 |
| Nvidia Vera Rubin | Next-gen | H2 2026 |
| Google TPU v6 | Current | Now |
What to Watch
- Production benchmark comparisons vs. Blackwell
- Liquid cooling infrastructure requirements for existing data centers
- Whether the 10x inference claim holds across diverse workloads
- Pricing impact on cloud AI services (AWS, Azure, GCP)
The next generation of AI infrastructure is taking shape.
Sources: CNBC (exclusive), NVIDIA Newsroom