
Nvidia Groq $20B Inference Deal | LPX Platform, 3GW OpenAI Commitment at GTC 2026

Nvidia licensed Groq's full LPU patent portfolio for $20 billion on December 24, 2025, the largest deal in its history, and will unveil the LPX inference platform at GTC in March 2026


At a Glance: On December 24, 2025, Nvidia finalized a $20 billion non-exclusive perpetual license for Groq's full LPU patent portfolio and software stack, the largest transaction in Nvidia's history. The deal transferred 80-90% of Groq's workforce, including founder Jonathan Ross and President Sunny Madra. At GTC 2026 in San Jose, Nvidia will unveil the LPX inference platform with 64-256 LPUs per rack. OpenAI has committed 3GW of dedicated inference capacity using the new system.

What Happened | The $20 Billion Christmas Eve Deal

Nvidia finalized a $20 billion non-exclusive perpetual license with Groq on December 24, 2025, gaining access to Groq's full patent portfolio and software stack for inference optimization. The agreement is structured as a license, not an acquisition, meaning Groq continues to operate independently under new leadership with its GroqCloud service.

The deal included transfer of physical assets and roughly 80 to 90 percent of Groq's workforce, including core engineering teams. Key personnel who moved to Nvidia include founder Jonathan Ross and President Sunny Madra. Groq had reached a $6.9 billion valuation after a $750 million funding round in September 2025, just three months before the deal closed.

Deal value: $20 billion (non-exclusive perpetual license)
Closing date: December 24, 2025
Structure: patent license plus workforce transfer, not a full acquisition
Scope: full Groq patent portfolio, software stack, physical assets
Workforce transferred: 80-90% of Groq employees, including the engineering core
Key personnel: Jonathan Ross (founder), Sunny Madra (President)
Groq post-deal: independent operations continue under new leadership (GroqCloud)
Prior Groq valuation: $6.9 billion (September 2025, $750M round)
Nvidia deal rank: largest transaction in company history
Nvidia-Groq deal structure, December 2025

What Is Groq's LPU | Why Nvidia Paid $20 Billion for It

Groq's Language Processing Unit (LPU) employs deterministic execution with large on-chip SRAM (hundreds of megabytes per chip) to eliminate the bandwidth bottlenecks common in GPU-based inference. Unlike GPUs, which rely on off-chip high-bandwidth memory (HBM) that throttles sequential token generation, the LPU keeps all active data on-die.

In public demonstrations prior to the deal, Groq showed 10,000 thought tokens generated in approximately 2 seconds, a throughput rate that outpaced GPU-based inference by an order of magnitude for sequential decode workloads. The architecture is purpose-built for the kind of token-by-token generation that dominates deployed AI applications: chatbots, code assistants, search, and agentic AI systems requiring real-time responses.

Execution model: deterministic, predictable latency (LPU) vs stochastic, variable latency (GPU)
Primary memory: large on-chip SRAM vs off-chip HBM (HBM2e/HBM3)
Memory bottleneck: eliminated (on-die) vs primary constraint (bandwidth-bound)
Token generation speed: ~5,000 tokens/sec demonstrated vs ~200-800 tokens/sec typical
Latency profile: near-constant per token vs variable, increasing with context length
Optimal workload: sequential inference (decode) vs parallel compute (training and prefill)
Power per token: significantly lower vs higher (GPU overhead)
Groq LPU vs traditional GPU architecture for inference workloads
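
The throughput gap in the table falls out of simple bandwidth arithmetic: at batch size 1, generating each token requires streaming every model weight through memory once, so tokens per second is bounded by memory bandwidth divided by model size. The sketch below runs that arithmetic with illustrative numbers only; the 70B-parameter model, the ~3.35 TB/s HBM figure, and the ~80 TB/s aggregate SRAM figure are assumptions for illustration, not vendor specifications, and production GPU serving recovers aggregate throughput through batching.

```python
# A minimal sketch of bandwidth-bound decode, assuming batch size 1:
# every generated token streams all model weights through memory once.
# All numbers are illustrative assumptions, not vendor specifications.

def decode_tokens_per_sec(params_billions: float, bytes_per_param: float,
                          bandwidth_tb_per_s: float) -> float:
    """Rough upper bound on single-stream tokens/sec when memory-bound."""
    weight_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_tb_per_s * 1e12 / weight_bytes

# Hypothetical 70B-parameter model stored in 8-bit weights:
hbm_rate = decode_tokens_per_sec(70, 1.0, 3.35)   # ~3.35 TB/s HBM (H100-class)
sram_rate = decode_tokens_per_sec(70, 1.0, 80.0)  # ~80 TB/s SRAM, many chips

print(f"HBM-bound decode:  ~{hbm_rate:.0f} tokens/sec per stream")   # ~48
print(f"SRAM-bound decode: ~{sram_rate:.0f} tokens/sec per stream")  # ~1143
```

Under these assumptions the single-stream gap is roughly 50 versus 1,100 tokens per second, the order-of-magnitude difference the demonstrations showed. Note that a 70B model's weights fit in SRAM only when sharded across many LPU chips, which is how Groq deploys them.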

LPX Inference Platform | What Nvidia Will Unveil at GTC 2026

The new platform, referred to as LPX in industry analyses, builds on Groq's LPU for dedicated inference racks. Nvidia is expected to reveal the full specifications at its GTC developer conference in San Jose during March 2026.

Base configuration: 64 LPUs per rack
Packaging: 32 RealScale ASIC tiles (2 LPUs per tile)
Scaled configuration: 256 LPUs per rack (4x base)
Target workload: low-latency decode, agentic AI, real-time inference
Integration approach: Groq LPU silicon plus Nvidia networking/software stack
Customer anchor: OpenAI (3GW dedicated inference capacity committed)
Market positioning: complementary to Blackwell GPUs, not a replacement
Reported LPX inference platform specifications, based on industry analyses
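
As a quick check on how those configurations compose, the reported counts reduce to simple multiplication. The per-LPU decode rate below is a hypothetical placeholder, not a reported figure:

```python
# Rack arithmetic from the reported LPX configuration. Tile and LPU counts
# come from the table above; TOKENS_PER_LPU is a hypothetical placeholder.

LPUS_PER_TILE = 2      # reported: 2 LPUs per RealScale ASIC tile
TILES_PER_RACK = 32    # reported: 32 tiles in the base rack
SCALE_FACTOR = 4       # reported: scaled configuration is 4x the base

TOKENS_PER_LPU = 500   # hypothetical per-LPU decode rate, tokens/sec

base_lpus = TILES_PER_RACK * LPUS_PER_TILE   # 64 LPUs per base rack
scaled_lpus = base_lpus * SCALE_FACTOR       # 256 LPUs per scaled rack

for label, n in (("base", base_lpus), ("scaled", scaled_lpus)):
    print(f"{label}: {n} LPUs, ~{n * TOKENS_PER_LPU:,} tokens/sec per rack")
```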

The LPX platform is positioned as complementary to Nvidia's existing Blackwell B300 GPU racks, which dominate training workloads. Jensen Huang has described the strategy as offering “the right silicon for the right workload”: Blackwell for training and prefill, LPX for decode and real-time inference. This mirrors the approach Nvidia took after acquiring Mellanox in 2020, where the acquired networking technology became an accelerator within the broader Nvidia ecosystem rather than a standalone product line.
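
A minimal sketch of what that split could look like at the serving layer, routing compute-heavy prefill to a GPU pool and latency-sensitive decode to an LPU pool. The pool names and routing policy here are hypothetical illustrations, not Nvidia's actual scheduler:

```python
# Hypothetical phase-aware request router: prefill (parallel, compute-heavy)
# goes to a GPU pool, decode (sequential, latency-sensitive) to an LPU pool.
# Pool names and policy are illustrative, not any vendor's real scheduler.

from dataclasses import dataclass
from enum import Enum

class Phase(Enum):
    PREFILL = "prefill"   # process the whole prompt in parallel
    DECODE = "decode"     # generate output tokens one at a time

@dataclass
class Request:
    prompt_tokens: int
    phase: Phase

def route(req: Request) -> str:
    """Pick a hardware pool by inference phase (hypothetical policy)."""
    if req.phase is Phase.PREFILL:
        return "blackwell-gpu-pool"   # throughput-optimized prefill
    return "lpx-lpu-pool"             # latency-optimized decode

print(route(Request(prompt_tokens=4096, phase=Phase.PREFILL)))  # GPU pool
print(route(Request(prompt_tokens=4096, phase=Phase.DECODE)))   # LPU pool
```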

OpenAI's 3GW Commitment | Why It Matters

OpenAI has committed to 3 gigawatts of dedicated inference capacity on the LPX platform, positioning itself as the anchor customer. To contextualize that number: 3GW is roughly the output of three large nuclear reactors, or the average electricity consumption of about 2 million households.
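
Those equivalences hold up to rough arithmetic, using the common rules of thumb of about 1 GW per large nuclear reactor and roughly 1.5 kW of average continuous draw per household:

```python
# Sanity check on the 3GW equivalences using round-number rules of thumb.

COMMITMENT_W = 3e9     # 3 GW of committed inference capacity
REACTOR_W = 1e9        # rough output of one large nuclear reactor
HOUSEHOLD_W = 1.5e3    # rough average continuous draw per household

print(f"Reactor equivalents:   ~{COMMITMENT_W / REACTOR_W:.0f}")          # ~3
print(f"Household equivalents: ~{COMMITMENT_W / HOUSEHOLD_W / 1e6:.0f}M") # ~2M
```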

Prior to the Nvidia-Groq deal, OpenAI had been exploring inference alternatives with both Cerebras and Groq directly. The deal effectively channeled those relationships through Nvidia, giving OpenAI a single vendor for both training (Blackwell) and inference (LPX) hardware.

OpenAI inference commitment: 3GW dedicated capacity
Power equivalence: ~3 nuclear power plants or ~2 million households
Prior OpenAI inference partners: Cerebras (explored), Groq (explored), now Nvidia LPX
Training hardware: Nvidia Blackwell B200/B300 (unchanged)
Combined Nvidia relationship: training and inference from a single vendor
Inference cost driver: each ChatGPT query costs ~10x a Google search in compute
OpenAI inference capacity commitment and power context

Deal Timeline | From Groq Funding to GTC Unveil

September 2025

Groq raises $750M at $6.9B valuation

Series D round values Groq at $6.9 billion, validating LPU architecture and GroqCloud traction.

October-November 2025

Nvidia-Groq negotiations accelerate

Nvidia approaches Groq about a licensing deal as inference demand projections surge beyond GPU capacity.

December 24, 2025

$20 billion licensing deal finalized

Non-exclusive perpetual license signed. Jonathan Ross and Sunny Madra transfer to Nvidia. 80-90% of workforce follows.

January 2026

Nvidia stock declines ~7% over two sessions

Despite record Q4 earnings, the market reacts to the $20 billion outlay and inference market uncertainty.

February 2026

OpenAI commits 3GW inference capacity

OpenAI signs as anchor customer for LPX platform, committing 3 gigawatts of dedicated inference power.

March 2026

GTC 2026 in San Jose

Nvidia expected to unveil full LPX platform specifications, pricing, and deployment timeline at annual developer conference.

Nvidia's Stock Reaction | Why Markets Sold Off Despite Record Earnings

Nvidia's stock declined approximately 7 percent over two trading sessions following the deal announcement, despite the company reporting record quarterly earnings in the same period. The market reaction reflected several concerns.

$20B deal size: largest in Nvidia history; raises questions about capital allocation discipline
Non-exclusive license: Groq retains the right to license the LPU to other chipmakers; Nvidia gets no exclusivity
Inference market uncertainty: unproven at scale whether the LPU outperforms next-gen GPUs long-term
GroqCloud independence: Groq continues competing in cloud inference under new leadership
Integration risk: merging the LPU architecture with Nvidia's software stack (CUDA) is non-trivial
Valuation multiple: $20B for a company valued at $6.9B three months earlier (2.9x premium)
Market concerns driving Nvidia stock decline after Groq deal

Competitive Landscape | Who Else Builds Inference Hardware

The Nvidia-Groq deal neutralized one of the most promising inference-focused competitors, but several others remain. The inference hardware market is fragmenting as the AI industry recognizes that training and inference require fundamentally different chip architectures.

Nvidia (LPX, post-Groq): LPU-based deterministic inference racks, 64-256 LPUs per rack
Nvidia (Blackwell GPUs): GPU-based inference via H100/B200/B300, flexible but less efficient for decode
Cerebras: Wafer-Scale Engine (WSE-3), 900,000 cores per chip, targeting both training and inference
SambaNova: reconfigurable dataflow architecture (SN40L), targets enterprise inference
D-Matrix: digital in-memory computing for inference, early-stage
Qualcomm (Cloud AI 100): ARM-based inference accelerator for edge and cloud
Intel (Gaudi 3): AI accelerator targeting price-competitive inference workloads
AMD (MI300X): 192GB HBM3 GPU, inference-capable but optimized for training
Google (TPU v5e): custom inference-optimized TPU for internal Google workloads
AI inference hardware competitive landscape, early 2026

Training vs Inference | Why the AI Industry Is Splitting

The AI hardware market is undergoing a structural shift. Through the first phase of the deep learning era (2018-2023), nearly all compute spending went to training, that is, building models. But as models reach production scale, inference (running those models to answer queries) has become the dominant cost center.

Compute pattern: massively parallel matrix math (training) vs sequential token generation (inference)
Memory access: batch-optimized, high throughput vs latency-sensitive, per-token
Cost scaling: fixed (train once) vs variable (scales with users and queries)
Hardware optimized for it: GPUs (A100, H100, B200) vs LPUs, TPUs, custom ASICs
Revenue model: one-time capex vs per-query opex (cost-per-token)
Market share 2026 (est.): ~40% of AI compute spend vs ~60% of AI compute spend
Structural differences between AI training and inference workloads
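
The "fixed vs variable" row is the economic crux, and a toy model makes it concrete. Every figure below is a hypothetical placeholder chosen only to show how per-query opex eventually dwarfs one-time training capex:

```python
# Toy capex-vs-opex model: training is paid once, inference spend scales
# with query volume. All dollar figures are hypothetical placeholders.

TRAIN_COST = 100e6        # hypothetical one-time training cost, dollars
COST_PER_QUERY = 0.002    # hypothetical inference cost per query, dollars
QUERIES_PER_DAY = 200e6   # hypothetical daily query volume

daily_inference_spend = COST_PER_QUERY * QUERIES_PER_DAY   # $400k/day
breakeven_days = TRAIN_COST / daily_inference_spend        # ~250 days

print(f"Daily inference spend: ${daily_inference_spend / 1e6:.1f}M")
print(f"Inference passes the training bill after ~{breakeven_days:.0f} days")
```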

This is why Nvidia paid $20 billion for Groq's technology: the company that dominates training (Nvidia holds >80% GPU market share) recognized it needed a fundamentally different architecture to win the inference side. The $4 billion photonics deals with Lumentum and Coherent address the networking bottleneck; the Groq deal addresses the compute bottleneck for inference specifically.

“Nvidia buying Groq's inference tech is the clearest signal yet that GPUs alone cannot serve the next billion AI queries. The company that built the training era just admitted it needs different silicon for the inference era.”


Filed under

#Nvidia · #Groq · #GTC 2026 · #AI Inference · #LPU · #OpenAI · #Jensen Huang


Written by ObjectWire Technology Desk · Technology Reporter