1. Partnership Context | 10 Years of Full-Stack Co-Engineering
NVIDIA and Google Cloud have been co-engineering a full-stack AI platform for more than a decade. The collaboration is not a standard reseller agreement. It operates across every technology layer: performance-optimized libraries, AI frameworks, system-level firmware, and enterprise cloud services. The goal has always been to ensure that what NVIDIA ships in silicon and software runs optimally on Google Cloud's infrastructure without requiring customers to manage the integration themselves.
That foundation is now being extended to two of the most consequential AI categories in 2026: agentic AI, which refers to autonomous systems capable of managing complex multi-step workflows without human-in-the-loop direction at every step, and physical AI, which covers robots, autonomous machines, and digital twins operating in factory and industrial environments.
The partnership's latest expansion was announced publicly this week at the Google Cloud Next conference in Las Vegas. The announcements fall into four discrete categories: next-generation GPU infrastructure, edge cloud AI access, confidential computing, and agentic AI frameworks.
BY THE NUMBERS
- 80,000: single-site GPU maximum
- 960,000: multisite GPU maximum
- 10x: lower inference cost per token
- 10x: higher token throughput per megawatt
2. A5X Instances | Vera Rubin NVL72, 960,000-GPU Multisite Scale
The headline infrastructure announcement is the A5X bare-metal instance, powered by the NVIDIA Vera Rubin NVL72 rack-scale system. This is the first time Google Cloud has offered Vera Rubin-architecture GPUs to customers, and the scale it enables is unprecedented in the cloud compute market.
A5X | Technical Specifications
- GPU: NVIDIA Vera Rubin NVL72
- NIC: NVIDIA ConnectX-9 SuperNICs
- Networking: Next-generation Google Virgo
- Single-site scale: Up to 80,000 Rubin GPUs
- Multisite scale: Up to 960,000 Rubin GPUs
- Instance type: Bare-metal
Performance Gains vs. Prior Generation
- Inference cost per token: 10x lower
- Token throughput per megawatt: 10x higher
- Architecture approach: Extreme co-design across chips, systems, and software
- Target workloads: Frontier model training, large-scale inference, agentic and physical AI
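The two stated 10x figures compound into the economics that matter for operators: cost per token falls while tokens produced per megawatt rise. A quick illustrative calculation (the baseline numbers below are hypothetical placeholders, not disclosed NVIDIA or Google figures):

```python
# Illustrative arithmetic for the stated 10x generational gains.
# Baseline values are hypothetical, chosen only to show the shape
# of the improvement; they are not published benchmark numbers.

def next_gen_cost_per_token(prior_cost: float, factor: float = 10.0) -> float:
    """Cost per token after the stated factor-of-N reduction."""
    return prior_cost / factor

def next_gen_tokens_per_mw(prior_throughput: float, factor: float = 10.0) -> float:
    """Token throughput per megawatt after the stated factor-of-N gain."""
    return prior_throughput * factor

prior_cost = 1.0          # hypothetical: $1.00 per million tokens
prior_throughput = 5.0e9  # hypothetical: 5B tokens per megawatt-hour

print(next_gen_cost_per_token(prior_cost))      # 0.1
print(next_gen_tokens_per_mw(prior_throughput)) # 5e+10
```

The same power envelope therefore serves an order of magnitude more tokens at an order of magnitude lower unit cost, which is the argument underlying the "AI factory" framing later in this piece.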
The NVIDIA ConnectX-9 SuperNICs, combined with Google's next-generation Virgo networking fabric, are what make the 960,000-GPU multisite figure achievable. The networking layer is the constraint at that scale; both companies have invested in eliminating it. Customers running the largest AI model training and inference workloads — including frontier model labs and enterprises deploying production agentic systems — will have access to infrastructure that was not commercially available at this scale 12 months ago.
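The announced maximums can be sanity-checked against the NVL72 form factor, which packs 72 GPUs per rack-scale system. The rack and site counts below follow directly from the figures above, assuming sites are filled to the single-site maximum:

```python
# Back-of-envelope rack/site math from the announced A5X maximums.
GPUS_PER_NVL72_RACK = 72       # 72 GPUs per NVL72 rack-scale system
SINGLE_SITE_MAX_GPUS = 80_000  # announced single-site maximum
MULTISITE_MAX_GPUS = 960_000   # announced multisite maximum

racks_per_site = SINGLE_SITE_MAX_GPUS / GPUS_PER_NVL72_RACK
sites_at_max = MULTISITE_MAX_GPUS // SINGLE_SITE_MAX_GPUS

print(round(racks_per_site), sites_at_max)  # 1111 12
```

In other words, a maxed-out single site is on the order of 1,100 NVL72 racks, and the multisite figure corresponds to a dozen such sites stitched together over the Virgo fabric.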
For context on NVIDIA's current GPU roadmap beyond Vera Rubin, see ObjectWire's coverage of the Blackwell B300 data center demand surge and the NVIDIA hub.
3. Gemini on Google Distributed Cloud | NVIDIA Blackwell at the Edge
The second major announcement is a preview of Google Gemini running on Google Distributed Cloud (GDC), powered by both NVIDIA Blackwell and NVIDIA Blackwell Ultra GPUs. GDC is Google's on-premises and edge cloud product, designed for customers who cannot move workloads to public cloud infrastructure due to data residency requirements, air-gap security mandates, or latency constraints.
Putting Gemini on GDC hardware backed by Blackwell GPUs means regulated industries — defense, healthcare, financial services, and government — can run Google's flagship AI models on infrastructure they physically control, without a network dependency on Google Cloud's US data centers. This is a significant expansion of where Gemini can operate, and it directly competes with Microsoft's Azure Government and AWS GovCloud offerings in the secure-enclave AI market.
For deeper coverage of Google's AI model strategy, see ObjectWire's analysis of Gemini Embedding 2 and the Google hub.
4. Confidential VMs | NVIDIA Blackwell GPUs in Secure Enclaves
The third announcement introduces confidential VMs with NVIDIA Blackwell GPUs on Google Cloud. Confidential computing refers to hardware-enforced isolation of data in use — meaning that neither the cloud provider nor other tenants on shared infrastructure can access the memory or computation state of a running workload.
Adding Blackwell GPU support to confidential VMs extends that protection to GPU-accelerated AI workloads for the first time at this tier. The practical implication: enterprises processing sensitive data through AI pipelines — medical imaging, proprietary financial models, classified text analysis — can now run those workloads on Blackwell GPU instances with cryptographic assurance that the cloud operator cannot inspect the computation.
This capability is directly relevant to the HIPAA, FedRAMP, and EU AI Act compliance contexts that have slowed enterprise AI adoption in regulated verticals. It removes a class of objections entirely.
5. Agentic AI Stack | Gemini Enterprise Agent Platform, Nemotron, NeMo
The fourth announced capability is the one with the broadest product surface: agentic AI on the Gemini Enterprise Agent Platform, built with NVIDIA Nemotron open models and the NVIDIA NeMo framework.
This is a production-ready stack for building, training, fine-tuning, and deploying AI agents that can manage complex enterprise workflows. The combination means customers are not locked into either Google's or NVIDIA's model ecosystem exclusively. Gemini serves as the orchestration and reasoning layer via Google's Agent Platform, while Nemotron open models provide an alternative inference path for customers who want task-specific open-weight models rather than Gemini's closed API. NeMo supplies the fine-tuning and deployment framework.
Gemini Enterprise Agent Platform
Google's managed platform for building and deploying multi-agent systems. Handles orchestration, tool use, memory, and task routing across agent networks.
NVIDIA Nemotron Open Models
NVIDIA's family of open-weight language models, optimized for enterprise reasoning tasks. Provides a non-proprietary inference alternative to Gemini for cost-sensitive or compliance-constrained deployments.
NVIDIA NeMo Framework
Training, fine-tuning, and deployment framework for large language models and multimodal models. Enables customers to customize Nemotron models on proprietary data before deploying to production agents.
The physical AI extension of this stack targets robotics and digital twin deployments specifically. NVIDIA's existing ecosystem for physical AI, including Isaac for robotics simulation and Omniverse for digital twins, connects to this cloud stack, allowing developers to train robot control policies in simulation and deploy them to physical hardware via Google Cloud's infrastructure.
For broader context on where Jensen Huang sees AI agents taking enterprise software value, see ObjectWire's coverage of Jensen Huang's AI agent thesis. For Google's broader agentic strategy, see the Google Agentic Vision analysis.
6. What This Means | AI Factory Model at Hyperscale
Taken together, the four announcements describe a single architectural vision: the AI factory. The term, which Donaldson has used and which appears in the NVIDIA blog post authored by Ian Buck, refers to the idea that AI production infrastructure should be purpose-built and vertically integrated, the same way semiconductor fabs are purpose-built for chip production. Everything is optimized for AI throughput, from the GPU silicon to the networking fabric to the software stack on top.
The A5X/Vera Rubin instances form the raw compute layer. Confidential VMs extend it to regulated workloads. GDC + Blackwell extends it to edge and on-premises deployments. The Gemini + Nemotron + NeMo stack provides the software abstractions to actually build and run agents on top. The full picture is a production-grade AI platform that spans cloud, edge, and on-premises, with no layer requiring a custom integration from the customer.
Follow ObjectWire's NVIDIA hub and Google hub for continuing coverage of this partnership and the broader AI infrastructure buildout.