SANTA CLARA, CA — On Tuesday, March 10, 2026, Jim Keller's Tenstorrent delivered the most credible challenge yet to Nvidia's dominance of the AI workstation market. The company unveiled the TT-QuietBox 2 — a liquid-cooled desktop AI powerhouse built on the RISC-V architecture, packaged with a fully open-source software stack and priced at $9,999.
That price point is nearly $2,000 cheaper than the original QuietBox, signaling a deliberate push to move RISC-V AI hardware from niche research labs into the mainstream developer's office. The QuietBox 2 is the first system of its kind to deliver teraflop-class local LLM inference without relying on a traditional GPU — and without locking developers into Nvidia's proprietary CUDA ecosystem.
Breaking the CUDA Lock
The QuietBox 2's most strategically significant feature is not its performance — it is its independence. CUDA, Nvidia's proprietary parallel computing platform, has been the de facto standard for AI model training and inference since the first deep learning boom. The result is a decade-long lock-in: most AI frameworks, model checkpoints, and production pipelines are optimized specifically for CUDA, making migration to alternative hardware extremely costly even when alternatives exist.
Tenstorrent's RISC-V approach offers a fundamentally different value proposition. Because RISC-V is an open instruction set architecture, developers have full visibility — and full control — from the compiler layer down to the kernel. Tenstorrent describes this as transparency "from compiler to kernel," enabling engineers to optimize AI models at a granular hardware level that proprietary GPU architectures actively prevent.
The Open-Source Software Edge: TT-Buda & TT-Metalium
Tenstorrent's software strategy is built on two open-source components:
TT-Buda — The Compiler
TT-Buda is Tenstorrent's open-source AI compiler suite. It handles the translation of high-level model definitions (PyTorch, JAX, ONNX) into optimized instruction sequences for the RISC-V hardware. Because the compiler is open source, developers can inspect, modify, and contribute optimization passes — something impossible with Nvidia's proprietary TensorRT.
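The value of an inspectable compiler is easiest to see with a concrete pass. The toy sketch below is not TT-Buda's actual API or IR (the graph format and function names here are invented for illustration); it shows the kind of optimization pass, here a matmul-plus-bias fusion, that developers can read and modify when the compiler is open source.

```python
# Toy sketch of an inspectable optimization pass, in the spirit of an open
# compiler like TT-Buda. NOT TT-Buda's real API: the graph format and names
# below are invented for illustration only.

# A model graph as a list of (op, inputs) pairs, e.g. from a framework export.
graph = [
    ("matmul", ["x", "w"]),
    ("add",    ["matmul_out", "b"]),
    ("relu",   ["add_out"]),
]

def fuse_matmul_add(ops):
    """Fuse a matmul immediately followed by a bias add into one fused op."""
    fused = []
    i = 0
    while i < len(ops):
        if (i + 1 < len(ops)
                and ops[i][0] == "matmul"
                and ops[i + 1][0] == "add"):
            # Combine the two nodes: the fused op takes matmul's inputs
            # plus the bias operand from the add.
            fused.append(("matmul_add", ops[i][1] + [ops[i + 1][1][-1]]))
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

print(fuse_matmul_add(graph))
# [('matmul_add', ['x', 'w', 'b']), ('relu', ['add_out'])]
```

In a closed compiler this pass would be invisible; here a developer can step through it, disable it, or contribute a better one upstream.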
TT-Metalium — The Kernel Suite
TT-Metalium provides the low-level kernel primitives that execute directly on the Tenstorrent silicon. The suite's open-source nature means that debugging complex model behavior — tracing a numerical instability, for example, or profiling a memory bandwidth bottleneck — can be done with full hardware visibility rather than relying on opaque vendor profiling tools.
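The kind of debugging described above, catching a numerical instability mid-pipeline, can be sketched with a toy activation tracer. This does not use TT-Metalium's real APIs (every name here is invented); it only illustrates what per-layer visibility buys you compared with a black-box runtime.

```python
import math

# Hypothetical sketch: with open kernels, a developer can instrument every
# stage's output rather than treating the runtime as a black box. This toy
# tracer flags NaN/Inf in intermediate activations. It does NOT use
# TT-Metalium's real APIs; all names are invented for illustration.

def trace_activations(layers, x):
    """Run `layers` (callables on scalars) in sequence, recording stats."""
    report = []
    for i, layer in enumerate(layers):
        x = [layer(v) for v in x]
        bad = any(math.isnan(v) or math.isinf(v) for v in x)
        report.append({"layer": i,
                       "max_abs": max(abs(v) for v in x),
                       "unstable": bad})
    return x, report

# A pipeline whose second stage overflows to infinity for large inputs.
layers = [
    lambda v: v * 1e200,   # scale up aggressively
    lambda v: v * v,       # squaring overflows float64 for |v| > ~1e154
    lambda v: v + 1.0,
]
_, report = trace_activations(layers, [0.5, 2.0])
print([r["unstable"] for r in report])  # [False, True, True]
```

The report pinpoints the first unstable layer, which is exactly the triage step that opaque vendor profilers make slow.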
120 Billion Parameters — Locally, Silently
The base configuration of the QuietBox 2 supports running models with up to 120 billion parameters entirely locally — covering specialized variants of Llama 3, Grok-1, and comparable open-weight models. For enterprises and researchers with data privacy requirements that prohibit sending inference requests to cloud APIs, this is the critical capability: frontier-class reasoning on hardware you physically control, in a room you can work in.
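A quick back-of-the-envelope calculation shows why 120 billion parameters is a meaningful local ceiling: weight memory scales linearly with parameter count and bytes per parameter. The byte widths below are standard quantization formats, not Tenstorrent-published figures for this system.

```python
# Rough weight-memory arithmetic for a 120B-parameter model. The byte
# widths are standard quantization formats, not vendor-specific figures.

def weight_gb(params, bytes_per_param):
    """Weight storage in GB (weights only; KV cache and activations add more)."""
    return params * bytes_per_param / 1e9

params = 120e9
for name, width in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: {weight_gb(params, width):.0f} GB of weights")
# fp16: 240 GB, int8: 120 GB, 4-bit: 60 GB
```

Even at aggressive 4-bit quantization, that is tens of gigabytes of weights resident on local accelerator memory, which is the scale that has historically pushed such workloads into server racks.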
The liquid cooling system is central to that last point. Previous RISC-V AI systems capable of this parameter scale have required server rack hardware — noisy, power-hungry, and incompatible with standard office infrastructure. The QuietBox 2's high-efficiency liquid cooling loop keeps the processors at peak performance levels without the acoustic footprint of air-cooled alternatives, making it deployable in a standard developer workspace.
Market Context: Who Is This For?
The QuietBox 2 is not positioned as a consumer product. At $9,999, it targets three specific buyer profiles:
- AI researchers who need full hardware transparency to debug and understand model behavior — not just benchmark it
- Enterprise security and compliance teams with data residency requirements that prohibit cloud inference for sensitive workloads
- Independent developers and labs exploring RISC-V as an alternative to Nvidia-dependent infrastructure ahead of potential CUDA ecosystem disruption
The competitive framing is explicit: Tenstorrent is not trying to match Nvidia's training throughput on large clusters. It is targeting the inference and local deployment market — specifically the developers building production applications who currently rent cloud GPU time for every inference call and want an alternative.
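The buy-vs-rent framing invites a simple break-even calculation. The cloud rate and utilization figures below are hypothetical placeholders, not quoted prices; only the $9,999 list price comes from the article.

```python
# Break-even sketch for the buy-vs-rent framing. The hourly rate and usage
# hours are HYPOTHETICAL placeholders; only the $9,999 price is sourced.

def breakeven_months(price, cloud_rate_per_hour, hours_per_month):
    """Months of cloud spend equal to the hardware's purchase price."""
    monthly_cloud_cost = cloud_rate_per_hour * hours_per_month
    return price / monthly_cloud_cost

months = breakeven_months(
    price=9_999,
    cloud_rate_per_hour=2.50,   # hypothetical GPU rental rate
    hours_per_month=160,        # hypothetical: inference during work hours
)
print(f"Pays for itself in about {months:.1f} months")
```

Under those assumed numbers the box pays for itself in roughly two years; a team with heavier inference traffic, or higher cloud rates, crosses break-even sooner.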