SANTA CLARA, CA — On Tuesday, March 10, 2026, Jim Keller's Tenstorrent delivered the most credible challenge yet to Nvidia's dominance of the AI workstation market. The company unveiled the TT-QuietBox 2 — a liquid-cooled desktop AI powerhouse built on the RISC-V architecture, packaged with a fully open-source software stack and priced at $9,999.
That price point is nearly $2,000 cheaper than the original QuietBox, signaling a deliberate push to move RISC-V AI hardware from niche research labs into the mainstream developer's office. The QuietBox 2 is the first system of its kind to deliver teraflop-class local LLM inference without relying on a traditional GPU — and without locking developers into Nvidia's proprietary CUDA ecosystem.
TT-QuietBox 2 — At a Glance
| Specification | Detail |
|---|---|
| {spec} | {detail} |
Breaking the CUDA Lock
The QuietBox 2's most strategically significant feature is not its performance — it is its independence. CUDA, Nvidia's proprietary parallel computing platform, has been the de facto standard for AI model training and inference since the first deep learning boom. The result is a decade-long lock-in: most AI frameworks, model checkpoints, and production pipelines are optimized specifically for CUDA, making migration to other hardware extremely costly even when credible alternatives exist.
Tenstorrent's RISC-V approach offers a fundamentally different value proposition. Because RISC-V is an open instruction set architecture, developers have full visibility — and full control — from the compiler layer down to the kernel. Tenstorrent describes this as transparency "from compiler to kernel," enabling engineers to optimize AI models at a granular hardware level that proprietary GPU architectures actively prevent.
The Open-Source Software Edge: TT-Buda & TT-Metalium
Tenstorrent's software strategy is built on two open-source components:
TT-Buda — The Compiler
TT-Buda is Tenstorrent's open-source AI compiler suite. It handles the translation of high-level model definitions (PyTorch, JAX, ONNX) into optimized instruction sequences for the RISC-V hardware. Because the compiler is open source, developers can inspect, modify, and contribute optimization passes — something impossible with Nvidia's proprietary TensorRT.
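To make the "contribute optimization passes" claim concrete, here is a toy constant-folding pass over a minimal operator graph. This is an illustration of the general technique, written in plain Python; the graph representation and names are hypothetical and are not TT-Buda's actual IR or pass API:

```python
# Toy constant-folding pass over a minimal op graph.
# Illustrative only: the node format is hypothetical, not TT-Buda's IR.

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def const_fold(graph):
    """Replace ops whose inputs are all constants with constant nodes.

    graph: dict mapping name -> ("const", value), ("input",),
           or (op_name, input_name, input_name).
    Returns a new graph with foldable ops evaluated at compile time.
    """
    folded = dict(graph)
    changed = True
    while changed:  # iterate to a fixed point so chains of constants fold
        changed = False
        for name, node in list(folded.items()):
            if node[0] not in OPS:
                continue  # constants and runtime inputs are left alone
            op, *ins = node
            vals = [folded[i] for i in ins]
            if all(v[0] == "const" for v in vals):
                folded[name] = ("const", OPS[op](*(v[1] for v in vals)))
                changed = True
    return folded

graph = {
    "a": ("const", 2.0),
    "b": ("const", 3.0),
    "c": ("mul", "a", "b"),      # 2 * 3 -> folded at compile time
    "x": ("add", "c", "input"),  # depends on runtime data: left unfolded
    "input": ("input",),
}
```

Running `const_fold(graph)` replaces `c` with `("const", 6.0)` while leaving `x` intact. In an open compiler, a pass like this is code a developer can read, step through, and replace — the point the article is making about TensorRT's closed pipeline.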
TT-Metalium — The Kernel Suite
TT-Metalium provides the low-level kernel primitives that execute directly on the Tenstorrent silicon. The suite's open-source nature means that debugging complex model behavior — tracing a numerical instability, for example, or profiling a memory bandwidth bottleneck — can be done with full hardware visibility rather than relying on opaque vendor profiling tools.
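The debugging scenario above — tracing a numerical instability to the operation that produced it — can be sketched host-side in a few lines. This is a toy checker showing what op-level visibility buys you, not TT-Metalium's actual profiling interface; the pipeline stages are invented for illustration:

```python
import math

def run_traced(ops, x):
    """Execute a pipeline of (name, fn) stages and flag the first stage
    whose output is non-finite. Toy illustration of op-level tracing;
    the structure is hypothetical, not TT-Metalium's API."""
    for name, fn in ops:
        x = fn(x)
        if isinstance(x, float) and not math.isfinite(x):
            return name, x  # first unstable op and its bad value
    return None, x

pipeline = [
    ("scale",  lambda v: v * 1e154),
    ("square", lambda v: v * v),   # overflows to inf for large inputs
    ("offset", lambda v: v + 1.0),
]
```

With an input of `2.0`, the trace pinpoints `square` as the stage that overflowed to infinity. With a closed stack, the same failure typically surfaces only as garbage at the model's output, several layers downstream.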
| Advantage | What It Means in Practice |
|---|---|
| {adv} | {meaning} |
120 Billion Parameters — Locally, Silently
The base configuration of the QuietBox 2 supports running models with up to 120 billion parameters entirely locally — covering specialized variants of Llama 3, Grok-1, and comparable open-weight models. For enterprises and researchers with data privacy requirements that prohibit sending inference requests to cloud APIs, this is the critical capability: frontier-class reasoning on hardware you physically control, in a room you can work in.
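Back-of-the-envelope arithmetic shows why 120 billion parameters is a meaningful local threshold. The sketch below counts weight storage only (activations, KV cache, and runtime overhead are ignored), and the bytes-per-parameter figures are standard quantization sizes, not published QuietBox 2 memory specs:

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Approximate weight storage for a dense model, in decimal GB.
    Ignores activations, KV cache, and runtime overhead."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# Common precisions: fp16 = 2 bytes, int8 = 1 byte, int4 = 0.5 bytes
for fmt, nbytes in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{fmt}: {weight_memory_gb(120, nbytes):.0f} GB")
```

At fp16 a 120B model needs roughly 240 GB for weights alone, falling to about 120 GB at int8 and 60 GB at int4 — which is why quantization is usually what makes this parameter class fit on a single desktop system.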
The liquid cooling system is central to that last point. Previous RISC-V AI systems capable of this parameter scale have required server rack hardware — noisy, power-hungry, and incompatible with standard office infrastructure. The QuietBox 2's high-efficiency liquid cooling loop keeps the processors at peak performance levels without the acoustic footprint of air-cooled alternatives, making it deployable in a standard developer workspace.
Market Context: Who Is This For?
The QuietBox 2 is not positioned as a consumer product. At $9,999, it targets three specific buyer profiles:
- AI researchers who need full hardware transparency to debug and understand model behavior — not just benchmark it
- Enterprise security and compliance teams with data residency requirements that prohibit cloud inference for sensitive workloads
- Independent developers and labs exploring RISC-V as an alternative to Nvidia-dependent infrastructure ahead of potential CUDA ecosystem disruption
The competitive framing is explicit: Tenstorrent is not trying to match Nvidia's training throughput on large clusters. It is targeting the inference and local deployment market — specifically the developers building production applications who currently rent cloud GPU time for every inference call and want an alternative.