Federal Agencies Flagged Grok's Safety Risks. The Pentagon Approved It for Classified Use Anyway.

The GSA called Grok-4 non-compliant with federal safety standards. The NSA found unique security vulnerabilities absent from other models. The CDAO rejected it outright. White House Chief of Staff Susie Wiles called a senior xAI executive. Then the Pentagon approved it for classified military use under a $200M contract — February 2026.

Alfansa

ObjectWire Technology Desk

March 1, 2026

The General Services Administration said Grok-4 failed to meet federal safety and alignment standards. The National Security Agency identified security vulnerabilities unique to Grok — absent from models like Anthropic's Claude — in a classified November 2024 review. The Biden-era Chief Digital and AI Office declined Grok entirely, citing training data opacity and weak guardrails. White House Chief of Staff Susie Wiles personally contacted a senior xAI executive after the warnings reached her office.

Then, during the week of February 27, 2026, the Pentagon authorized xAI's Grok for use in classified military environments under a contract worth up to $200 million — shared with Google, OpenAI, and Anthropic. The Wall Street Journal reported the full sequence the same day.

What Each Agency Found

Despite All of the Above
The Pentagon approved Grok for classified military environments during the week of February 27, 2026. No public disclosure was made of what remediation steps, if any, addressed the documented GSA and NSA concerns before authorization proceeded.

Why This Matters

The Grok approval is the highest-profile example to date of AI safety evaluations being overridden in a sensitive federal context. The paper trail is unusually complete — multiple agencies, a classified NSA review, a White House escalation, and a formal GSA report — making it difficult to argue the risks were unknown.

Public Citizen stated on February 27 that the deployment "disregards internal warnings and could compromise national security." The broader concern is structural: if documented safety failures do not block deployment in classified environments, what function do the evaluations serve?

For the full breakdown — including Grok's architecture, the conflict-of-interest questions around Elon Musk's dual role, and the competitive context among the four vendors on the $200M contract — see our in-depth analysis on the Technology desk .

When internal government reviews call a model unsafe and the Pentagon approves it anyway, the question is no longer whether AI safety evaluations matter — it's whether anyone is required to follow them.

What Each Agency Found

Why This Matters

Discussion