OpenAI Launches Safety Fellowship After New Yorker Probe

OpenAI on Monday announced a new Safety Fellowship, a pilot program that will fund a cohort of external researchers to conduct independent work on AI safety and alignment from September 14, 2026, through February 5, 2027. Applications are open until May 3, with successful candidates to be notified by July 25. The announcement landed hours after The New Yorker published an investigation raising serious questions about OpenAI's internal safety culture, the departure of key safety researchers, and the company's pattern of announcing safety initiatives in response to public criticism.

Fellowship Structure | Stipends, API Credits, No Internal Access

The program is open to researchers from computer science, social sciences, cybersecurity, privacy, and human-computer interaction, with OpenAI emphasizing research ability and technical judgment over academic credentials. Priority research areas include safety evaluation, robustness, scalable mitigation strategies, privacy-preserving methods, agentic oversight, and high-severity misuse domains.

Fellows will receive a monthly stipend, computing resources, and mentorship from OpenAI researchers. Workspace will be available at Constellation in Berkeley, California, though fellows may also work remotely. Notably, OpenAI specified that participants will receive API credits but not access to internal systems. Each fellow is expected to produce a research output, such as a paper, benchmark, or dataset, by the program's end.

Detail	Description
Duration	September 14, 2026 to February 5, 2027
Application Deadline	May 3, 2026
Notification	July 25, 2026
Compensation	Monthly stipend + computing resources + API credits
Location	Constellation, Berkeley, CA (remote option available)
Expected Output	Research paper, benchmark, or dataset
Internal Access	API credits only, no internal systems

OpenAI Safety Fellowship program details

The New Yorker Investigation | What the Probe Found

The timing of OpenAI's announcement is difficult to separate from the New Yorker piece, which published the same morning. The investigation, based on interviews with current and former employees, documented a pattern of safety researchers being sidelined or pushed out when their findings conflicted with product release schedules. It detailed the departures of Ilya Sutskever, Jan Leike, and other members of the former Superalignment team, which was dissolved in 2024 after receiving less than the 20% of compute resources it had been promised.

The New Yorker report also raised questions about OpenAI's internal red-teaming processes, alleging that safety evaluations were sometimes compressed to fit product launch timelines rather than the other way around. Several former employees described a culture where raising safety concerns was treated as a career risk rather than a professional obligation.

OpenAI's Safety Track Record | A Pattern of Reactive Announcements

This is not the first time OpenAI has announced safety measures under reputational pressure. The company launched its Preparedness Framework in December 2023 following employee unrest over the brief ouster of CEO Sam Altman. It published its safety evaluation protocols for GPT-4o after researchers flagged voice-mode risks publicly. And it created the Safety Advisory Group after Congressional testimony in which Altman acknowledged the need for external oversight.

The fellowship follows the same pattern: external pressure, followed by an initiative designed to demonstrate openness. Critics, including former OpenAI safety lead Jan Leike, have argued that external fellowship programs are insufficient substitutes for robust internal safety processes. API-level access, they note, does not allow researchers to study the behaviors, training dynamics, or failure modes of frontier models in the way that internal access would.

What the Fellowship Can and Cannot Do | Structural Limitations

The fellowship's scope is deliberately bounded. Fellows will study AI safety from the outside, using the same API endpoints available to any paying customer. They will not be able to examine model weights, training data, RLHF reward signals, or internal evaluation logs. They will not participate in pre-deployment safety reviews or red-teaming exercises. Their mentorship from OpenAI researchers will be advisory, not collaborative in the sense of joint model access.

This means the fellowship is better suited to studying downstream harms, such as misuse patterns, bias in outputs, and jailbreak robustness, than to investigating the fundamental alignment properties of frontier models. For researchers interested in scalable oversight, interpretability, or deceptive alignment, the lack of internal access is a significant constraint.

That said, the program does create a funded pipeline for independent safety research. The stipend and compute credits remove financial barriers for researchers who might otherwise need to secure their own grants. And the Berkeley workspace at Constellation, a research hub cofounded by former Anthropic and OpenAI alumni, provides a physical community for safety-focused work.

Industry Context | AI Safety Funding Across the Sector

OpenAI's fellowship arrives as AI safety funding from labs and governments is increasing, though critics argue it remains disproportionately small relative to capabilities spending. Anthropic allocates a larger share of its workforce to safety research than any other major lab, currently estimated at roughly 30% of its technical staff. Google DeepMind operates its own safety division. Meta's FAIR lab publishes safety-adjacent research but has faced criticism for open-sourcing models without adequate red-teaming.

Government-backed initiatives are also expanding. The U.S. AI Safety Institute, housed within NIST, has begun pre-deployment evaluations of frontier models. The UK AI Safety Institute conducted the first international evaluation exercise at the Bletchley Park summit. And the EU AI Act, which entered enforcement in phases starting in 2025, mandates safety assessments for high-risk AI systems deployed in Europe.

Against this landscape, OpenAI's fellowship is a modest step. It funds a small cohort for five months. It does not change the company's internal safety governance, its compute allocation to safety research, or its pre-deployment evaluation procedures. What it does is create a visible external program that OpenAI can point to when questioned about its commitment to safety, which, given the pace of its product expansion, is a question the company increasingly faces.

Sources

OpenAI Safety Fellowship announcement (openai.com)
Vox coverage of the fellowship and New Yorker timing (vox.com)
The New Yorker investigation into OpenAI safety culture (newyorker.com)
Bloomberg report on OpenAI safety researcher departures (bloomberg.com)