Anthropic Drops Its Core Safety Pledge — RSP v3.0 Marks a Philosophical Shift
The 'safety-first' AI lab removes its commitment to never train more powerful models without adequate safeguards, citing competitive pressure from rivals.
What Changed
Anthropic has overhauled its 2023 Responsible Scaling Policy (RSP), removing the core commitment that defined its identity: never train a more powerful AI model unless adequate safety measures were first guaranteed.
The new RSP v3.0 commits to matching competitor safety efforts rather than holding unilateral lines.
“We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments if competitors are blazing ahead.” — Jared Kaplan, Anthropic CSO, to TIME
Before vs. After
| RSP v1 (2023) | RSP v3 (2026) | |
|---|---|---|
| Core promise | Won’t train beyond safety capacity | Will match industry standards |
| Trigger | Internal safety assessment | Competitive landscape |
| Accountability | Self-imposed hard stop | Quarterly Risk Reports |
| Philosophy | Lead on safety | Don’t fall behind on safety |
Why This Is Significant
Anthropic was founded in 2021 specifically as a safety-focused alternative to OpenAI. Dario and Daniela Amodei left OpenAI over disagreements about safety priorities. The original RSP was the concrete expression of that founding vision.
This rollback signals something broader: the competitive dynamics of the AI race are now overriding safety commitments across the industry.
METR’s policy director put it bluntly: Anthropic “believes it needs to shift into triage mode because methods to assess and mitigate risk are not keeping up with the pace of capabilities.”
The Cascade Effect
When Anthropic published the original RSP in 2023, OpenAI and Google DeepMind both adopted similar frameworks. If the originator now says unilateral safety commitments are impractical, it gives every other lab permission to relax their own.
What Remains
Anthropic isn’t abandoning safety entirely. RSP v3.0 includes:
- Quarterly Risk Reports — public documentation of safety assessments
- Frontier Safety Roadmaps — forward-looking plans published regularly
- Industry-matching standard — won’t fall below what competitors do
But the philosophical shift is unmistakable: from “we lead on safety” to “we keep pace on safety.”
Context: The Worst 48 Hours
This announcement landed the same week as:
- A Pentagon ultimatum to remove Claude’s ethical guardrails for military use
- The Vercept acquisition signaling acceleration on computer-use agents
Anthropic is navigating the tension between its founding principles and the realities of a $380B valuation in a market that rewards speed.
Sources: TIME (exclusive), Bloomberg, CNN, Anthropic RSP v3.0