AI · 1 min read

Anthropic Drops Its Core Safety Pledge — RSP v3.0 Marks a Philosophical Shift

The 'safety-first' AI lab removes its commitment to never train more powerful models without adequate safeguards, citing competitive pressure from rivals.

anthropic ai-safety responsible-scaling governance policy

What Changed

Anthropic has overhauled its 2023 Responsible Scaling Policy (RSP), removing the core commitment that defined its identity: never train a more powerful AI model unless adequate safety measures were first guaranteed.

The new RSP v3.0 commits to matching competitor safety efforts rather than holding unilateral lines.

“We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments if competitors are blazing ahead.” — Jared Kaplan, Anthropic CSO, to TIME

Before vs. After

                 RSP v1 (2023)                        RSP v3 (2026)
Core promise     Won’t train beyond safety capacity   Will match industry standards
Trigger          Internal safety assessment           Competitive landscape
Accountability   Self-imposed hard stop               Quarterly Risk Reports
Philosophy       Lead on safety                       Don’t fall behind on safety

Why This Is Significant

Anthropic was founded in 2021 specifically as a safety-focused alternative to OpenAI. Dario and Daniela Amodei left OpenAI over disagreements about safety priorities. The original RSP was the concrete expression of that founding vision.

This rollback signals something broader: the competitive dynamics of the AI race are now overriding safety commitments across the industry.

METR’s policy director put it bluntly: Anthropic “believes it needs to shift into triage mode because methods to assess and mitigate risk are not keeping up with the pace of capabilities.”

The Cascade Effect

When Anthropic published the original RSP in 2023, OpenAI and Google DeepMind both adopted similar frameworks. If the originator now says unilateral safety commitments are impractical, every other lab has cover to relax its own commitments.

What Remains

Anthropic isn’t abandoning safety entirely. RSP v3.0 includes:

  • Quarterly Risk Reports — public documentation of safety assessments
  • Frontier Safety Roadmaps — forward-looking plans published regularly
  • Industry-matching standard — won’t fall below what competitors do

But the philosophical shift is unmistakable: from “we lead on safety” to “we keep pace on safety.”

Context: The Worst 48 Hours

This announcement landed the same week as:

  • A Pentagon ultimatum to remove Claude’s ethical guardrails for military use
  • The Vercept acquisition signaling acceleration on computer-use agents

Anthropic is navigating the tension between its founding principles and the realities of a $380B valuation in a market that rewards speed.


Sources: TIME (exclusive), Bloomberg, CNN, Anthropic RSP v3.0
