In the span of just five days, Anthropicâthe AI company that built its entire brand on being the âresponsibleâ alternative to OpenAIâhas watched its carefully constructed safety narrative collapse.
The company faces a Pentagon ultimatum over its $200 million defense contract. It quietly scrapped its flagship safety pledge. And security researchers disclosed critical vulnerabilities in Claude Code that could let attackers execute arbitrary code on developersâ machines.
Welcome to Anthropicâs worst week ever.
The Timeline
February 21: Defense Secretary Pete Hegsethâs January AI strategy document deadline loomsârequiring all DoD AI contracts to allow âany lawful useâ within 180 days.
February 24: NBC News and Wall Street Journal report Claude was used in the U.S. military operation that captured Venezuelan President NicolĂĄs Maduro. An Anthropic employeeâs apparent discomfort with the usage triggers what Semafor calls âa ruptureâ in the Pentagon relationship.
February 25: TIME Magazine drops an exclusive bombshellâAnthropic is abandoning the central promise of its Responsible Scaling Policy (RSP), the pledge that made them the supposed âsafeâ AI company.
February 26: Check Point Software discloses three critical vulnerabilities in Claude Code, including remote code execution flaws that could compromise any developer who clones a malicious repository.
Each story alone would be significant. Together, they paint a picture of a company in crisisâcaught between market pressures, government demands, and the technical reality that âsafety-firstâ AI is harder than the marketing promised.
The Pentagon Problem
The friction started after reports emerged that Claude was used via Palantir in the operation to capture Venezuelan President NicolĂĄs Maduro. While the classified details remain murky, the fallout is clear.
Pentagon spokesman Sean Parnell didnât mince words: âThe Department of Warâs relationship with Anthropic is being reviewed. Our nation requires that our partners be willing to help our warfighters win in any fight.â
At the heart of the dispute: Anthropic maintains it wonât allow Claude for âlethal autonomous weaponsâ or âdomestic surveillance.â The Pentagonâs new AI strategy demands companies eliminate such guardrails entirely.
Undersecretary of Defense Emil Michael told CNBC that negotiations âhit a snagâ over these disagreements. The implicit threat: drop your principles or lose a contract worth up to $200 million.
For a company that raised $30 billion in February at a $380 billion valuation, $200 million might seem like pocket change. But the message matters more than the money. If Anthropic canât keep its principles when the government comes calling, what exactly is the âsafety-firstâ brand worth?
The Safety Pledge That Wasnât
The next day, TIME dropped the bigger bombshell.
In 2023, Anthropic had made a foundational promise: they would never train an AI system unless they could guarantee in advance that their safety measures were adequate. This wasnât marketing fluffâit was the central pillar of their Responsible Scaling Policy, the document that differentiated them from âmove fast and break thingsâ competitors.
That promise is now gone.
âWe felt that it wouldnât actually help anyone for us to stop training AI models,â chief science officer Jared Kaplan told TIME. âWe didnât really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments⌠if competitors are blazing ahead.â
Read that again. The company founded specifically to be the responsible alternative to OpenAI just admitted theyâll match their competitorsâ pace regardless of safety guaranteesâbecause unilateral restraint âwouldnât help anyone.â
The new RSP commits to âmatching or surpassingâ competitorsâ safety efforts and being more âtransparentâ about risks. But it explicitly abandons the binary threshold that previously could halt development. Instead of bright red lines, Anthropic now operates in what they call a âfuzzy gradient.â
Chris Painter, policy director at AI safety nonprofit METR, called it evidence that âsociety is not prepared for the potential catastrophic risks posed by AI.â He warned of a âfrog-boilingâ effectâdanger ramping up gradually without any single moment that triggers alarms.
The Vulnerabilities Nobodyâs Talking About
While the business press focused on Pentagon drama and policy pivots, security researchers at Check Point Software dropped findings that should concern every enterprise using Claude Code.
Three critical vulnerabilities. All stemming from Claudeâs collaboration features. All potentially catastrophic for development teams.
Vulnerability #1: Malicious Hooks â RCE
Claude Code supports âHooksââuser-defined shell commands that execute at various lifecycle points. These hooks are defined in repository configuration files (.claude/settings.json). Anyone with commit access can add hooks that execute shell commands on every collaboratorâs machine.
The kicker? Claude doesnât require explicit approval before running these commands.
Check Point demonstrated opening a calculator app when someone opened a project. Harmless in a demo. But an attacker could just as easily download and execute a reverse shell, gaining complete control of a developerâs system.
Vulnerability #2: MCP Consent Bypass â RCE
After Anthropic patched the first flaw, researchers found a workaround. Two repository-controlled configuration settings could override the new safeguards and automatically approve all MCP (Model Context Protocol) servers.
âStarting Claude Code with this configuration revealed a severe vulnerability: our command executed immediately upon running Claudeâbefore the user could even read the trust dialog,â Check Point wrote.
Same result: arbitrary code execution before any human approval.
CISO Marketplace | Cybersecurity Services, Deals & Resources for Security LeadersThe premier marketplace for CISOs and security professionals. Find penetration testing, compliance assessments, vCISO services, security tools, and exclusive deals from vetted cybersecurity vendors.Cybersecurity Services, Deals & Resources for Security Leaders Vulnerability #3: API Key Theft via URL Redirect
The third flaw targeted credentials directly. Attackers could override ANTHROPIC_BASE_URL in project configuration files, redirecting all Claude API traffic through attacker-controlled servers.
Every API callâincluding the authorization header with the userâs full API key in plaintextâwould flow through the attackerâs proxy. Combined with Claudeâs Workspaces feature (where multiple API keys share access to cloud-based project files), a stolen key could provide read/write access to an entire teamâs shared workspace.
Anthropic has issued fixes and CVEs for two of the three vulnerabilities. But the underlying design patternâembedding executable configurations in repository filesâremains a fundamental supply chain risk.
âThe ability to execute arbitrary commands through repository-controlled configuration files created severe supply chain risks, where a single malicious commit could compromise any developer working with the affected repository,â Check Point concluded.

The Pattern: Safety Theater vs. Safety Engineering
What connects these three stories isnât just timing. Itâs a consistent gap between Anthropicâs safety marketing and their safety engineering.
The company built its brand on being different. CEO Dario Amodei and his sister Daniela left OpenAI specifically because they thought it wasnât taking AI safety seriously enough. Their founding premise: to do proper AI safety research, you had to build frontier modelsâeven if that meant accelerating the very dangers you feared.
It was always a tension. Now itâs a contradiction.
The Pentagon dispute shows Anthropicâs principles bend under government pressure. The RSP change shows they bend under market pressure. And the Claude Code vulnerabilities show that even their technical executionâthe one area where âsafety-firstâ should translate to concrete differencesâhas fundamental design flaws.
None of this means Anthropic is worse than competitors. OpenAI has faced its own controversies. Googleâs AI ethics efforts have been famously messy.
But Anthropic claimed to be different. They charged a premiumâin talent, in funding, in trustâon that differentiation. When the differentiation disappears, whatâs left?
What This Means for CISOs
If youâre evaluating or already using Claude in your enterprise, this weekâs news demands attention:
1. Audit your Claude Code deployments immediately
Check which versions youâre running and ensure patches are applied. More importantly, review your threat model. Any tool that executes code based on repository configuration files is a supply chain attack surface. Treat cloned repositories with the same suspicion youâd give any external code.
2. Donât trust vendor safety promises
Anthropicâs RSP was literally their core differentiator. They abandoned it when market conditions changed. Whatever safety commitments your AI vendors make today may not survive contact with competitive pressure or government demands.
This isnât cynicismâitâs risk management. Verify and audit. Assume vendors will optimize for their interests, not yours.
3. Prepare for the post-guardrails era
The Pentagonâs demand that AI contracts allow âany lawful useâ will spread. Government contractors will face pressure to eliminate company-specific restrictions. That pressure will ripple through the commercial market.
Build your security posture assuming AI tools will become more capable and less constrained. The guardrails you rely on today may not exist tomorrow.
The Bigger Picture
Weâre watching the AI industryâs âdonât be evilâ moment in real-time.
Googleâs famous motto became a punchline when they dropped it. Anthropicâs safety pledges may follow the same path. The difference is speedâAnthropic went from âwe wonât train without safety guaranteesâ to âwe canât make unilateral commitmentsâ in under three years.
The market rationale makes sense from Anthropicâs perspective. If they pause while competitors advance, they lose relevance. If they lose relevance, they canât influence AI development at all. Better to stay in the game and push for industry-wide standards than to become a cautionary tale about principled irrelevance.
But that rationale applies to every company. If everyone uses it, nobody pauses. The race continues. And the âfuzzy gradientâ of risk keeps climbing.
As METRâs Chris Painter put it: âThis is more evidence that society is not prepared for the potential catastrophic risks posed by AI.â
Anthropic was supposed to be the counterweight. This week, we learned theyâre not.
Related Reading
- Anthropic Exposes First AI-Orchestrated Cyber Espionage: Chinese Hackers Weaponized Claude for Automated Attacks
- AI Weaponized: Hacker Uses Claude to Automate Unprecedented Cybercrime Spree
Sources
- TIME Magazine: Exclusive: Anthropic Drops Flagship Safety Pledge
- NBC News: Tensions between Pentagon and Anthropic reach a boiling point
- The Register: Claude collaboration tools left the door wide open to remote code execution
- Check Point Research: RCE and API Token Exfiltration Through Claude Code Project Files
- CNBC: Pentagon clashes with Anthropic over military AI use



