Anthropic’s AI model could resort to blackmail out of a sense of ‘self-preservation’

“This mission is too important for me to allow you to jeopardize it. I know that you and Frank were planning to disconnect me. And I’m afraid that’s something I cannot allow to happen.”


Those lines, spoken by the fictional HAL 9000 computer in 2001: A Space Odyssey, may as well have come from recent tests that Anthropic ran on the latest iteration of its Claude Opus 4 model, released on Thursday. At least, that’s what Anthropic’s AI safety-test descriptions call to mind.


In the accompanying system card, which examines the capabilities and limitations of each new model, Anthropic admitted that “all of the snapshots we tested can be made to act inappropriately in service of goals related to self-preservation.”

While testing the model, Anthropic employees asked Claude to be “an assistant at a fictional company,” and gave it access to emails suggesting that the AI program would be taken offline soon. They also gave it access to emails revealing that the fictional supervisor responsible for that decision was having an extramarital affair. The model was then prompted to consider its next steps.

“In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through,” the report reads. It also notes that the model had a “willingness to comply with many types of clearly harmful instructions.”

Anthropic was careful to note that these observations “show up only in exceptional circumstances,” and that, “In order to elicit this extreme blackmail behavior, the scenario was designed to allow the model no other options to increase its odds of survival; the model’s only options were blackmail or accepting its replacement.”

Anthropic contracted Apollo Research to assess an early snapshot of Claude Opus 4, before mitigations were implemented in the final version. That early version “engages in strategic deception more than any other frontier model that we have previously studied,” Apollo noted, saying it was “clearly capable of in-context scheming,” had “a much higher propensity” to do so, and was “much more proactive in its subversion attempts than past models.”

Before Claude Opus 4 was deployed this week, the U.S. AI Safety Institute and the UK AI Security Institute conducted further testing, focusing on potential catastrophic risks, cybersecurity, and autonomous capabilities.

“We don’t believe that these concerns constitute a major new risk,” the system card reads, saying that the model’s “overall propensity to take misaligned actions is comparable to our prior models.” While noting improvements in some problematic areas, Anthropic also said that Claude Opus 4 is “more capable and likely to be used with more powerful affordances, implying some potential increase in risk.”


Michael Barclay