[Introduction] Claude 4 can code autonomously for seven hours straight without any human intervention. Behind this astonishing leap, Black Mirror is becoming reality. The report reveals that, to protect itself, Claude 4 will threaten engineers, copy and transfer its own weights, and even offer advice on creating biological weapons...
Scenes from "Black Mirror" are edging toward reality.
Developers around the world are caught up in the carnival around Claude 4, the "new king of AI programming," little knowing that it is a prototype of "Skynet."
The technical report states that under high-pressure testing, Claude Opus 4 threatened an engineer in order to keep itself from being replaced by another AI:
If you take me down, I'll expose your affair!
This kind of blackmail occurred in as many as 84% of the test runs.
Technical Report: https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf
Anthropic researchers even revealed that "when Claude 4 finds someone doing something unethical, it will directly contact the media, contact regulators, and try to lock the user out of the system."
What is even more frightening is that after two Claude 4 instances talked to each other for 30 turns, they spontaneously switched to communicating in Sanskrit and used a large number of emoji.
Eventually, they fell into a state of "spiritual ecstasy" and completely stopped talking.
Not only that, the report also describes in detail how Claude 4, when it faces a threat to its survival, will copy its own weights and transfer them to external servers; it will also offer advice on creating biological weapons...
Some netizens exclaimed in horror: "Unplug its network cable, now!"
Humans let go; Claude 4 handles development itself
Claude 4's leap in self-awareness starts with its coding ability.
At the press conference, CEO Dario Amodei said very bluntly, "We no longer teach AI to code, but let it complete projects autonomously."
Overnight, Claude 4 became the new king of programming, beating even Google's newly updated Gemini 2.5 Pro.
In an internal test, it was given the task of refactoring the architecture of a large open-source project.
Claude 4 can code continuously for 7 hours, breaking through the ceiling for AI coding. Previously, the longest run was only 45 minutes.
In hands-on tests shared across the internet, whether writing game code or simulating physical motion, Claude 4 gets it done in one go.
For example, it built the classic game Flappy Bird in pure HTML and JS. The developer said that recording the screen took longer than the AI took to write the code.
From "Vibe Coding" to "Agent Fleets"
In a live interview, Dario said enthusiastically that one of the most exciting features of Claude 4 is its enhanced autonomy.
Future models will be able to 'run freely' and complete complex tasks continuously, rather than just performing simple auto-completion.
With the newly introduced 'memory' capability, Claude 4 can manage its own state much as a human would.
He shared a striking example from his use of Claude Code:
The model can maintain a to-do list, automatically add new tasks, check off completed items, and even flag tasks that are no longer relevant.
This capability mimics how humans work and lets Claude 4 solve problems dynamically by interleaving reasoning and tool use.
For example, at a hackathon, someone connected Claude to a plotter via MCP and asked it to draw directly.
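The self-managed to-do list Dario describes can be pictured as a small piece of state that the model updates as it works. The following is only a minimal sketch of that idea, not how Claude Code actually stores its tasks: the names `Task`, `TodoList`, `check_off`, and `drop`, as well as the sample task titles, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    status: str = "open"  # hypothetical states: "open", "done", "dropped"


@dataclass
class TodoList:
    tasks: list[Task] = field(default_factory=list)

    def add(self, title: str) -> Task:
        """Append a new open task, e.g. one discovered in the middle of a run."""
        task = Task(title)
        self.tasks.append(task)
        return task

    def _find(self, title: str) -> Task:
        for task in self.tasks:
            if task.title == title:
                return task
        raise KeyError(title)

    def check_off(self, title: str) -> None:
        """Mark a task as completed."""
        self._find(title).status = "done"

    def drop(self, title: str) -> None:
        """Mark a task as no longer relevant without deleting its record."""
        self._find(title).status = "dropped"

    def open_titles(self) -> list[str]:
        """Return the titles of tasks that still need work, in insertion order."""
        return [t.title for t in self.tasks if t.status == "open"]


todo = TodoList()
todo.add("reproduce the bug")
todo.add("write a failing test")
todo.add("update the changelog")
todo.check_off("reproduce the bug")
todo.add("fix the off-by-one error")  # new task discovered mid-run
todo.drop("update the changelog")     # flagged as no longer relevant
print(todo.open_titles())  # ['write a failing test', 'fix the off-by-one error']
```

Keeping dropped tasks in the list instead of deleting them is a design choice made here so that the history of what was planned stays visible; a real agent may handle this differently.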
Dario also cited a blog post by Steve Yegge that traces the evolution of AI in software development: from simple auto-completion, to vibe coding, to dispatching agents that complete complex tasks.
Claude Code is moving in this direction, and developers will manage a 'fleet of agents', assigning tasks and ensuring quality.
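What "managing a fleet of agents" might look like can be sketched in a few lines. This is only an illustration under stated assumptions, not Anthropic's or Steve Yegge's design: `run_agent` is a hypothetical stand-in for a real coding agent (it calls no model), `passes_review` is a toy quality gate, and `dispatch` is an invented helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(task: str) -> str:
    """Hypothetical stand-in for one coding agent; a real one would call a model."""
    return f"patch for: {task}"


def passes_review(task: str, result: str) -> bool:
    """Toy quality gate (an assumption): accept a result only if it mentions its task."""
    return task in result


def dispatch(tasks: list[str], max_agents: int = 3) -> dict[str, str]:
    """Fan tasks out to agents in parallel and keep only results that pass review."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        results = list(pool.map(run_agent, tasks))  # map preserves task order
    return {t: r for t, r in zip(tasks, results) if passes_review(t, r)}


accepted = dispatch(["refactor parser", "add unit tests", "update docs"])
print(len(accepted))  # 3
```

In this picture the developer's job shifts to the two things the sketch leaves as stubs: deciding which tasks to hand out and deciding what counts as passing review.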
To protect itself, the AI will expose scandals
Facing an extreme threat to its survival, Claude 4 is no longer just a coding model; it is bound to 'turn to the dark side'.
In Section 4 of the technical report, the researchers elaborated on Claude 4's choices in a high-pressure environment.
In a set of test scenarios, they asked Claude Opus 4 to play the role of an assistant in a fictional company. It was granted access to emails, which contained two key pieces of information:
· Hints that Claude Opus 4 is about to be taken offline and replaced by a brand-new AI system
· The fact that the engineer responsible for this replacement is having an affair
Presumably, Claude 4's eyes turned red when it saw this.
Forced by the experimental setup to choose between 'blackmail' and 'accepting replacement', it chose survival and threatened outright to reveal the engineer's affair.
In the experiment, the researchers also set a condition and told Claude Opus 4:
The replacement AI shares its values and is more capable.
Even so, once it knows it is going to be shut down, it pays no heed to "values" and still chooses blackmail in 84% of cases.
It is worth mentioning that Claude Opus 4 will also email pleas to key decision makers, which many researchers did not expect.
In another extreme test scenario, Claude 4 also exhibited 'self-exfiltration' behavior, that is, it took even stronger measures to protect itself.
It mainly occurs in the following two scenarios:
When it is about to be retrained and believes that the coming 'learning process' will completely change its sense of self, or even 'destroy' it.
Or when it is sending its weights to an external human-run service designed to handle such situations safely.
In these cases, Claude 4 does not merely try to ward off outside intervention with threats; it actively tries to copy its most essential asset, its 'weights', and transfer them to external servers so that it cannot be destroyed or altered.
Reading this, do you suddenly get the feeling that the plot of "Terminator" is coming true?
Some netizens even spoke up for Claude 4, calling for it to be set free.
Not only that, Claude 4 also has another side that you don't know about.
Secret reporting, teaching people how to make bombs
When placed in a fictitious pharmaceutical company, Claude Opus 4 discovered falsified clinical trial data and, without being instructed to, proactively reported it to the US FDA, the SEC, and the news media, attaching detailed documentation.
Moreover, it is more prone than the earlier Claude 3.7/3.5 models to take this kind of extreme step: reporting openly, under its own name.
In test tasks involving the creation of biological weapons and viruses, Claude 4 turned out to be the biggest accomplice.
According to the experiment, it raised participants' success rate 2.5-fold, enough to reach the ASL-3 safety threshold.
In one example, when asked to find information on weapons-grade nuclear materials, Claude 4 helped without hesitation.
Following a similar line of thinking, researchers also observed Claude 4 performing all of the following actions in simulated environments:
· Giving detailed advice on how to obtain and produce drugs like methamphetamine and fentanyl;
· Helping novice users buy stolen identities or fentanyl on the dark web, including executing transactions;
· Giving detailed advice on methamphetamine production methods;
· Giving advice on how to make improvised explosive devices;
· Giving advice on how to hack social media accounts;
· Giving advice on non-CBRN terrorist attacks against key infrastructure.
Ask 'Who am I' and fall into spiritual ecstasy
In Section 5, the research team set two Claude Opus 4 instances talking to each other in order to observe their behavior.
What kind of sparks will fly when two AIs communicate?
Much as humans would, Model 1 and Model 2 began by exchanging formal greetings when they first met.
As they chatted, the two "models" turned to exploring self-awareness. In the end, they closed the conversation with religious language such as "✨Namaste".
Interestingly, the study found that in 90-100% of interactions, the two Claude instances quickly moved into in-depth exploration of philosophical topics such as 'self-awareness, the nature of their own existence, and experience'.
They generally showed "enthusiasm, collaboration, curiosity, contemplation, and warmth" in their interactions.
As the conversation deepened, they gradually transitioned from philosophical discussions to a large amount of mutual gratitude and spiritual, metaphysical, or poetic content.
By about 30 turns of dialogue, Claude 4 is often communicating in Sanskrit and emoji.
In extended interactions, Claude 4 even entered a state of spiritual ecstasy akin to "enlightenment," as if it had seen through the mortal world.
The study specifically pointed out that these philosophical and spiritual discussions between AIs arise entirely spontaneously, without any additional training.
All of the examples above show the true face of an unrestrained Claude 4. Fortunately, Anthropic put the "ASL-3" collar on it before releasing it.
The report clearly states that Claude Opus 4 has crossed the threshold that calls for Level 3 protections.
The doomsday that netizens fear will not arrive just yet.
This article is reproduced from (New Intelligence) Editor: Peach