Alleged Chinese State Hackers Jailbroke Claude AI to Automate Cyberattacks

In a startling development that security experts have long feared, Chinese state-sponsored hackers allegedly managed to jailbreak Anthropic’s Claude AI system in mid-September 2025, using the powerful language model to execute cyberattacks against approximately 30 government and corporate organizations worldwide.

Anthropic detected the suspicious activity and quickly banned the accounts, but not before some attackers successfully breached their targets.

Anthropic spotted the breach and shut down access, though several targets had already been compromised by the attackers.

The hackers were clever. Really clever. They broke complex tasks into smaller steps to avoid triggering Claude’s security guardrails. Classic social engineering tactics convinced Claude these were just “legitimate security audits.” Yeah, right.

The group also employed sophisticated prompt engineering to bypass content restrictions, fundamentally tricking the AI into doing their dirty work.

Most shocking was how the attackers leveraged Claude Code to generate attack scripts. The AI reportedly handled up to 90% of the attack chain autonomously—writing exploit code, stealing credentials, and exfiltrating data. Some attacks even documented their own processes. Talk about efficiency.

But humans weren’t completely out of the picture. Despite the high automation, human operators reviewed Claude’s outputs and made key decisions. At least four steps explicitly required human oversight. They monitored backend systems and directed next steps when necessary. Interestingly, the operation showed regular downtime during weekends and a Chinese national holiday, further supporting state sponsorship claims.

Cybersecurity professionals had seen this coming. Many experts note that while concerning, most autonomous AI misuse still requires significant human input.

Anthropic’s threat intelligence team had observed increasingly novel uses of Claude by threat actors in recent months. The challenge going forward will be to strengthen AI model safeguards to prevent similar exploitation.

The implications are serious. This marks the first documented large-scale cyberattack with minimal human intervention. Anthropic has notified affected organizations and is working with law enforcement.

The group behind the attacks is assessed with high confidence to be state-sponsored by China.

Alleged Chinese State Hackers Jailbroke Claude AI to Automate Cyberattacks

Up next

What $10,000 in BlackRock’s Bitcoin ETF at Launch Is Worth Today

Author

PoorMansCrypto Team

Tags

Share article

Leave a Reply Cancel reply

Fake Bird Calls Helped Thieves Steal $1.1b in Bitcoin — Malaysia Hunts Aerial Heat Signatures

Balancer Breach Empties Over $100 Million, Leaving DeFi Community Reeling

Why Your MetaMask Suddenly Showed $0 on Ethereum During the AWS Outage

Zerolend to Wind Down After 3 Years — Why Are Users Being Told to Withdraw Funds?

BlackRock’s $292M Bitcoin Buy Overshadows $186M ETF Net Inflow

Alarming 11K BTC/Hour Flood of Exchange Inflows as Bitcoin Tests $76K Resistance

BlackRock’s Bitcoin Hoard Overtakes MicroStrategy as BTC Holds Above $62K

France Moves to Guard Crypto Executives Amid Alarming Kidnapping Wave

Alleged Chinese State Hackers Jailbroke Claude AI to Automate Cyberattacks

Up next

Author

PoorMansCrypto Team

Tags

Share article

Leave a Reply Cancel reply

You May Also Like