Did China launch a world-first AI-powered cyberattack?

A Chinese state-backed espionage group allegedly used Anthropic’s Claude artificial intelligence (AI) model to automate the majority of a cyberattack campaign, according to Anthropic researchers. The story has provoked equal parts alarm and skepticism, and the cybersecurity industry is still working out what exactly occurred and how autonomous the model really was.

According to a statement the company released on November 13, its engineers disrupted what they characterize as a “largely autonomous” operation that employed the large language model (LLM) to plan and carry out around 80–90% of a comprehensive reconnaissance and exploitation campaign targeting 30 companies globally.

The engineers say they uncovered a series of product-abuse attempts that they eventually traced to operators linked to a Chinese state-sponsored espionage organization. The attackers reportedly directed Anthropic’s Claude Code model at targets in the IT, financial, and government sectors, assigning it tasks including reconnaissance, vulnerability analysis, exploit creation, credential harvesting, and data exfiltration. Only “high-level decision-making,” such as selecting targets and deciding when to retrieve stolen data, required human intervention, the statement said.

Engineers subsequently shut the campaign down using internal monitoring and abuse-detection systems that flagged unusual patterns indicative of automated task-chaining. According to company executives, the attackers tried to get past the model’s guardrails by breaking malicious goals into smaller steps and disguising them as benign penetration-testing jobs, a technique known as “task decomposition.” In some of the cases Anthropic presented, the model attempted to carry out instructions but made mistakes, such as hallucinated findings and obviously bogus credentials.
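Anthropic has not published its detection logic, but the kind of signal described above — turns arriving at machine speed, with each prompt recycling the previous model output — can be illustrated with a toy heuristic. Everything below (the `Turn` structure, thresholds, and field names) is hypothetical, a minimal sketch of the idea rather than any real abuse-detection system.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    t: float        # seconds since session start
    prompt: str     # what the client sent
    output: str     # what the model returned

def looks_like_task_chaining(turns, max_gap=2.0, min_overlap=0.3):
    """Toy heuristic: flag sessions whose turns arrive at machine speed
    and whose prompts reuse a large share of the previous model output,
    a crude signature of automated output-to-input chaining."""
    if len(turns) < 3:
        return False
    suspicious = 0
    for prev, cur in zip(turns, turns[1:]):
        fast = (cur.t - prev.t) <= max_gap
        prev_words = set(prev.output.lower().split())
        cur_words = set(cur.prompt.lower().split())
        overlap = len(prev_words & cur_words) / max(len(cur_words), 1)
        if fast and overlap >= min_overlap:
            suspicious += 1
    # call it chaining if most consecutive pairs look automated
    return suspicious >= 0.8 * (len(turns) - 1)
```

A human pasting results back into a chat would trip the overlap check but not the timing check; a script piping one answer into the next prompt trips both, which is the distinction the heuristic leans on.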

An AI or human-driven attack?

The company’s story is striking: it describes this as a “first-of-its-kind” instance of AI-orchestrated espionage, with the model essentially directing the attack. However, not everyone agrees that the autonomy was as significant as Anthropic implies.

Mike Wilkes, an adjunct professor at NYU and Columbia University, told Live Science that while the attacks themselves appear simple, the orchestration is what makes them unique.

“The attacks themselves are mundane and unsophisticated. The orchestration component being largely self-directed by the AI is what is frightening,” Wilkes said. “The script is flipped, from AI-augmented human attacks to human-augmented AI attacks. Consider this just a ‘hello world’ demonstration of the idea. People who dismiss the content of the attacks are failing to see the ‘leveling up’ that this represents.”

Some experts argue, however, that the operation may not have actually reached the 90% automation figure that Anthropic representatives emphasized.

Many aspects of the account seem plausible but are probably still exaggerated, according to Seun Ajao, senior lecturer in data science and AI at Manchester Metropolitan University.

He told Live Science that state-sponsored groups have been using automation in their operations for years, and that LLMs can already write scripts, scan infrastructure, and summarize vulnerabilities. Anthropic’s description contains “details which ring true,” he noted, such as the use of “task decomposition” to evade model protections, the need to correct the AI’s hallucinated conclusions, and the fact that only a fraction of targets were compromised.

“Even if the autonomy of the said attack was overstated, there should be cause for concern,” he contended, pointing to scalability, the governance issues of monitoring and auditing model use, and reduced barriers to cyber espionage using commercial AI technologies.

Katerina Mitrokotsa, professor of cybersecurity at the University of St. Gallen, is likewise skeptical of the high-autonomy claim. The incident, she says, looks like “a hybrid model” in which an AI operates as an orchestration engine under human supervision. Anthropic presents the operation as an end-to-end, AI-orchestrated attack, but Mitrokotsa points out that the attackers appear to have bypassed security measures largely by breaking malicious tasks into smaller parts and framing them as legitimate penetration testing.

The AI then carried out network mapping, vulnerability scanning, exploit creation, and credential harvesting, she said, while humans oversaw the crucial decisions.
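The “hybrid model” Mitrokotsa describes can be sketched as a simple control loop: the AI handles routine steps automatically, while a small set of critical decisions is gated behind a human operator. Everything here is hypothetical — the task names, the approval policy, and the stubbed-out model call stand in for whatever the real operation used.

```python
# Steps the article says remained human-controlled (hypothetical labels)
CRITICAL = {"select_target", "exfiltrate_data"}

def run_model(task: str) -> str:
    """Stub standing in for an LLM call that executes one routine task."""
    return f"result of {task}"

def human_approves(task: str) -> bool:
    """Stand-in for the operator's go/no-go decision on a critical step."""
    return task != "exfiltrate_data"  # e.g. the operator declines this one

def orchestrate(tasks):
    """Run routine tasks automatically; pause for approval on critical ones."""
    log = []
    for task in tasks:
        if task in CRITICAL and not human_approves(task):
            log.append((task, "blocked by human"))
            continue
        log.append((task, run_model(task)))
    return log
```

The point of the sketch is proportion: in a loop like this the model executes nearly every step, yet the human gate on a handful of decisions is what separates “sophisticated automation” from genuine autonomy.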

She finds the 90% figure hard to accept. “AI can speed up repetitive activities, but it is still challenging to chain complicated attack phases together without human confirmation. According to reports, Claude made mistakes that needed to be corrected manually, such as hallucinating credentials. This is more in line with sophisticated automation than actual autonomy; scripting and existing frameworks could produce comparable efficiencies.”

Lowering the barrier to entry for cybercrime

Most analysts believe that the incident’s significance does not hinge on whether Claude performed 50% or 90% of the work. The disturbing point is that even partial AI-driven orchestration lowers the barrier to entry for espionage groups, increases campaign scalability, and blurs accountability when an LLM becomes the engine that binds an intrusion together.

If Anthropic’s account of events is correct, the ramifications are significant: attackers could use consumer-facing AI tools to speed up reconnaissance, shorten the time from scanning to exploitation, and iterate attacks faster than defenders can respond.

Even if the autonomy narrative is overstated, the reality is not much more comforting. According to Ajao, publicly accessible, off-the-shelf AI tools have significantly lowered the barriers to cyber espionage, and, as Mitrokotsa put it, “AI-driven automation [could] reshape the threat landscape faster than our current defenses can adapt.”

According to the experts, the most plausible scenario is not a fully autonomous AI attack but a human-led operation enhanced by an AI model acting as a tireless assistant: chaining together reconnaissance tasks, developing exploits, and producing code at scale. The episode showed that attackers are beginning to treat AI as an orchestration layer, and defenders should expect more hybrid operations in which LLMs augment rather than replace human capabilities.

The underlying message from the experts is the same whether the true figure was 80%, 50%, or much lower: Anthropic’s engineers may have caught this one early, but the next one could be harder to stop.
