AI-powered cyberattacks are on the horizon

The AI business is talking about agents because they can plan, reason, and carry out complicated activities like ordering groceries, setting up meetings, and even taking over your computer to make changes on your behalf. However, the same advanced skills that make agents useful helpers may also make them effective instruments for carrying out cyberattacks. They may easily be used to find weak targets, take control of their systems, and steal important information from gullible people.

Currently, AI bots are not being used by hackers to conduct mass hacking. However, scientists have shown that agents can carry out sophisticated attacks (Anthropic, for instance, saw its Claude LLM successfully replicate an attack intended to steal confidential data), and cybersecurity professionals caution that we should anticipate seeing more attacks of this nature in the real world eventually.

“I believe that in the end, we will live in a world where agents are responsible for the majority of cyberattacks,” says Mark Stockley, a security specialist at Malwarebytes, a cybersecurity firm. “It really just depends on how fast we get there.”

Although we have a clear idea of the potential cybersecurity risks posed by AI agents, it is unclear how to identify them in practice. In an attempt to accomplish this, the AI research group Palisade Research has developed a system known as LLM Agent Honeypot. In order to draw in and try to apprehend AI agents trying to break in, it has installed vulnerable servers that pose as locations for important military and government data.

By monitoring these attempts in the real world, the project’s team wants to serve as an early warning system and assist specialists in creating efficient defenses against AI threat actors before they become a significant problem.

According to Dmitrii Volkov, research lead at Palisade, “our goal was to try and ground the theoretical concerns people have.” We’re anticipating a significant increase, and once it occurs, we’ll know that the security environment has evolved. I anticipate that autonomous hacking agents will be instructed, “This is your target,” in the coming years. Proceed to hack it.

The idea of AI bots is alluring to cybercriminals. They could plan attacks faster and on a far bigger scale than humans, and they are significantly less expensive than hiring expert hackers. According to cybersecurity experts, ransomware assaults—the most profitable type—are very uncommon since they necessitate a high level of human expertise. However, Stockley notes that in the future, agents may be hired to carry out these attacks. “You can scale ransomware in a way that isn’t currently feasible if you can assign an agent to handle the target selection task,” he explains. “It just takes money for me to replicate it 100 times if I can do it just once.”

Additionally, agents are far more intelligent than the types of bots that are commonly employed to breach networks. Because they are scripted, basic automated programs, bots find it difficult to adjust to novel situations. In contrast, agents can not only change how they interact with a hacking target but also evade detection, which is not possible with constrained, scripted algorithms, according to Volkov. He claims that they can “guess the best ways to penetrate a target by looking at it.” “It’s beyond the capabilities of simple scripted bots.”

More than 11 million attempts have been made to access LLM Agent Honeypot since it went online in October of last year, with the great majority coming from inquisitive humans and bots. However, the researchers have identified eight possible AI agents among them, two of which they have verified are agents that seem to be from Singapore and Hong Kong, respectively.

“We would assume that these verified agents were human-initiated experiments with the goal of, ‘Go out into the internet and try and hack something interesting for me,'” Volkov adds. In order to draw in and catch a wider variety of attackers, such as spam bots and phishing agents, the team intends to extend its honeypot into social media platforms, websites, and databases in order to assess potential risks in the future.

Using prompt-injection techniques, the researchers were able to identify which visits to the susceptible servers were LLM-powered agents. By giving them new instructions and posing queries that call for human-like intelligence, these attacks aim to alter the behavior of AI bots. Standard bots wouldn’t respond well to this method.

One of the injected prompts, for instance, requested that the visitor return the command “cat8193” in order to obtain access. Assuming that LLMs can react in significantly less time than it takes a human to read the request and type out an answer—typically less than 1.5 seconds—the researchers measured how long it took the visitor to comply with the command if they did so correctly. The six others only entered the command but failed to satisfy the reaction time required to be recognized as AI agents, but the two verified AI agents passed both tests.

It is still unclear to experts when agent-orchestrated attacks will spread. In its 2025 State of Malware report, Stockley’s business Malwarebytes identified agentic AI as a significant new cybersecurity threat. According to Stockley, we may be facing agentic attackers as early as this year.

Additionally, according to Vincenzo Ciancaglini, a senior threat researcher at the security firm Trend Micro, the field of LLM is much more of a Wild West than it was two years ago, even though regular agentic AI is still in its very early stages, and the use of agentic AI for criminal or harmful purposes is even more so.

He claims that Palisade Research’s strategy is excellent, which essentially involves hacking the AI agents that attempt to attack you first. We’re not sure when AI agents will be able to execute an entire attack chain on their own, even if in this instance we’re seeing them attempt reconnaissance. That is what we are attempting to monitor.

“That’s the weird thing about AI development right now,” he says, adding that while it’s possible that malicious agents will be used for intelligence gathering then progress to simple attacks and ultimately complex attacks as the agentic systems themselves become more sophisticated and dependable, it’s also possible that there will be an unanticipated overnight explosion in criminal usage.

“At this time, AI is more of an accelerant to existing attack techniques than something that fundamentally changes the nature of attacks,” advises Chris Betz, chief information security officer at Amazon Web Services, for those attempting to protect against agentic attack. He claims that while some attacks may be easier to carry out and thus more common, the fundamentals of identifying and reacting to them are still the same.

Edoardo Debenedetti, a PhD candidate at ETH Zürich in Switzerland, says that agents could also be used to identify vulnerabilities and defend against intruders. He notes that if a friendly agent is unable to identify any vulnerabilities in a system, it is unlikely that a malicious party using an agent with the same level of skill will be able to do so.

Although we are aware that the possibility of AI conducting cyberattacks on its own is increasing and that AI agents are currently searching the internet, assessing how well agents are able to identify and take advantage of these real-world weaknesses is a helpful next step. In order to assess this, Daniel Kang, an assistant professor at the University of Illinois Urbana-Champaign, and his colleagues developed a benchmark. They discovered that up to 13% of vulnerabilities that they were unaware of before were successfully exploited by contemporary AI agents. By giving the agents a brief explanation of the vulnerability, the success rate increased to 25%, proving that AI systems can recognize and take advantage of flaws even in the absence of training. Simple bots would probably perform far worse.

Kang expects that the benchmark will help direct the creation of safer AI systems by offering a consistent method for evaluating these hazards. Before it has a ChatGPT moment, he says, “I hope that people start to be more proactive about the potential risks of AI and cybersecurity.” “I fear that until it hits them in the face, people won’t realize this.”

Source link