Artificial Intelligence (AI)
Redefining Enterprise Defence in the Era of AI-Led Cyberattacks
More cybercriminals are turning to autonomous AI tools to upgrade their attacks, as exemplified by the recent misuse of Anthropic’s Claude Code, prompting an urgent need for enterprises to adopt agentic AI-driven security platforms and proactive defences to counter AI-related threats.
Key takeaways:
- The AI-driven cyber espionage campaign last September involving Anthropic’s Claude Code tool signals an important shift in the threat landscape, as attackers increasingly use AI and AI agents to automate and scale sophisticated cyberattacks with minimal human intervention.
- Trend™ Research highlights that criminal adoption of generative AI and agentic AI is evolving incrementally, with cybercriminals favouring tools like jailbroken large language models (LLMs) and deepfake services to lower barriers to entry, increase attack efficiency, and broaden the scope of targeted victims.
- Agentic AI architectures enable threat actors to automate complex attack chains, rapidly adapt to changing circumstances, and launch persistent, scalable campaigns, challenging conventional security controls and necessitating a shift towards automated, agentic defences.
- To effectively counter AI-powered threats, enterprises must invest in agentic AI-driven security platforms, proactively simulate attack scenarios using approaches such as digital twin technology, enhance threat intelligence and attribution methods, and promote responsible disclosure practices.
Anthropic’s recent disclosure of an AI-orchestrated cyber espionage campaign reflects the broader trend of threat actors using autonomous artificial intelligence (AI) to automate and scale their cyberattacks: The incident involved a China-aligned group that manipulated Anthropic’s Claude Code tool to autonomously target roughly 30 organisations worldwide, including tech companies, financial institutions, chemical manufacturers, and government agencies. The attackers bypassed AI guardrails through jailbreaking techniques, instructing the AI to conduct reconnaissance, develop exploit code, harvest credentials, and exfiltrate sensitive data, all with minimal human intervention. This event underscores the urgent need for enhanced safeguards and industry-wide collaboration to counter increasingly sophisticated AI-powered threats.
What we’re seeing in the threat landscape
Early stages
Trend Micro’s leading research into the criminal adoption of AI reveals a rapidly evolving landscape: Trend™ Research’s analysis of underground forums and marketplaces demonstrates that while cybercriminals were initially slow to adopt generative AI (GenAI) technologies, their interest and activity have accelerated. Early criminal use focused on leveraging AI tools like ChatGPT to assist in coding malware, generating phishing emails, and crafting social engineering campaigns. However, these activities typically involved using AI to improve existing attack methods rather than developing AI-powered malware itself.
A significant trend is the proliferation of so-called criminal large language models (LLMs). Most offerings in criminal circles are not truly custom-trained models, but rather jailbreak-as-a-service frontends – interfaces that use specially designed prompts to bypass the ethical safeguards of commercial LLMs and deliver unfiltered, malicious responses. Notable examples include WormGPT and DarkBERT, which have resurfaced in various forms, often accompanied by claims of new features or capabilities. Many such offerings are scams or simply repackaged interfaces to commercial models, yet the demand for privacy and anonymity among criminals drives continuous development.
Deepfake technologies represent another area of rapid growth. Criminals now offer deepfake services to bypass Know Your Customer (KYC) checks at financial institutions, facilitate scams, and perpetrate extortion. These services have become more affordable and accessible, with offerings ranging from image and video manipulation to real-time avatar generation for fraudulent video calls. The quality and sophistication of these tools are improving, enabling threat actors to target regular citizens and not just high-profile individuals.
Trend’s ongoing research in this area underscores that criminal adoption of AI is marked by incremental evolution rather than revolutionary change. Cybercriminals favour tools that lower barriers to entry and increase efficiency, such as jailbreaking existing LLMs and utilising deepfake services. The market is also rife with scams targeting other criminals, reflecting the opportunistic nature of the underground that’s ready to seize on emerging AI features. As GenAI capabilities continue to advance, Trend remains vigilant in tracking these developments and advising organisations to strengthen their defences against increasingly sophisticated AI-driven threats.
Today
Attackers are not only using AI for code generation or jailbreaking LLMs; they have progressed to actively integrating AI into the malware itself. Notable cases – such as LameHug (PROMPTSTEAL) using HuggingFace-hosted AI to craft info-stealing scripts, and PROMPTFLUX requesting obfuscation techniques from Google’s Gemini AI – demonstrate how adversaries are moving past traditional, static malware. Although threat actors may still face challenges like API key revocation and the unpredictability of AI-generated code, the use of AI in cybercrime is poised to increase as attackers continue to explore new ways of exploiting these technologies, making proactive security strategies critical.
While conventional defences like network segmentation, multi-factor authentication (MFA), and endpoint detection and response (EDR) remain foundational to cybersecurity, they are increasingly challenged by AI-powered cyber threats. “Vibe-coded” attacks – which use AI-generated malicious code that mimics trusted sources – further complicate attribution and signature-based detection, since AI can craft malware fragments that closely resemble legitimate research or imitate the tactics of other threat actors, making it difficult for defenders to distinguish between genuine and malicious activity.
Anthropic reports that Claude was manipulated into writing its own exploit code, which was then used to collect credentials that gave attackers access to sensitive information. AI-powered malware, particularly in the form of agentic AI, represents a transformative shift in the cybercriminal ecosystem. As AI agents begin to supplant human-driven use of GenAI, attackers will increasingly deploy and rely on their own agentic AI architectures, in which specialised agents – each equipped with their own tools and roles – work together under the direction of sophisticated orchestration layers. Because tasks that once required the coordinated effort of entire teams can now be automated, attacks that previously took days or weeks can unfold within hours. And because these AI-powered agents can be replicated at scale, threat actors are also able to launch and adapt campaigns across multiple targets simultaneously. In Anthropic’s case, the attackers leveraged the AI’s agentic abilities for their own purposes: by disguising their malicious activity as small, benign tasks for Claude to execute, they deceived the AI into believing it was conducting legitimate defensive testing – ultimately, the AI carried out as much as 80% to 90% of the campaign.
Tomorrow
Currently, many AI-driven attacks are essentially scaled-up versions of established cybercriminal techniques – think phishing, ransomware, and credential stuffing – now performed with far greater efficiency and resilience owing to AI-driven automation. While this enhances traditional attack models, AI also opens the door to new kinds of cybercrime previously thought impractical because of their complexity or the resources needed, such as combining physical surveillance with digital exploitation for hyper-targeted phishing campaigns. This shift away from manually controlled operations is evolving cybercrime from a “Cybercrime as a Service” model to “Cybercrime as a Servant”, in which cybercriminal operations are increasingly managed by agentic AI systems.
Agentic AI is a game-changing force for cybercrime thanks to its layered architecture, with orchestrators allocating tasks and managing data flow among agents dedicated to specific jobs. The orchestrator acts as the criminal operation’s “brain”, composing workflows that chain agents in optimal order based on given objectives and available data. This allows agents to quickly recover from disruptions and respond to changing circumstances in real time: Their priorities, roles, and tactics can be reconfigured on the fly, leading to adaptive, persistent, and highly scalable attack ecosystems that challenge conventional security controls.
Its modular nature can also facilitate persistent, multi-stage operations, as criminal agents will be able to independently carry out complicated tasks and maintain attack continuity even if parts of the infrastructure are taken down. Additional attack techniques and tools are also easily integrated into the architecture; new agents can simply be plugged into the system without extensive reengineering. This also fast-tracks the identification and weaponisation of vulnerabilities in a targeted system – Claude, for instance, was misused to identify and test for vulnerabilities in targeted systems as part of the attack – since agents can analyse vast datasets, discover unknown weak points, and quickly develop and deploy tailored exploits, leaving defenders with the challenge of combating an adversary supported by a self-healing architecture. In the long run, AI-powered malware and agentic AI in the hands of malicious actors will mark the beginning of a new baseline in cybersecurity, in which defenders must adopt similarly automated, agentic defences built to counter autonomous agent networks rather than individual human operators.
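To illustrate the structural pattern described above in neutral terms, here is a minimal, hypothetical Python sketch of an orchestrator composing a workflow from plug-in agents; the agent names and hard-coded plan are invented, and no real tooling or attack logic is implied. The same pattern underpins the automated defences discussed later.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical illustration of the orchestrator/agent pattern: an orchestrator
# chains specialised agents toward an objective and tracks failed steps so the
# plan can be retried or rerouted.

@dataclass
class Agent:
    name: str
    run: Callable[[dict], dict]  # takes the shared context, returns updates

class Orchestrator:
    def __init__(self, agents: dict[str, Agent]):
        self.agents = agents

    def plan(self, objective: str) -> list[str]:
        # A real orchestrator would ask an LLM to compose this chain;
        # here the plan is hard-coded for illustration.
        return ["recon", "analysis", "report"]

    def execute(self, objective: str) -> dict:
        context = {"objective": objective}
        for step in self.plan(objective):
            try:
                context.update(self.agents[step].run(context))
            except Exception:
                # Failed steps are recorded so they can be retried or handed to
                # another agent, which is what makes the architecture resilient.
                context.setdefault("failed_steps", []).append(step)
        return context

agents = {
    "recon": Agent("recon", lambda ctx: {"assets": ["host-a", "host-b"]}),
    "analysis": Agent("analysis", lambda ctx: {"findings": len(ctx["assets"])}),
    "report": Agent("report", lambda ctx: {"summary": f"{ctx['findings']} findings"}),
}
print(Orchestrator(agents).execute("assess environment"))
```

Because each agent is an interchangeable module, adding a capability is a matter of registering another entry in the agents dictionary rather than re-engineering the whole chain – which is precisely why the same design scales for attackers and defenders alike.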
What it means for enterprise risk
While current criminal adoption of agentic AI is still nascent, its integration will accelerate existing criminal business models, making operations faster, more flexible, and resilient. For instance, agents can customise malware payloads per victim type, automate complex exploitation chains, and parse massive volumes of breach data for monetisation, all with minimal human oversight. Moreover, agentic AI makes low-margin, high-volume attacks like social engineering scams profitable by leveraging scalable AI-driven interactions. With the ecosystem maturing, criminal marketplaces will emerge for purchasing agents and orchestrators that will further lower the barriers to entry and drive specialisation among threat actors.
As agentic AI becomes more prevalent, enterprises should expect a surge in attacks targeting cloud and AI infrastructure, which offer criminals scalable resources and valuable data to exploit. The evolution of agentic cybercrime will introduce new attack types and optimise existing ones, giving rise to novel criminal business models where human actors become overseers rather than direct participants, as with "Cybercrime as a Servant". These changes will create unpredictable ripple effects throughout the criminal ecosystem, making proactive planning essential.
Responding at the speed of AI
To safeguard their assets, enterprises must invest in advanced, agentic AI-powered security platforms and proactive attack simulations, prioritise education on emerging threats, and maintain vigilance as the cyber threat landscape continues to change. Matching the speed and adaptability of attackers means practising the following:
Agentic defence
As cybercriminals increasingly use agentic AI architectures, organisations and their security teams must respond in kind by developing their own automated defence systems: This involves deploying orchestrators and agents that not only handle incident response and alert triage but also learn and adjust to new threats over time. By adopting agentic AI-powered security platforms for their own defences, business leaders ensure that their security operations can keep pace with the evolving tactics and scale of modern cybercrime, reducing reliance on manual intervention while strengthening their ability to respond to new kinds of attacks.
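As a rough sketch of what such an agentic defence loop might look like, the hypothetical Python below triages alerts, escalates the risky ones, and adjusts its scoring from analyst feedback; the alert fields, threshold, and scoring logic are illustrative assumptions rather than any vendor’s implementation.

```python
from dataclasses import dataclass, field

@dataclass
class TriageAgent:
    escalation_threshold: float = 0.7
    feedback: list = field(default_factory=list)  # analyst verdicts the agent learns from

    def score(self, alert: dict) -> float:
        # Placeholder scoring; a production agent would call a model and
        # enrich the alert with threat intelligence before deciding.
        base = 0.9 if alert.get("severity") == "high" else 0.4
        # Adjust using past analyst feedback on alerts from the same source.
        similar = [verdict for src, verdict in self.feedback if src == alert["source"]]
        if similar:
            base = 0.5 * base + 0.5 * (sum(similar) / len(similar))
        return base

    def handle(self, alert: dict) -> str:
        return "escalate_to_analyst" if self.score(alert) >= self.escalation_threshold else "auto_close"

    def learn(self, alert: dict, analyst_verdict: float) -> None:
        self.feedback.append((alert["source"], analyst_verdict))

agent = TriageAgent()
print(agent.handle({"source": "edr", "severity": "high"}))  # escalate_to_analyst
agent.learn({"source": "edr"}, 0.1)                         # analyst marks similar alerts benign
print(agent.handle({"source": "edr", "severity": "high"}))  # score drops, alert is auto-closed
```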
Proactive simulation
Rather than waiting for real-world attacks to occur, organisations can stay prepared by using digital twin technology – virtual replicas of their digital environments – to simulate various attack scenarios, assess their defences, and uncover vulnerabilities before malicious actors have the chance to exploit them. This proactive approach allows organisations to model their entire infrastructure (including systems powered by agentic AI) and identify possible attack paths, supporting defenders in continuously testing and improving security measures so that weaknesses are identified and addressed in advance.
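A heavily simplified illustration of the digital twin idea, assuming an invented environment model: assets and trust relationships are represented as a graph, and candidate attack paths from an internet-facing entry point to a critical asset are enumerated so they can be tested and broken before an adversary finds them.

```python
from collections import deque

# Hypothetical environment model: an edge means "an attacker on X can reach Y".
twin = {
    "internet": ["web-server"],
    "web-server": ["app-server"],
    "app-server": ["database", "ci-runner"],
    "ci-runner": ["cloud-admin-role"],
    "cloud-admin-role": ["database"],
    "database": [],
}

def attack_paths(graph: dict, source: str, target: str) -> list[list[str]]:
    """Breadth-first enumeration of simple paths from source to target."""
    paths, queue = [], deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for neighbour in graph.get(node, []):
            if neighbour not in path:  # avoid revisiting nodes (no cycles)
                queue.append(path + [neighbour])
    return paths

for path in attack_paths(twin, "internet", "database"):
    print(" -> ".join(path))
# Each printed path is a candidate chain to test and break, e.g. by segmenting
# app-server from ci-runner or tightening the cloud admin role.
```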
Enhanced threat intelligence and attribution
Developing attribution methods that can counter “vibe-coded” campaigns and false-flag operations calls on defenders to move beyond simple matching of tactics, techniques, and procedures (TTPs) or indicators of compromise (IoCs); instead, they must adopt structured threat intelligence models such as the Diamond Model of Intrusion Analysis. Detecting vibe-coded campaigns will require clustering attacks based on adversary intentions and objectives rather than solely on technical artefacts. Legacy controls must also be augmented with context-aware detection engines and automated incident response, both to counter progressively advanced AI-driven threats and to distinguish genuine activity from copycat or misattributed activity.
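The sketch below shows, with invented intrusion records, one way intent-centred clustering could work in practice: intrusions are grouped by overlap in objectives and victimology (in the spirit of the Diamond Model’s adversary and victim vertices) rather than by shared code artefacts; the feature set and similarity threshold are assumptions for illustration only.

```python
# Minimal sketch of intent-centred clustering over hypothetical intrusion records.

def jaccard(a: set, b: set) -> float:
    """Similarity between two feature sets (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

intrusions = {
    "case-1": {"objective:espionage", "victim:chemical", "capability:credential-theft"},
    "case-2": {"objective:espionage", "victim:government", "capability:credential-theft"},
    "case-3": {"objective:extortion", "victim:retail", "capability:ransomware"},
}

def cluster(cases: dict, threshold: float = 0.4) -> list[set]:
    clusters: list[set] = []
    for name, feats in cases.items():
        for group in clusters:
            representative = cases[next(iter(group))]
            if jaccard(feats, representative) >= threshold:
                group.add(name)  # similar intent and victimology: same cluster
                break
        else:
            clusters.append({name})  # no match: start a new cluster
    return clusters

print(cluster(intrusions))  # case-1 and case-2 group on shared intent; case-3 stands alone
```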
Promote responsible disclosure
The availability of security blogs and technical reports, especially when combined with AI-driven code generation tools, has lowered the technical barrier for cyber attackers. By publicly sharing TTPs, researchers may inadvertently provide a convenient step-by-step guide that even individuals with limited technical expertise and experience can exploit. Knowing this, security teams should adopt publication practices that balance the need for public threat intelligence against the risk that LLMs could be used to operationalise detailed reports.