GenAI Is Both Hunter and Hunted at Pwn2Own Berlin 2026
This year’s Pwn2Own competition in Berlin revealed just how much of the AI stack remains exposed, and the gap between what these tools promise and what they can withstand points to the fragile security foundations underneath.
Key takeaways
- AI is now both the hunter and the hunted. At Pwn2Own Berlin 2026, contestants used LLMs and agentic coding tools to find vulnerabilities, while AI tools including Claude Code, Codex, and Cursor were also on the target list.
- Across Claude Code, OpenAI Codex, and Cursor, exploitable weaknesses traced back to the same root causes: overpowered underlying developer tools and misplaced trust between agents and users.
- Every team used LLMs in some part of their workflow, but all reported high false positive rates in the discovery phase, consistent with traditional security research. The speed advantage, not the accuracy, is what matters.
- Tools like Ollama and ChromaDB are widely exposed on the internet, and successful exploits against them or against the Nvidia Container Toolkit could grant access to the underlying host – not just the model.
- Vibe coding and supply chain risks are setting up a bigger, messier competition next year. Similar code spreading across unrelated projects, abusable developer tools, and ongoing supply chain attacks mean the attack surface will only grow as software development and bug discovery accelerate together.
Pwn2Own has unequivocally arrived in the generative AI (GenAI) age. TrendAI™ Research has a full report on the event that took place at OffensiveCon 2026, but I also had the privilege of participating in the disclosure process for some of the artificial intelligence (AI) targets. Obviously, I cannot discuss the details of the actual bugs until the disclosure period is over, but I have some general observations to make.
Pwn2Own Berlin ran on May 14-16 at OffensiveCon in Berlin, its second consecutive year there following its 2025 launch. Organized by the TrendAI™ Zero Day Initiative™ (ZDI), the world's largest vendor-agnostic bug bounty program, Pwn2Own invites security researchers to compete to discover vulnerabilities in widely used software and hardware. Last year’s competition yielded the first-ever findings under the new AI-specific category, including CVE-2025-49844 (ZDI-25-933), a critical use-after-free vulnerability in Redis, and CVE-2025-23266 (ZDI-25-626), a privilege escalation flaw in the Nvidia Container Toolkit.
There were 13 possible targets in total across all AI categories. In the AI Database category (which usually means vector stores), only Chroma and Oracle Autonomous AI Database were targeted by contestants. In the coding agent category, contestants tried their hand at all targets: Anthropic Claude Code, Cursor, and OpenAI Codex. In the local inference category, only LiteLLM, LM Studio, and Ollama were targeted. Megatron Bridge and the Container Toolkit were the only targets attempted in the Nvidia category. In total, only these ten targets faced actual attempts from researchers.
This event drew a record number of submissions in all categories, and we had to be selective about whom to admit. By the end of the competition, we also had a larger than expected number of withdrawals. This is unfortunate, and we won't speculate on the reasons. Still, the contestants together earned a little under US$1.3 million, which is significant.
The biggest bounties were reserved for high-impact operating system, hypervisor, and browser vulnerabilities. However, in recognition of their growing importance, some of the AI targets also carried not insignificant prize money. The most lucrative AI categories were the Nvidia and the inference targets.
When we think of AI systems, we usually think of inference: getting a prediction from some input. Classically, we think of systems like Ollama, LM Studio, or Nvidia Container Toolkit that can host a model and provide access to it, or perhaps an abstraction and proxy layer like LiteLLM.
Ollama allows a user to self-host many models so long as the host has enough GPU capacity and memory. For example, large language models (LLMs) such as Google's Gemma 4 or OpenAI's GPT-OSS, as well as embeddings from Nomic, can be run locally this way. Ollama is also very commonly found exposed on the internet, making it an attractive target for threat actors.
In a competition like Pwn2Own, a target like Ollama that is updated very frequently is less than ideal. A competitor may spend weeks working on an exploit only to find that it doesn't work on the latest version. Nevertheless, the Out Of Bounds team was able to find two bugs, one of which was already known but not patched. Many of the Ollama instances exposed on the internet can already be tampered with or used for inference, but this bug would have allowed access to the underlying host as well.
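To make the exposure point concrete, here is a minimal sketch of how an administrator might check whether an Ollama instance they operate answers without credentials. It assumes Ollama's default port (11434) and its model-listing endpoint (`/api/tags`); the helper names `tags_url`, `model_names`, and `probe` are hypothetical and exist only for this illustration. It is not related to the bugs found at the competition.

```python
import json
import urllib.request
from typing import List, Optional


def tags_url(host: str, port: int = 11434) -> str:
    """Build the URL of the model-listing endpoint (default port assumed)."""
    return f"http://{host}:{port}/api/tags"


def model_names(payload: object) -> List[str]:
    """Extract model names from a decoded /api/tags response, tolerating odd shapes."""
    if not isinstance(payload, dict):
        return []
    models = payload.get("models", [])
    if not isinstance(models, list):
        return []
    return [m["name"] for m in models if isinstance(m, dict) and "name" in m]


def probe(host: str, port: int = 11434, timeout: float = 3.0) -> Optional[List[str]]:
    """Return the model names if the API answers unauthenticated, else None.

    Only run this against hosts you operate or are authorized to test.
    """
    try:
        with urllib.request.urlopen(tags_url(host, port), timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None


if __name__ == "__main__":
    found = probe("127.0.0.1")
    if found is None:
        print("No unauthenticated Ollama API reachable on 127.0.0.1")
    else:
        print("Reachable without credentials; models:", found)
```

If `probe` returns a list from an address that is reachable from outside the local network, anyone else can list and use those models too, which is the baseline exposure described above even before any host-level exploit.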
Nvidia's Container Toolkit is a different sort of beast. It's a set of libraries that enable Docker, Kubernetes, or similar container environments to access Nvidia's GPUs. This way the user can run high-performance tasks, in particular LLM inference, from inside a container environment. Potentially, a successful attack could grant access to the container itself, or worse, the host system. There were three attempts, with two successes by Chompie (Valentina Palmiotti of IBM X-Force) and PWN2DACA. In practice, the attacker would already need some access to the container environment to execute such an attack, but chaining exploits is not uncommon.
LM Studio is similar to Ollama in that it hosts AI models and embeddings, but it also has a more intuitive user interface and offers other AI-related capabilities, such as retrieval-augmented generation, a popular method for bringing relevant context into a user prompt to reduce hallucinations. Unlike Ollama, though, it is usually local-first and doesn't get exposed to the internet much. It is an Electron-based GUI application, and it brings along many of the issues that Electron already has. So it is not surprising that OtterSec and STARLabs found a variety of bugs, while Qrious Secure withdrew their entry.
Agentic coding systems were also included as targets. OpenAI's Codex has been around as a plugin for various code editors since well before ChatGPT came into public awareness. It is still used as a plugin, but is now also available as a standalone and cloud-based application. Five teams tried their hand at Codex. The bugs they found were similar, and the general idea was already known, but they mostly qualified as unique bugs. There was, however, one failed attempt and one collision.
Anthropic's Claude Code is a more recent entry into this field, but immensely popular. It was also a very popular target with four teams going after it. The bugs these teams found were similar and in two cases were deemed collisions with previously found bugs from this competition, underscoring the importance of the luck of the draw.
Perhaps because it is not as popular, Cursor faced only two attempts, from Viettel and STARLabs, both resulting in full wins.
In all three of these AI coding tools, the problems seem to stem from similar sources that relate to the underlying frameworks the agents use. Some of these common developer tools have evolved many capabilities that are now liabilities in the GenAI age. There is also some misguided trust when the agents ask the user to accept risks the user may not be able to evaluate correctly.
On their own, LLMs have some use, but are susceptible to hallucinations. To deal with this problem, many GenAI-driven applications augment the prompt with information retrieved from trusted data sources, a technique called retrieval-augmented generation (RAG). Vector stores are often used for the retrieval step: they hold the vector embeddings of stored texts and return the texts whose embeddings are closest to that of the query.
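The retrieval step can be sketched in a few lines. The following is a minimal illustration, assuming cosine similarity as the closeness measure and using tiny hand-made 3-dimensional vectors in place of real embeddings; the names `STORE`, `top_k`, and `build_prompt` and the sample texts are made up for this example, and a real vector store such as Chroma would use an embedding model and an approximate nearest-neighbor index instead of a full sort.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy "vector store": (text, embedding) pairs. The vectors are invented;
# a real system would compute them with an embedding model.
STORE: List[Tuple[str, List[float]]] = [
    ("Rotate API keys every 90 days.", [0.9, 0.1, 0.0]),
    ("The cafeteria opens at 8 am.", [0.0, 0.2, 0.9]),
    ("Revoke leaked API keys immediately.", [0.8, 0.3, 0.1]),
]


def top_k(query_vec: Sequence[float], store, k: int = 2) -> List[str]:
    """Return the k stored texts whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Augment the user's question with the retrieved context."""
    bullet_list = "\n".join(f"- {text}" for text in context)
    return f"Context:\n{bullet_list}\n\nQuestion: {question}"


if __name__ == "__main__":
    query_embedding = [1.0, 0.2, 0.0]  # stand-in for embedding the question
    retrieved = top_k(query_embedding, STORE, k=2)
    print(build_prompt("How should we handle API keys?", retrieved))
```

The sketch also shows why these stores matter from a security standpoint: whatever sits in the store is pulled directly into the prompt, so both the confidentiality and the integrity of the stored data affect the application.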
ChromaDB is an open‑source vector-search-oriented database built specifically for AI apps, and many instances can be found exposed on the internet. Generally, it seems to be well hardened, but the Out Of Bounds team was able to find a remote exploit. While many exposed Chroma instances are already accessible without credentials, such an exploit may allow access to otherwise protected instances and to the host system. This is particularly troublesome as the data in these databases may be sensitive.
The Oracle Autonomous AI Database target was attempted by one team, but the attempt failed.
Nvidia's Megatron Bridge is a way of converting models back and forth between Hugging Face's format and Nvidia's Megatron Core format. Four teams attempted this target, with the last team colliding with a previously found bug. Whenever software needs to accept input, an attacker can manipulate that input as part of an attack. According to the teams, they found many exploits, even though only one was needed to win in the end.
In the disclosures I was involved in, we asked the contestants about their GenAI use. All used some form of LLM along the way. Nearly everyone used it for the mandatory white paper that must accompany each exploit. In particular, non-English-speaking teams found LLMs useful for translation (although some word use was unusual, to say the least). Many used some coding agent for the initial bug discovery, although everyone reported a high false positive rate in this phase.
This is not surprising and is similar to ‘ordinary’ bug finding that also results in many dead ends. Some teams reported using GenAI mainly for exploit development, in particular for obfuscating the attack to avoid detection by endpoint detection and response (EDR) systems. During the disclosures I participated in, no one reported using Anthropic Mythos or being a part of an AI security program.
In my personal experience using these agentic harnesses, I found that they help in reading large amounts of code that would take me much longer to read by hand. Also, while I can read Python or C++ well, I don't understand all the nuances of Rust or Go, but an agentic coding harness will not bat an eye. Surprisingly, though, the underlying mechanics of these harnesses are more rudimentary than one might expect and involve a lot of ‘grep’-ing, gratuitous use of ‘find’, some simple Python code execution, downloading related content from the internet, etc. There is no sophisticated use of an SMT solver or program dependence graphs.
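To give a feel for how rudimentary this kind of triage can be, here is a minimal sketch of a grep-style scan over Python sources. It is an illustration only, not the implementation of any actual harness: the pattern list, the labels, and the function names `scan_text` and `scan_tree` are all assumptions made for this example.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Hypothetical "interesting" patterns; a real harness would pick its own.
PATTERNS = {
    "os.system call": re.compile(r"\bos\.system\s*\("),
    "shell=True": re.compile(r"shell\s*=\s*True"),
    "pickle load": re.compile(r"\bpickle\.loads?\s*\("),
    "eval/exec": re.compile(r"\b(eval|exec)\s*\("),
}

Hit = Tuple[str, int, str, str]  # (source label, line number, pattern label, line)


def scan_text(text: str, label: str = "<memory>") -> List[Hit]:
    """Flag every line that matches one of the patterns, like a multi-pattern grep."""
    hits: List[Hit] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((label, lineno, name, line.strip()))
    return hits


def scan_tree(root: str) -> List[Hit]:
    """Walk a directory, like 'find' piped into 'grep', and collect all hits."""
    hits: List[Hit] = []
    for path in sorted(Path(root).rglob("*.py")):
        hits.extend(scan_text(path.read_text(errors="ignore"), str(path)))
    return hits


if __name__ == "__main__":
    sample = (
        "import pickle\n"
        "# never call eval(user_input) here\n"
        "data = pickle.loads(blob)\n"
    )
    for source, lineno, name, line in scan_text(sample):
        print(f"{source}:{lineno}: {name}: {line}")
```

Note that the sample flags a harmless comment alongside the real `pickle.loads` call. A pattern hit is only a lead, and each one still has to be read in context, which is consistent with the high false positive rates the teams reported.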
The difference is that an agentic coding harness can do the analysis much faster than I can, and this is enough to mimic a skilled analyst. When I pushed the coding agent, I found I could get very close to exploits before the agent flagged the conversation as a potential policy violation. This was all done without access to the mysterious Anthropic Mythos, but I used a lot of tokens. In the end, I think the harness that drives the GenAI model may be more important than the GenAI model itself.
For next year's Pwn2Own, I expect contestants will have worked on their own bug discovery harnesses, perhaps even using local models to avoid information leakage. At the same time, similar code appears to be generated through vibe coding, so even unrelated projects may share similar problems. We also continue to deal with software supply chain attacks and coding tools that have abusable capabilities. Next year may bring both more submissions and more withdrawals, as the pace of software development and the pace of bug discovery accelerate in sync.