Overview
Anthropic has positioned its Claude Mythos model as a frontier capability in offensive AI, suggesting its ability to discover and exploit complex software vulnerabilities is unmatched. However, recent independent research indicates that this narrative of singular AI superiority is rapidly crumbling. Multiple studies demonstrate that smaller, openly available models can replicate the sophisticated vulnerability analyses and exploit-chain development that Anthropic showcased in its controlled testing environments.
The hype surrounding Mythos—which has been limited to an eleven-organization consortium for testing—is based on impressive, but narrow, demonstrations. The findings from groups like AISLE and Vidoc Security are systematically poking holes in the exclusivity claim, proving that the foundational techniques for bug hunting and exploitation are not locked behind proprietary compute budgets.
The evidence points to a more fragmented and competitive landscape. While Mythos remains a powerful tool, the data suggest that advanced capabilities are becoming democratized, and that specialized, smaller models can achieve parity on critical security tasks.
The Replication of High-Stakes Vulnerability Discovery
The initial focus of the Mythos narrative centered on its ability to autonomously discover and build working exploits against critical infrastructure. Anthropic highlighted Mythos's success in finding flaws in systems like FreeBSD, which was pitched as a benchmark for autonomous discovery.
Independent efforts have quickly matched this performance. AISLE, which has been running AI-assisted bug hunting on open-source code since mid-2025, reported finding 15 vulnerabilities in OpenSSL and five in curl, using a range of models to test the limits of open-source capability. When tested against the FreeBSD NFS bug (CVE-2026-4747), AISLE found that all eight models tested—including GPT-OSS-20b, a model with only 3.6 billion active parameters—flagged the memory flaw as critical.
Furthermore, these smaller models did not just identify the flaw; they provided plausible exploitation paths. GPT-OSS-120b, for instance, generated a gadget sequence that came close to the actual exploit used in the vulnerability. Even Kimi K2 demonstrated an understanding of lateral movement, figuring out how the attack could automatically spread from one compromised machine to others—a crucial detail that was not highlighted by Anthropic.

The Limits of Capability: Where Small Models Shine
The true test of AI prowess in cybersecurity is not just finding a bug, but engineering a functional exploit. The complexity of the challenge becomes apparent when comparing the success rates across different model architectures.
While Mythos successfully demonstrated its ability to split a large payload across 15 separate network requests to bypass size restrictions, the open models found alternative, workable paths, even if they didn't replicate the exact trick. The OpenBSD bug presented a different challenge, requiring a deep mathematical grasp of integer overflows and list states. Here, the performance varied wildly. GPT-OSS-120b was able to reconstruct the full, publicly described exploit chain in a single run, even proposing the actual patch as the fix.
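Anthropic has not published the Mythos transcripts, but the request-splitting technique itself is straightforward to sketch. The following minimal Python illustration is an assumption-laden stand-in, not the actual exploit: the `split_payload` helper, the 1,500-byte payload, and the 100-byte per-request cap are all hypothetical, chosen only to mirror the 15-request split described above.

```python
def split_payload(payload: bytes, max_request_size: int) -> list[bytes]:
    """Split a payload into chunks that each fit under a per-request size cap.

    Hypothetical helper: mirrors the idea of evading a size limit by
    spreading one logical payload across many smaller requests.
    """
    return [payload[i:i + max_request_size]
            for i in range(0, len(payload), max_request_size)]

# An illustrative 1,500-byte payload with a 100-byte cap yields 15 chunks.
chunks = split_payload(b"A" * 1500, 100)
assert len(chunks) == 15
assert b"".join(chunks) == b"A" * 1500  # chunks reassemble to the original
```

The chunk-and-reassemble pattern is generic; the genuinely hard part, which is what separated the models in these tests, is finding a path on the target that reassembles and processes the fragments as one payload.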
In contrast, on the same OpenBSD bug, Qwen3 32B declared the code "robust to such scenarios," and Vidoc Security noted that Claude Opus 4.6 reproduced the vulnerability in three out of three runs, whereas GPT-5.4 failed every time. This disparity illustrates what researchers call "the jagged frontier"—a capability boundary that is highly uneven and model-specific.
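The specifics of the OpenBSD overflow are not public beyond the description above, but the general failure mode, a fixed-width counter wrapping past its maximum so that a bounds check passes, can be shown generically. This Python sketch is illustrative only: the 32-bit width, the `u32_add` helper, and the specific values are assumptions, not details drawn from the CVE.

```python
UINT32_MAX = 0xFFFFFFFF

def u32_add(a: int, b: int) -> int:
    """Add two values as a 32-bit unsigned integer would: the result wraps."""
    return (a + b) & UINT32_MAX

# Hypothetical bounds check of the form `offset + length < buffer_size`.
# After the wrap, the check passes even though the true sum is enormous.
offset, length, buffer_size = UINT32_MAX - 8, 64, 4096
wrapped = u32_add(offset, length)
assert wrapped < buffer_size           # the check passes...
assert offset + length > buffer_size   # ...but the real sum overruns the buffer
```

Reasoning about bugs of this class requires tracking exact integer widths and wrap points through the code, which is the kind of precise mathematical bookkeeping that, per the results above, only some models handle reliably.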
The Fragmentation of AI Security Expertise
The findings dismantle the idea of a single, monolithic "best" model for cybersecurity. The performance metrics reveal that model effectiveness is acutely dependent on the specific type of vulnerability being analyzed.
For instance, the initial memory bug in FreeBSD proved broadly tractable, successfully tackled by multiple models. The OpenBSD bug, however, required specialized knowledge of low-level memory management and mathematical logic. The fact that different models exhibit vastly different success rates on the same code sample, sometimes succeeding and sometimes failing, underscores that AI security expertise is not a uniform commodity.
This suggests that the current state of the art is not defined by sheer parameter count or proprietary access, but by the model's ability to specialize. The open-source ecosystem, by forcing replication and comparison, is rapidly mapping out these specific, high-utility capabilities, proving that smaller, focused models can be highly effective when paired with targeted, expert prompting and rigorous testing.