In April 2026, Anthropic announced that Claude Mythos Preview, its most capable AI model to date, would not be made publicly available. The model autonomously discovered thousands of previously unknown software vulnerabilities, including flaws that had gone undetected for decades, and converted many of them into working exploits without human direction. Rather than shelving it, Anthropic channeled it into a restricted defensive-security consortium called Project Glasswing.
What Is Claude Mythos?
If you’ve been following the Claude model line, you know the cadence: Haiku for speed, Sonnet for balance, Opus for heavy reasoning. Mythos doesn’t slot neatly into that lineup. It’s described by Anthropic as a general-purpose frontier model whose capabilities in computer security “emerged as a downstream consequence of general improvements in code, reasoning, and autonomy” rather than from deliberate security-specific training. That framing matters — it means the vulnerability-finding isn’t a special mode or plugin. It’s just what a sufficiently capable general reasoner does when pointed at code.
Specific Vulnerability Findings During Pre-Release Testing
- A 27-year-old flaw in OpenBSD’s TCP SACK implementation enabling remote denial-of-service attacks.
- A 16-year-old vulnerability in the FFmpeg H.264 codec.
- CVE-2026-4747, a FreeBSD NFS remote code execution bug achieving unauthenticated root access, developed autonomously in a multi-hour agentic session.
- Multiple Linux kernel privilege escalation chains requiring the model to find and exploit several vulnerabilities in sequence.
- Thousands of zero-day vulnerabilities across every major operating system and browser.
Why Anthropic Says It’s Too Dangerous
Anthropic’s Responsible Scaling Policy (RSP) — now in its third revision — creates a tiered framework of AI Safety Levels (ASL). ASL-2 covers today’s frontier models. ASL-3 is triggered when a model provides “meaningful uplift” to actors seeking to conduct large-scale cyberattacks or to create weapons capable of mass casualties. Mythos Preview crossed into ASL-3 territory on the cybersecurity dimension, according to Anthropic’s public risk documentation.
The company draws a pointed distinction between capability and uplift. Plenty of models can help an experienced security researcher write a proof-of-concept. What Mythos does differently is lower the skill floor dramatically: it can take someone with a modest technical background and give them the ability to develop a working exploit chain against hardened targets. The concern is that if the model reached actors targeting "systemically important" financial networks or critical infrastructure, the harm potential would be asymmetric: a single actor could cause damage that would previously have required a nation-state-level team.
| Metric | Opus 4.6 | Mythos Preview |
|---|---|---|
| SWE-bench Verified | ~66% (est.) | 93.9% |
| CyberGym Vulnerability Reproduction | 66.6% | 83.1% |
| Working Firefox exploits | ~2 of several hundred attempts | 181 of several hundred attempts |
| ASL Classification | ASL-2 | ASL-3 (cybersecurity) |
| Public availability | Generally available | Restricted (Glasswing only) |
Project Glasswing: The Limited-Release Compromise
Glasswing is Anthropic’s answer to a hard question: if you can’t release it, can you still use it for good? The initiative launched April 7, 2026, with twelve founding partners: Amazon Web Services, Anthropic itself, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Beyond those twelve, over 40 additional organizations received extended access.
On the financial side, Anthropic committed up to $100 million in Mythos usage credits for Glasswing participants, plus $2.5 million to Alpha-Omega and the OpenSSF and $1.5 million to the Apache Software Foundation. API pricing after the research preview is set at $25 per million input tokens and $125 per million output tokens.
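To make the post-preview pricing concrete, here is a minimal sketch of the per-request cost arithmetic. Only the $25/$125-per-million rates come from the announcement; the token counts in the example are hypothetical:

```python
# Announced post-preview rates: $25 per million input tokens,
# $125 per million output tokens.
INPUT_RATE = 25.0 / 1_000_000    # dollars per input token
OUTPUT_RATE = 125.0 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single API call at these rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical long agentic session: 2M input + 400K output tokens
# costs 2 * $25 + 0.4 * $125 = $100.
print(f"${request_cost(2_000_000, 400_000):.2f}")
```

At these rates, long multi-hour agentic sessions of the kind used for the FreeBSD exploit work would be dominated by input-token cost as context accumulates.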
What This Means Going Forward
The Mythos situation is, in some ways, the scenario AI governance researchers have been describing in white papers for years: a single frontier model crosses a capability threshold significant enough to warrant controlled deployment, and the lab has to improvise a governance structure in real time. A few things seem clear from watching it unfold.

