When AI becomes the cyber attacker: Mythos and what comes next

https://www.dataprotectionreport.com/2026/05/when-ai-becomes-the-cyber-attacker-mythos-and-what-comes-next/

Publish Date: 2026-05-21 19:48:00

Source Domain: www.dataprotectionreport.com

Anthropic’s April 7, 2026 announcement that it built a model too powerful for public consumption, Claude Mythos Preview (Mythos), marks a notable moment for the legal, compliance, and cybersecurity communities. It is no surprise that the US Department of the Treasury and the Federal Reserve convened an emergency meeting with major bank CEOs the day after this announcement and the IMF has called out that Mythos-like models pose serious financial stability risks.

Although access to Mythos itself is currently restricted to approximately 40 hand-picked organizations under an initiative called Project Glasswing, equivalent capability is estimated to emerge in the broader market, potentially in adversarial hands, within 6 to 24 months.

What Anthropic Disclosed

Mythos is a general-purpose large language model that was not designed as a cybersecurity tool. Its offensive capabilities were not purposefully engineered but rather emerged as a byproduct of general improvements in reasoning, code generation, and autonomous task execution. That distinction is essential as it implies that every frontier AI laboratory currently pursuing the same general improvements is potentially on a path to the same destination.

According to Anthropic’s own published technical analysis, Mythos can autonomously identify “zero-day vulnerabilities in every major operating system and every major web browser.” The model is then able to develop working exploits for those vulnerabilities without human assistance. In benchmark testing, Anthropic’s prior frontier model produced working exploits only twice in several hundred attempts on a standardized task, where Mythos produced 181 working exploits. In broader testing across thousands of open source software targets, Mythos distinguished itself from prior frontier models by achieving full system compromise on ten fully-patched targets.

Anthropic also disclosed three notable behavioral findings…

Source