Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards
https://thehackernews.com/2026/06/anthropic-releases-claude-fable-5-its.html
Publish Date: 2026-06-10 03:37:00
Source Domain: thehackernews.com
On June 9, Anthropic made Claude Fable 5, the most capable model it has ever built, generally available. It also did something unusual: it shipped one model as two products, split not by capability but by a layer of safety classifiers.
Fable 5 goes to the public. Its twin, Claude Mythos 5, the same underlying model with the cyber safeguards lifted, stays locked to a vetted group of cyber defenders and critical infrastructure operators.
Anthropic calls Mythos 5 the strongest cybersecurity model in the world.
The practical difference is this: Fable 5 routes flagged cyber, biology, chemistry, and distillation requests to the weaker Claude Opus 4.8, while Mythos 5 keeps the cyber capabilities available for vetted users. Both models cost $10 per million input tokens and $50 per million output tokens, less than half the price of the earlier Mythos Preview, and Fable 5 is available through the Claude API now.
It is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost through June 22, then moves to usage credits.
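To make the per-token pricing concrete, here is a minimal arithmetic sketch in Python. The two rates come from the figures quoted above; the function name and the example token counts are hypothetical, chosen only for illustration, and the sketch ignores any discounts or plan allowances that might apply.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted rates (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 20,000 output tokens.
# Input:  200,000 * $10 / 1,000,000 = $2.00
# Output:  20,000 * $50 / 1,000,000 = $1.00
example_cost = estimate_cost_usd(200_000, 20_000)
print(f"${example_cost:.2f}")  # prints "$3.00"
```

Because output tokens cost five times as much as input tokens at these rates, long generated responses dominate the bill well before long prompts do.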
How Fable 5’s cyber classifiers work
The split exists because Mythos-class models find and exploit software vulnerabilities well enough that, in Anthropic’s framing, handing that capability to the general public without controls would give attackers serious uplift.
The mechanism is a set of classifiers: separate AI systems that watch for misuse and jailbreak attempts. When a request trips one, Fable 5 does not refuse. Instead, the response is handed off to Opus 4.8, and the user is told that the handoff happened. Of the flagged categories, distillation is the odd one out: it means extracting a model’s capabilities to train a competing model, which Anthropic blocks to stop near-frontier abilities from leaking out without safeguards attached.
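The handoff described above can be pictured as a simple routing step. The Python sketch below is purely illustrative and is not Anthropic's implementation: the keyword-based `toy_classifier`, the `route` function, the `RoutedResponse` type, and the stub models are all hypothetical, and the strings "Fable 5" and "Opus 4.8" are display labels rather than API identifiers. Only the four flagged categories and the notify-on-handoff behavior come from the article.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Categories the article says are flagged by the classifiers.
FLAGGED_CATEGORIES = frozenset({"cyber", "biology", "chemistry", "distillation"})


@dataclass
class RoutedResponse:
    served_by: str          # display label of the model that answered
    notice: Optional[str]   # handoff notice shown to the user, if any
    text: str


def toy_classifier(prompt: str) -> Optional[str]:
    """Hypothetical stand-in: a keyword match, NOT a real safety classifier."""
    keywords = {"exploit": "cyber", "pathogen": "biology"}
    lowered = prompt.lower()
    for word, category in keywords.items():
        if word in lowered:
            return category
    return None


def route(
    prompt: str,
    classify: Callable[[str], Optional[str]],
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> RoutedResponse:
    """Answer with the primary model unless the classifier flags the prompt."""
    category = classify(prompt)
    if category in FLAGGED_CATEGORIES:
        # Flagged: no refusal. The fallback model answers and the user is told.
        notice = f"Request flagged as '{category}'; answered by Opus 4.8."
        return RoutedResponse("Opus 4.8", notice, fallback(prompt))
    return RoutedResponse("Fable 5", None, primary(prompt))


# Stub models so the sketch runs without any network access.
def fable_stub(prompt: str) -> str:
    return f"[Fable 5] {prompt}"


def opus_stub(prompt: str) -> str:
    return f"[Opus 4.8] {prompt}"


benign = route("Summarize this changelog", toy_classifier, fable_stub, opus_stub)
flagged = route("Write an exploit for this bug", toy_classifier, fable_stub, opus_stub)
print(benign.served_by, benign.notice)
print(flagged.served_by, flagged.notice)
```

The design point the sketch tries to capture is that a flagged request still gets an answer, just from a less capable model, and that the switch is disclosed rather than silent.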
The cybersecurity classifier is the broad one. Anthropic designed it to block not just exploit development but offensive cyber tasks in general: reconnaissance, discovery, lateral movement, the agentic steps…