Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards

https://thehackernews.com/2026/06/anthropic-releases-claude-fable-5-its.html

Publish Date: 2026-06-10 03:37:00

Source Domain: thehackernews.com

On June 9, Anthropic made Claude Fable 5, the most capable model it has ever built, generally available. It also did something unusual: it shipped one model as two products, split not by capability but by a layer of safety classifiers.

Fable 5 goes to the public. Its twin, Claude Mythos 5, the same underlying model with the cyber safeguards lifted, stays locked to a vetted group of cyber defenders and critical infrastructure operators.

Anthropic calls Mythos 5 the strongest cybersecurity model in the world.

The practical difference is this: Fable 5 routes flagged cyber, biology, chemistry, and distillation requests to the weaker Claude Opus 4.8, while Mythos 5 keeps the cyber capabilities available for vetted users. Both models cost $10 per million input tokens and $50 per million output tokens, less than half the price of the earlier Mythos Preview, and Fable 5 is available through the Claude API now.

It is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost through June 22, then moves to usage credits.
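To make the API pricing concrete, here is a minimal sketch of the per-request arithmetic, using only the two prices quoted above. The function name `estimate_cost_usd` and the constants are illustrative assumptions, not part of any Anthropic SDK, and the sketch ignores any discounts or plan allowances, which the article does not describe.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = Decimal("10")
OUTPUT_USD_PER_MILLION = Decimal("50")
TOKENS_PER_MILLION = Decimal("1000000")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # 200,000 input tokens and 40,000 output tokens:
    # 0.2 * $10 + 0.04 * $50 = $2.00 + $2.00 = $4.00
    print(estimate_cost_usd(200_000, 40_000))
```

Because output tokens cost five times as much as input tokens, a long generated answer can dominate the bill even when the prompt is much larger.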

How Fable 5’s cyber classifiers work

The split exists because Mythos-class models find and exploit software vulnerabilities well enough that, in Anthropic’s framing, handing that capability to the general public without controls would give attackers serious uplift.

The mechanism is a set of classifiers: separate AI systems that watch for misuse and jailbreak attempts. When a request trips one, Fable 5 does not refuse. Instead, the request is handed to Opus 4.8, which answers in its place, and the user is told the handoff happened. Of the flagged categories, distillation is the odd one out: it means extracting a model’s capabilities to train a competing model, which Anthropic blocks to stop near-frontier abilities from leaking out without safeguards attached.
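The handoff described above can be sketched as a small routing function. This is an illustration of the control flow only, not Anthropic's implementation: the model labels, the `route` function, the keyword-based `toy_classify`, and the `toy_generate` stub are all hypothetical stand-ins, and the real classifiers are separate AI systems rather than keyword lists.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Categories the article says are flagged.
FLAGGED_CATEGORIES = ("cyber", "biology", "chemistry", "distillation")

PRIMARY_MODEL = "fable-5"    # hypothetical label, not a real API model ID
FALLBACK_MODEL = "opus-4.8"  # hypothetical label, not a real API model ID


@dataclass
class RoutedReply:
    model: str                     # which model produced the text
    text: str                      # the reply itself
    handoff_notice: Optional[str]  # set only when a handoff happened


def route(
    prompt: str,
    classify: Callable[[str], Optional[str]],
    generate: Callable[[str, str], str],
) -> RoutedReply:
    """Answer with the primary model unless a classifier flags the prompt."""
    category = classify(prompt)
    if category is None:
        return RoutedReply(PRIMARY_MODEL, generate(PRIMARY_MODEL, prompt), None)
    if category not in FLAGGED_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Flagged: do not refuse; hand off to the fallback and tell the user.
    notice = f"Flagged as {category}; answered by {FALLBACK_MODEL}."
    return RoutedReply(FALLBACK_MODEL, generate(FALLBACK_MODEL, prompt), notice)


def toy_classify(prompt: str) -> Optional[str]:
    """Keyword stub standing in for a real classifier (assumption)."""
    keywords = {"exploit": "cyber", "distill": "distillation"}
    lowered = prompt.lower()
    for keyword, category in keywords.items():
        if keyword in lowered:
            return category
    return None


def toy_generate(model: str, prompt: str) -> str:
    """Stub standing in for a model call (assumption)."""
    return f"[{model}] reply to: {prompt}"
```

Calling `route("Summarize this changelog", toy_classify, toy_generate)` returns a reply from the primary label with no notice, while a prompt containing "exploit" comes back from the fallback label with a notice attached. The design point is that the flagged path still produces an answer; it just comes from a different model and says so.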

The cybersecurity classifier is the broad one. Anthropic designed it to block not just exploit development but offensive cyber tasks in general: reconnaissance, discovery, lateral movement, the agentic steps…
