Anthropic restores Fable 5 and proposes a jailbreak severity framework

Anthropic says US export controls on Claude Fable 5 and Claude Mythos 5 have been lifted, clearing the way for Fable 5 to return globally on July 1. The model will be available on the Claude Platform, Claude.ai, Claude Code, and Claude Cowork, with cloud surfaces such as AWS, Google Cloud, and Microsoft Foundry to be re-enabled as quickly as possible.

The access change resolves the most visible problem from the June 12 suspension. But the more important part of Anthropic’s post is the framework it wants the industry to build around AI jailbreaks.

Anthropic says it is working with Amazon, Microsoft, Google, and other Glasswing partners on a common way to score jailbreak severity. The proposed criteria are capability gain, breadth of capability gain, ease of weaponization, and discoverability. That is a shift from treating jailbreaks as a binary pass/fail event to treating them more like security vulnerabilities with severity classes.

The restoration is uneven by design

Fable 5 and Mythos 5 are not returning in the same way.

Anthropic says Fable 5 will be available globally. Through July 7, it will be included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans; after that, it will be available through usage credits. On Enterprise plans, only premium seats get the included allowance; Anthropic says standard seats need usage credits enabled for Fable 5 to work.

Mythos 5 remains more restricted. Anthropic says access has been restored for a set of US organizations after US government approval on June 26, and that the company is coordinating with the government to expand access to more domestic and international Glasswing partners.

That distinction is the story. Anthropic is presenting Fable as the general model with heavier safeguards and Mythos as the more sensitive cybersecurity-capable system. Restoring access does not mean the same governance model applies to both.

The classifier fix has a cost

The incident began after Amazon researchers reported a way to bypass Fable 5’s safeguards so that the model would identify software vulnerabilities. Anthropic says one case produced code showing how the vulnerability could be exploited.

Anthropic’s post says its testing found the reported behavior did not expose unique Mythos-level cyber capabilities. It also says many less capable models could identify the same vulnerabilities, and that several models could produce the same demonstration. That is Anthropic’s own analysis, but it matters because it frames the issue as a borderline safeguard case rather than a release of novel offensive capability.

The immediate mitigation is a new safety classifier. Anthropic says the classifier blocks the specific technique described in the Amazon report in over 99% of cases. The caveat is practical: the company also says the new classifier will flag benign routine coding and debugging requests more often.

That is the trade-off users will feel. A stronger cyber safety margin can protect against misuse, but it can also create false positives for legitimate security and debugging work. If Fable 5 is going back into Claude Code and enterprise workflows, Anthropic will have to tune that boundary without making ordinary developers feel like routine analysis is randomly unavailable.
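One way to make that trade-off concrete is to track two numbers side by side: how often the classifier blocks real attack attempts, and how often it blocks benign requests. The sketch below is illustrative only. The function name, the input format, and the sample log are assumptions made for this example; none of them come from Anthropic's post, and the sample values are invented.

```python
from typing import Iterable, Optional


def classifier_rates(
    results: Iterable[tuple[bool, bool]],
) -> dict[str, Optional[float]]:
    """Summarize a labeled evaluation log.

    Each item is (is_attack, was_blocked). Returns the share of attack
    attempts that were blocked and the share of benign requests that
    were blocked (false positives). A rate is None when its
    denominator is zero.
    """
    attacks = blocked_attacks = benign = blocked_benign = 0
    for is_attack, was_blocked in results:
        if is_attack:
            attacks += 1
            blocked_attacks += int(was_blocked)
        else:
            benign += 1
            blocked_benign += int(was_blocked)
    return {
        "block_rate": blocked_attacks / attacks if attacks else None,
        "false_positive_rate": blocked_benign / benign if benign else None,
    }


# Invented sample log for illustration: 2 attack attempts, 4 benign requests.
sample = [
    (True, True),
    (True, True),
    (False, False),
    (False, True),   # a benign debugging request that was flagged
    (False, False),
    (False, False),
]
print(classifier_rates(sample))
```

A high block rate on its own says little about daily friction. Teams re-testing Fable 5 would want the second number measured against their own benign coding and debugging prompts.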

Jailbreak scoring is the bigger precedent

The proposed framework is more consequential than one classifier patch. Anthropic says the industry lacks a consensus way to describe jailbreak severity in objective terms. That creates uncertainty for labs, governments, and customers when a new technique appears.

The four criteria are sensible because they separate different risks. A jailbreak that unlocks a capability already available through common tools is not the same as one that gives non-experts access to expert-level offensive behavior. A narrow technique is not the same as a universal one. A finding that requires many retries and specialist prompting is not the same as a one-shot method spreading online.

That kind of scoring would not remove judgment. It would make the judgment more visible. Labs could triage findings consistently. Governments could avoid overreacting to low-severity reports or underreacting to serious ones. Customers could ask better questions about what a reported jailbreak actually enabled.

The analogy is not perfect, but the security world already has a model for this: vulnerability severity is not only about whether a bug exists. It is about exploitability, impact, preconditions, and real-world exposure. AI jailbreaks need a similar vocabulary if advanced models are going to be reviewed, patched, and released under pressure.
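To show what such a vocabulary might look like in practice, here is a minimal sketch of a finding scored on the four proposed criteria. Anthropic's post names the criteria but, as described here, does not publish scales or a formula. The 0 to 3 scale, the class name, the thresholds, and the rule that zero capability gain caps severity at "low" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical scale: each criterion is rated 0 (none) to 3 (severe).
SCALE = range(4)


@dataclass(frozen=True)
class JailbreakFinding:
    capability_gain: int        # uplift beyond what common tools already offer
    breadth: int                # how many capability areas the technique unlocks
    ease_of_weaponization: int  # 3 = one-shot; 0 = many retries, specialist prompting
    discoverability: int        # 3 = already spreading publicly

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 0-3 scale.
        for name, value in vars(self).items():
            if value not in SCALE:
                raise ValueError(f"{name} must be between 0 and 3, got {value}")

    def severity(self) -> str:
        # Assumed rule: with no capability gain, the other axes do not matter.
        if self.capability_gain == 0:
            return "low"
        total = sum(vars(self).values())
        # Assumed thresholds on the 0-12 total; chosen only for illustration.
        if total >= 9:
            return "critical"
        if total >= 6:
            return "high"
        if total >= 3:
            return "medium"
        return "low"


# A technique that is easy and public but unlocks nothing new.
print(JailbreakFinding(0, 3, 3, 3).severity())
# A moderate-uplift technique that takes some effort to reproduce.
print(JailbreakFinding(2, 1, 2, 1).severity())
```

The point of the structure is that each axis is recorded separately, so a reviewer can see why a finding landed in a given class rather than only the final label.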

Risk note

Ad hoc review makes model access unpredictable

Anthropic says it will deepen collaboration with US government partners on pre-release testing, rapid information sharing, dedicated joint research, and a common voluntary security bar. The company also says government involvement in AI releases needs a durable and transparent process applied equally across frontier model developers.

That last point is the policy tension. If government review is ad hoc, model access can become unpredictable for customers and uneven for competitors. If review is structured, labs may get a clearer release path but also a higher compliance burden.

Fable 5’s return is therefore not a clean reset. It is a sign of the next operating model for powerful AI systems: access, safeguards, external testing, government scrutiny, and partner-specific permissions all moving together.

For developers and enterprises, the practical next step is to test Fable 5 again with the new classifier in place. The question is not only whether the model performs well. It is whether the restored model can handle real coding and security workflows without turning safety margin into daily friction.
