Hands-on with Claude Fable 5: Anthropic's Mythos model
Anthropic shipped Claude Fable 5 on June 9, and I’ve spent the days since pointing it at real code instead of reading the launch posts. Fable 5 is the company’s most powerful generally available model, a new tier that sits above Opus, and it’s the first public release of what Anthropic calls a Mythos-class model. This is my hands-on account of what it actually does differently, the API quirks you’ll hit on day one, and whether the price is worth it.
What Fable 5 actually is — and how Mythos fits in
The naming trips people up, so let’s be precise. Mythos-class models are a tier of Claude models that sit above the Opus class in capability. There are two models in this announcement, not one: Claude Fable 5 and Claude Mythos 5. Both follow Claude Mythos Preview, which was released in April through Project Glasswing.
The split matters. The safeguards are what distinguish the two models, Fable and Mythos, and are why Anthropic gave them different names. Mythos 5 itself is not generally available — Anthropic is deploying Mythos 5 to organizations that have already been approved to access the advanced model. Fable 5 is effectively Mythos made safe for the public. Anthropic said Fable 5’s broad release is possible because of new safeguards that block responses in specific high-risk areas, including cybersecurity and biology.
In practice, those safeguards route a small share of sessions (under 5% in my experience) away from Fable and over to Opus. In high-risk areas like cybersecurity, biology, chemistry, and distillation, the model blocks its response and falls back to Claude Opus 4.8 to deliver a safe answer. A request for how to make ricin, a toxin, is one example. For normal engineering work you’ll rarely notice it.
Why release a watered-down Mythos at all? Because the raw capability is genuinely high-risk. The frontier cybersecurity and research biology capabilities of Mythos-class models mean they pose a substantial risk of uplift to malicious actors.
How it stacks up against Opus 4.8, Sonnet, and Haiku
Fable 5 is not a free upgrade. It costs $10 / $50 per million tokens (input/output) — exactly double Opus 4.8 at $5 / $25. Both share the same 1M-token context window and 128K max output, so you’re paying purely for reasoning quality, not headroom.
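To make the 2x concrete, here is a minimal sketch of the per-request arithmetic using only the prices quoted above. The token counts are made up for illustration, and the Opus key is just a dictionary label, not a confirmed model ID.

```python
# Per-million-token prices quoted in this article (USD): (input, output).
PRICES = {
    "claude-fable-5": (10.00, 50.00),
    "opus-4.8": (5.00, 25.00),  # label only; not a confirmed API model ID
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Illustrative audit: 200K tokens of code in, 20K tokens of findings out.
fable = request_cost("claude-fable-5", 200_000, 20_000)
opus = request_cost("opus-4.8", 200_000, 20_000)
print(f"Fable 5: ${fable:.2f}  Opus 4.8: ${opus:.2f}  ratio: {fable / opus:.1f}x")
```

Because both the input and the output price double, the ratio stays at 2x regardless of how a job splits between reading and writing.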
Is the quality jump real? Anthropic’s own numbers say yes, at least on paper. On some benchmarks, Fable 5 scored more than 10% higher than Claude Opus 4.8. The more interesting claim is autonomy. Fable 5 and Mythos 5 can work autonomously for longer than any previous Claude models. Anthropic leans on a Stripe anecdote here: during early testing, Stripe reported that Fable 5 compressed months of engineering into days — in a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a team over two months by hand.
That said, for most day-to-day work the math doesn’t favor Fable. Sonnet 4.6 and Haiku 4.5 still win on cost-per-task for the bulk of what developers do — boilerplate, refactors, test generation, chat features. Reach for Fable only when the problem is genuinely hard and long-horizon: large-codebase audits, multi-step migrations, or analysis where a missed edge case costs real money. If you haven’t tuned your prompts for this kind of model yet, my notes on prompt engineering fundamentals carry over directly.
Day-one API notes: the request surface and one new gotcha
The good news: if you’ve already integrated Opus 4.7 or 4.8, Fable 5 is a drop-in replacement. The model ID is claude-fable-5, and the request surface is the same: adaptive thinking only, no temperature, top_p, or top_k, and no assistant prefills.
There is one new trap. On Fable 5, sending an explicit thinking: {type: "disabled"} returns a 400. You can’t turn thinking off — so omit the parameter entirely instead of trying to disable it.
from anthropic import Anthropic

client = Anthropic()

# Correct: no thinking param at all
resp = client.messages.create(
    model="claude-fable-5",
    max_tokens=8000,
    messages=[{"role": "user", "content": "Audit this module for race conditions."}],
)

# Wrong: this returns a 400 on Fable 5
# thinking={"type": "disabled"}
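If you are porting request code that was written for an older model, one option is to scrub the arguments before they reach the client. The helper below is a hypothetical sketch of my own, not part of the SDK; it encodes only the restrictions described in this section (no sampling parameters, no explicit disabled thinking, no assistant prefill) and makes no network calls.

```python
# Sampling parameters this article says are not part of the Fable 5 surface.
UNSUPPORTED = ("temperature", "top_p", "top_k")


def fable_safe_kwargs(kwargs: dict) -> dict:
    """Return a copy of request kwargs adjusted per the notes above.

    Hypothetical helper: the rules come from this article, not from the SDK.
    """
    cleaned = {k: v for k, v in kwargs.items() if k not in UNSUPPORTED}

    # Omit an explicit "disabled" thinking setting rather than sending it.
    thinking = cleaned.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "disabled":
        del cleaned["thinking"]

    # No assistant prefill: refuse a conversation that ends on an assistant turn.
    messages = cleaned.get("messages", [])
    if messages and messages[-1].get("role") == "assistant":
        raise ValueError("assistant prefill is not supported on this model")

    return cleaned


legacy = {
    "model": "claude-fable-5",
    "max_tokens": 8000,
    "temperature": 0.2,
    "thinking": {"type": "disabled"},
    "messages": [{"role": "user", "content": "Audit this module."}],
}
print(sorted(fable_safe_kwargs(legacy)))
```

The cleaned dictionary can then be passed as client.messages.create(**fable_safe_kwargs(legacy)). Raising on a prefill, rather than silently dropping it, is a deliberate choice: removing the last message would change what the prompt means.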
If you’re wiring this into tools or agents, the rest of the surface behaves like Opus — see my Claude tool use function-calling guide and the getting started with the Claude API in Python walkthrough for the boilerplate.
One scheduling note worth acting on: access is time-boxed. Through June 22, Fable 5 will be included in Pro, Max, Team, and seat-based Enterprise plans at no extra cost. On June 23, Anthropic will pull Fable 5 from those plans, requiring usage credits going forward, with plans to restore it as a standard subscription feature as soon as possible. If you want to evaluate it without burning credits, that window closes fast.
The real test: I let Fable 5 audit my trading tools
Benchmarks are noise to me until a model finds something I missed. So I ran Fable 5 through Claude Code against a set of personal trading scripts — small Python tools that size positions and place orders against a broker API. I’d had Opus-tier models audit the same code before. They came back with style notes and a couple of try/except suggestions. Fable 5 came back with bugs that could have lost real money.
What stood out was how it reasoned. Instead of linting file by file, it built a mental model of the order lifecycle first — where price comes in, where a decision is made, where an order is sent — and then hunted for places where those steps could interleave badly. It flagged a race condition between price updates and order placement: a fast-moving quote could update the reference price after my size calculation but before the order hit the wire, so the order would go out against stale numbers.
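The stale-price pattern is easier to see in code. This is a simplified sketch of the general shape, not my actual trading code: every class and function name is hypothetical, and the broker call is a stand-in callback. The idea is to size against a versioned snapshot and refuse to submit if the quote has moved since.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    price: float
    version: int  # incremented on every update


class QuoteBook:
    """Holds the latest quote; updates may arrive from a feed thread."""

    def __init__(self, price: float):
        self._lock = threading.Lock()
        self._quote = Quote(price, 0)

    def update(self, price: float) -> None:
        with self._lock:
            self._quote = Quote(price, self._quote.version + 1)

    def snapshot(self) -> Quote:
        with self._lock:
            return self._quote


def size_order(book: QuoteBook, budget: float):
    """Compute a size against one immutable snapshot of the quote."""
    quote = book.snapshot()
    return quote, int(budget // quote.price)


def submit_if_fresh(book: QuoteBook, quote: Quote, size: int, send) -> bool:
    """Send only if the quote used for sizing is still the current one."""
    if book.snapshot().version != quote.version:
        return False  # stale: the caller should re-size and retry
    # Pass the snapshot price along so size and price stay consistent.
    send(size, quote.price)
    return True


book = QuoteBook(100.0)
sent = []
quote, size = size_order(book, 1_000.0)
book.update(125.0)  # the quote moves between sizing and submission
ok = submit_if_fresh(book, quote, size, lambda s, p: sent.append((s, p)))
print(ok, sent)
```

The version check narrows the window but does not close it, since a quote can still change between the check and the send. Submitting the snapshot price as a limit price is what keeps the order tied to the numbers that were used for sizing.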
It also caught edge cases earlier audits glossed over:
- Position sizing: a rounding path that, at small account balances, could compute a size of zero and silently skip a trade, or round up past my risk cap.
- Order handling: an unvalidated branch where a partially-filled order’s remaining quantity was reused without re-checking buying power — a path that could have placed an oversized follow-up order.
- Input validation: a code path where a malformed API response could flow straight into an order request without a sanity check, i.e. a bad trade with real money.
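The sizing and validation findings translate into fairly small guards. The sketch below assumes whole-unit sizes, a risk cap expressed as a fraction of balance, and a broker payload with a "price" field; all of those, along with the function names, are illustrative assumptions and not my production code.

```python
from decimal import Decimal, InvalidOperation, ROUND_DOWN


class SizingError(ValueError):
    """Raised instead of silently skipping or oversizing a trade."""


def position_size(balance: Decimal, risk_fraction: Decimal, price: Decimal) -> int:
    """Whole-unit size that never exceeds the risk cap and is never a silent zero."""
    if balance <= 0 or price <= 0 or not (0 < risk_fraction <= 1):
        raise SizingError("balance and price must be positive, risk_fraction in (0, 1]")
    cap = balance * risk_fraction
    # Always round down, so rounding can never push the size past the cap.
    size = int((cap / price).to_integral_value(rounding=ROUND_DOWN))
    if size == 0:
        raise SizingError(f"risk cap {cap} cannot buy one unit at {price}")
    return size


def parse_price(payload: dict) -> Decimal:
    """Validate a (hypothetical) broker response before it can reach an order."""
    raw = payload.get("price")
    if isinstance(raw, bool) or not isinstance(raw, (int, float, str)):
        raise ValueError(f"missing or non-numeric price: {raw!r}")
    try:
        price = Decimal(str(raw))
    except InvalidOperation:
        raise ValueError(f"unparseable price: {raw!r}") from None
    # Reject NaN, infinities, and non-positive values explicitly.
    if not price.is_finite() or price <= 0:
        raise ValueError(f"price out of range: {raw!r}")
    return price


price = parse_price({"price": "7"})
print(position_size(Decimal("1000"), Decimal("0.02"), price))
```

Raising on a zero size turns the silent skip into something the caller has to handle, and rounding down makes the cap a hard ceiling. The same re-check belongs on the partial-fill path: recompute the size from current buying power before sending a follow-up order.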
None of these were exotic. They were the kind of long-tail bugs you only find when something reads the whole path and reasons about ordering and state, not just syntax. That lines up with the autonomy claims — the value isn’t raw IQ, it’s that it holds the entire control flow in its head long enough to spot where two correct-looking pieces combine into a wrong one.
If you want this kind of review on a schedule rather than ad hoc, I’ve written about wiring automated AI code review into GitHub pull requests — pointing a Fable-tier model at a focused diff is a reasonable upgrade for high-stakes repos.
Verdict: when 2x the price is worth it
After a week, here’s where I land:
- Use Fable 5 when a mistake is expensive. Audits of money-touching code, large migrations, and long-horizon refactors are exactly where the extra reasoning earns back the $10/$50 cost. It found bugs Opus-tier models missed in the same files.
- Stick with Opus 4.8, Sonnet 4.6, or Haiku 4.5 for everyday work. For most coding, chat, and content tasks, doubling the token cost buys you nothing you’ll notice. The cheaper tiers are the right default.
- Try it before June 23. It’s included at no extra cost in Pro, Max, Team, and seat-based Enterprise plans only through June 22. After that you’re paying credits, so this is the cheap window to find out if your hardest problems actually need it.
I went in sceptical of the “most powerful model” framing. On general work, I still am. But on a real audit where I had ground truth, Fable 5 earned its keep — and that’s the only benchmark I trust.