How much does Claude Fable 5 cost?

Claude Fable 5 is priced at $10 per million input tokens and $50 per million output tokens — the premium tier. That's roughly double Opus 4.8's $5/$25 and over three times Sonnet 5's $3/$15. It uses the same tokenizer as Opus 4.8, so the cost difference between those two is purely the per-token price, with no hidden token-count inflation.

Is Claude Fable 5 better than Opus 4.8?

Fable 5 is the more capable model — it's the top of the ladder — and it pulls ahead most on the hardest long-horizon autonomous work and first-shot builds. But Opus 4.8 delivers state-of-the-art capability at half the price, so it's the better choice for the large majority of hard tasks. Default to Opus 4.8 (or Sonnet 5) and escalate to Fable 5 only when a specific task justifies the premium.

What is Claude Fable 5 best at?

Long-horizon autonomous agentic runs, first-shot implementations of well-specified systems, end-to-end enterprise deliverables like financial models and documents, vision on dense or degraded images, coordinating async sub-agents, and file-based memory across sessions. It's built to be pointed at a hard outcome and left to run largely unattended.

How is Claude Fable 5 different from Claude Mythos 5?

They're the same model with different access. Claude Mythos 5 (claude-mythos-5) is available only to participants in Anthropic's Project Glasswing and has identical capabilities, pricing, and API behavior. Claude Fable 5 (claude-fable-5) is the generally available version everyone else calls. If you're not in Project Glasswing, Fable 5 is your model.

Why does my Claude Fable 5 request return an error even though it looks valid?

Two common causes. First, thinking is always on: sending thinking type 'disabled' or a budget_tokens value returns a 400 — use the effort parameter instead. Second, Fable 5 requires 30-day data retention, so organizations configured for zero data retention get a 400 on every request. Also note that a safety refusal returns a successful HTTP 200 with stop_reason 'refusal', not an error — check stop_reason before reading the content.

Claude Fable 5: The Deep-Dive Guide — What It Does, What It Costs, and When It's Worth the Premium (2026)

Claude Fable 5 is the model Anthropic points to when the answer to “can an AI actually do this?” needs to be yes. It’s the most capable model they’ve released widely — built for the work that used to be a research demo: multi-hour autonomous runs, first-shot builds of well-specified systems, and end-to-end deliverables a person would bill days for. In the API it’s the model id claude-fable-5.

It’s also the most expensive Claude you can call, it behaves differently from every Opus-tier model before it, and for most of what you do day to day it is the wrong choice. This is the honest deep-dive: what Fable 5 genuinely does better, what it costs once you do the arithmetic, the API quirks that will trip you up, and — the part the launch hype skips — the large set of jobs where you should reach for something cheaper.

CLAUDE FABLE 5 AT A GLANCE

$10 / $50 per 1M tokens: input / output, the premium tier
1M token context: the default, not just the max
128K max output: streaming required at that size
Always-on extended thinking: you can't turn it off

TL;DR

Claude Fable 5 is Anthropic’s most capable widely released model — tuned for the hardest reasoning and long-horizon agentic work, not for chat or high-volume throughput.
It’s premium-priced at $10 / $50 per million tokens — roughly 2× Opus 4.8 and over 3× Sonnet 5. The capability ceiling is real; so is the bill.
It behaves differently from Opus-tier models. Thinking is always on (you can’t disable it), the raw chain of thought is never returned, and safety classifiers can decline a request with a refusal stop reason you have to handle in code.
It got broadly available the slow way — through an invitation-only preview and a restricted access program before general release, not a big-bang launch.
The play: keep Opus 4.8 or Sonnet 5 as your default, and escalate a specific workload to Fable 5 only when its ceiling — long-horizon autonomy, first-shot correctness on a genuinely hard task — is worth 2×+ the price.

What Is Claude Fable 5?

Claude Fable 5 is Anthropic’s flagship model for the most demanding reasoning and long-horizon agentic work — the most capable model they’ve released to the general public. It has a 1-million-token context window (which is both the maximum and the default), up to 128K output tokens per request, always-on extended thinking, and high-resolution vision. You call it in the API as claude-fable-5.

The one-sentence version: it’s the model you give your hardest, longest, least-supervised problem to. Where a mid-tier model shines on the tasks you’d assign a competent engineer, Fable 5 is built for the ones you’d assign a senior engineer a week to figure out — and it’s meant to run largely unattended while it does.

Three things define it in practice:

It’s built for autonomy, not turns. The headline is long-horizon execution: a single request on a hard task can run for many minutes while the model gathers context, builds, and verifies its own work. This is not a chat model with a bigger brain — it’s a model tuned to be pointed at an outcome and left alone.
It trades control for capability. Several knobs you’re used to are gone. Thinking is always on. Sampling parameters are rejected. There’s no assistant prefill. You steer it with prompting and an effort dial, not low-level parameters.
It’s the premium tier, and priced like it. At $10 / $50 per million tokens it sits above every Opus model. The interesting question was never “is it the most capable?” — it is. The question is “is this workload hard enough to justify it?”

💡 Key insight: Fable 5’s value isn’t spread evenly. On routine work it’s overkill you’re overpaying for. On the hardest long-horizon runs it does things no cheaper model reliably does. Knowing which bucket a task falls into is the whole skill.

How Claude Fable 5 Became Widely Available

Fable 5 didn’t arrive as a single splashy launch. The capability reached the public through a staged rollout that started tightly restricted and opened up over time — worth understanding, because the more exclusive tiers still exist alongside it.

STAGE 1 · INVITE-ONLY

Claude Mythos Preview

The capability first showed up as claude-mythos-preview, an invitation-only preview. You could not simply call it; access was gated to selected partners.

STAGE 2 · RESTRICTED PROGRAM

Claude Mythos 5 (Project Glasswing)

The preview was succeeded by claude-mythos-5, with the same capabilities, pricing, and API surface, but available only to participants in Anthropic's Project Glasswing. Still restricted, just less so.

STAGE 3 · GENERAL AVAILABILITY

Claude Fable 5

claude-fable-5 is the widely released GA model — the same class of capability, available to everyone through the standard API. This is the one you reach for unless you're in Project Glasswing.

The practical takeaway: Fable 5 and Mythos 5 are the same model with different front doors. Mythos 5 is what Project Glasswing participants use; Fable 5 is the generally available equivalent everyone else calls. If you’ve read about “Mythos” and wondered whether you’re missing a more powerful model — you’re not. You just reach it under a different id.

What Claude Fable 5 Can Actually Do

Fable 5’s gains show up most on work above what previous models could reliably finish. Evaluate it on the tasks a mid-tier model already handles and you’ll wonder why you paid the premium. Point it at the hard stuff and the difference is obvious.

WHERE FABLE 5 EARNS THE PREMIUM

AUTONOMY

Long-horizon agentic runs

The flagship use case. Fable 5 sustains multi-hour, largely unattended runs — planning, building, and self-verifying — where the failure mode of cheaper models is losing the thread halfway through.

FIRST-SHOT BUILDS

Well-specified systems, in one pass

Give it a complete brief up front and it implements end-to-end: infra, backend, frontend, tests, CI. It's the model behind the same-day microservice teardown on this blog.

ENTERPRISE WORK

Financial models, spreadsheets, slides, docs

Genuine end-to-end knowledge-work deliverables — the kind that combine analysis, structure, and formatting into a finished artifact rather than a draft you rebuild.

VISION

Dense and degraded images

High-resolution vision, and it's trained to reach for tools — cropping, running code against an image — on flipped, blurry, or noisy inputs instead of guessing.

DELEGATION

Parallel, async sub-agents

Reliably runs and coordinates long-lived sub-agents that communicate asynchronously, so the orchestrator isn't blocked on the slowest one and context persists across subtasks.

MEMORY

File-based memory across sessions

It's markedly better at writing learnings to a memory file and using them later. Give it a place to take notes and a format, and it compounds context across a long project.

If your problem lives on that grid — an overnight build, a hard multi-file migration, an end-to-end analysis deliverable, a long autonomous agent loop where correctness matters more than the token bill — Fable 5 is the model that clears the bar cheaper models keep tripping on. If it doesn’t, keep reading, because you’re probably about to overpay.
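The delegation card above describes an orchestrator that isn't blocked on its slowest sub-agent. As a rough illustration of that pattern in host code (not a description of how Fable 5 implements it internally), here is a minimal asyncio sketch. The sub-agent names and the sleep durations are made up; the sleeps stand in for real work.

```python
import asyncio


async def sub_agent(name: str, seconds: float) -> str:
    """Stand-in for a long-lived sub-agent; the sleep simulates its work."""
    await asyncio.sleep(seconds)
    return name


async def orchestrate() -> list[str]:
    """Start all sub-agents at once and handle each result as it arrives."""
    tasks = [
        asyncio.create_task(sub_agent("migrate-schema", 0.15)),
        asyncio.create_task(sub_agent("write-tests", 0.01)),
        asyncio.create_task(sub_agent("update-docs", 0.08)),
    ]
    finished: list[str] = []
    # as_completed yields in completion order, so fast sub-agents are
    # processed without waiting for the slowest one.
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)
    return finished


order = asyncio.run(orchestrate())
print(order)
```

The results come back in completion order rather than launch order, which is the property that matters when one subtask takes far longer than the rest.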

The API Mechanics Nobody Warns You About

This is where Fable 5 stops being “a better model” and starts being a different one. If you migrate an Opus-tier integration by swapping the model string, several of these will bite you. Every one of them is a documented behavior, not a bug.
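Because most of these are rules about the shape of the request, a small pre-flight check can flag them before a call is ever made. The sketch below is one way to do that, assuming your request is a plain dict. The function name is hypothetical, and the list of sampling fields (temperature, top_p, top_k) is my assumption about what "sampling parameters" covers.

```python
from typing import Any

# Assumption: these are the sampling parameters that get rejected.
SAMPLING_FIELDS = ("temperature", "top_p", "top_k")


def preflight_issues(request: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in a request dict (empty if none)."""
    issues: list[str] = []

    # Thinking is always on: disabling it or setting a token budget
    # is described in this article as returning a 400.
    thinking = request.get("thinking")
    if isinstance(thinking, dict):
        if thinking.get("type") == "disabled":
            issues.append("thinking cannot be disabled; remove the field")
        if "budget_tokens" in thinking:
            issues.append("budget_tokens is rejected; use output_config.effort")

    # Sampling parameters are rejected.
    for field in SAMPLING_FIELDS:
        if field in request:
            issues.append(f"sampling parameter '{field}' is rejected")

    # No assistant prefill: the conversation must not end on an assistant turn.
    messages = request.get("messages") or []
    if messages and messages[-1].get("role") == "assistant":
        issues.append("assistant prefill is not supported")

    return issues


# A request shaped like an older integration, for illustration.
legacy_request = {
    "model": "claude-fable-5",
    "max_tokens": 16000,
    "temperature": 0.2,
    "thinking": {"type": "enabled", "budget_tokens": 8000},
    "messages": [
        {"role": "user", "content": "Summarize this."},
        {"role": "assistant", "content": "Summary:"},
    ],
}

for issue in preflight_issues(legacy_request):
    print("-", issue)
```

This only covers the request body. The data-retention requirement is an organization setting, so it can't be detected from the payload.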

Thinking is always on — steer it with `effort`, not `budget_tokens`

Extended thinking runs on every request and cannot be disabled. Sending thinking: {type: "disabled"} returns a 400. Sending the old thinking: {type: "enabled", budget_tokens: N} also returns a 400 — the token-budget knob is gone. You control depth with the effort parameter instead.

from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-fable-5",
    max_tokens=16000,
    # No `thinking` field — it's always on. Control depth with effort:
    output_config={"effort": "high"},  # low | medium | high | xhigh | max
    messages=[{"role": "user", "content": "..."}],
)

effort runs from low through high, plus xhigh and max. Higher effort buys deeper reasoning and better self-verification; it also spends more tokens and takes longer. A useful counterintuitive fact: Fable 5 at low often beats a previous-generation model at max. Sweep the levels on your own workload — don’t reflexively pin it to max.
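One way to run that sweep is to walk the levels from cheapest to most expensive and stop at the first one that clears your quality bar. The sketch below assumes you already have an eval harness; `evaluate` is a hypothetical hook you would supply, and the scores shown are stand-ins, not measurements.

```python
from typing import Callable, Optional

# Effort levels as listed in this article, cheapest first.
EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"]


def cheapest_passing_effort(
    evaluate: Callable[[str], float],
    min_score: float,
) -> Optional[str]:
    """Return the lowest effort level whose eval score meets min_score.

    `evaluate` is a hypothetical hook: it should run your own eval set at
    the given effort level and return a score between 0 and 1.
    """
    for level in EFFORT_LEVELS:
        if evaluate(level) >= min_score:
            return level
    return None  # no level cleared the bar


# Stand-in scores for illustration only; these are not measured results.
fake_scores = {"low": 0.71, "medium": 0.84, "high": 0.91, "xhigh": 0.93, "max": 0.93}

chosen = cheapest_passing_effort(lambda level: fake_scores[level], min_score=0.90)
print(chosen)
```

Stopping at the first passing level keeps the sweep itself cheap, since the most expensive levels only run when the cheaper ones fail.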

The raw chain of thought is never returned

You’ll get thinking blocks in the response, but their text is empty by default (display: "omitted"). Set display: "summarized" to get a readable summary of the reasoning — the raw chain of thought is never exposed on any setting. If you stream reasoning to users, the default looks like a long pause before output; opt into summaries explicitly.

Requests can be refused — handle it before reading content

Fable 5 runs safety classifiers (targeting research biology and most cybersecurity content) that can decline a request. A decline is a successful HTTP 200 with stop_reason: "refusal" — not an exception. Code that reads response.content[0] unconditionally will crash on a refused request. Always branch on stop_reason first, and opt into a server-side fallback so a refusal doesn’t just fail:

response = client.beta.messages.create(
    model="claude-fable-5",
    max_tokens=16000,
    betas=["server-side-fallback-2026-06-01"],
    fallbacks=[{"model": "claude-opus-4-8"}],  # re-served on the same call if Fable declines
    messages=[{"role": "user", "content": "..."}],
)

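if response.stop_reason == "refusal":
    handle_refusal()          # your own handler; pre-output refusals are empty and unbilled
else:
    # Thinking blocks appear in content too, so pick out text blocks by type
    print("".join(block.text for block in response.content if block.type == "text"))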

Benign, adjacent work (security tooling, life-sciences tasks) can trip a false positive, which is exactly why the fallback matters. A decline before any output isn’t billed at all; a mid-stream decline bills the partial output — discard it.
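To keep that branching in one place, you can wrap it in a helper that every call site uses. This is a minimal sketch, assuming the response exposes `stop_reason` and a `content` list of typed blocks as described above. The helper name is hypothetical, and the objects at the bottom are stand-ins, not real API responses.

```python
from types import SimpleNamespace
from typing import Any, Optional


def extract_text(response: Any) -> Optional[str]:
    """Return the concatenated text blocks, or None if the request was refused."""
    if response.stop_reason == "refusal":
        # Pre-output refusals are empty; mid-stream refusals carry partial
        # output that should be discarded, so return nothing in both cases.
        return None
    parts = [
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    ]
    return "".join(parts)


# Stand-in objects shaped like the fields used above (not real API responses).
ok = SimpleNamespace(
    stop_reason="end_turn",
    content=[
        SimpleNamespace(type="thinking", thinking=""),
        SimpleNamespace(type="text", text="Done."),
    ],
)
refused = SimpleNamespace(stop_reason="refusal", content=[])

print(extract_text(ok))
print(extract_text(refused))
```

Returning None for a refusal forces callers to handle that path explicitly instead of silently treating an empty string as a valid answer.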

It requires 30-day data retention

Fable 5 is not available under zero data retention. If your org is configured for ZDR (or any retention below 30 days), every Fable 5 request returns a 400 — even a perfectly valid one. If a migration suddenly 400s with no obvious payload problem, check your retention setting before you debug the request body.

Plan for minutes-long turns and token budgets

Because thinking is always on and the model is tuned for long-horizon work, a single request on a hard task can run for many minutes. Two consequences: stream anything with a large max_tokens (128K output requires streaming to avoid HTTP timeouts), and consider a task budget to let the model pace itself across an agentic loop — it sees a running countdown and wraps up gracefully instead of getting cut off.

with client.beta.messages.stream(
    model="claude-fable-5",
    max_tokens=128000,
    output_config={"effort": "high", "task_budget": {"type": "tokens", "total": 64000}},
    betas=["task-budgets-2026-03-13"],
    messages=[...],
    tools=[...],
) as stream:
    response = stream.get_final_message()

The Real Cost Math

Here’s where the deep-dive earns its keep. Fable 5 is the premium tier, and the sticker is only half the story. First, the shape of the ladder — because the cost climbs a lot faster than the difficulty of most work does.

THE ESCALATION LADDER

Cost climbs a cliff. Difficulty rarely does.

Each rung lists the input / output price per million tokens, cheapest tier first. The step up to Fable 5 is a jump, not a nudge, which is exactly why it belongs at the top of the ladder, not the middle of your workload.

Haiku 4.5: $1 / $5. Simple, high-volume, latency-sensitive.
Sonnet 5 (DEFAULT): $3 / $15. The sensible default for most work.
Opus 4.8: $5 / $25. Escalate hard tasks here.
Fable 5 (PREMIUM): $10 / $50. Only the hardest long-horizon work.

Default around Sonnet 5, escalate hard tasks to Opus 4.8, and step up to Fable 5 only when you've measured that even Opus falls short.

Now the same ladder with the per-model guidance spelled out:

THE 2026 CLAUDE PRICE LADDER (PER 1M TOKENS)

Claude Fable 5 (Top capability)
The deal: $10 / $50
What you get: The most capable widely released model. Roughly 2x Opus 4.8 and over 3x Sonnet 5. Reserve for the hardest long-horizon work.

Claude Opus 4.8 (Flagship default)
The deal: $5 / $25
What you get: State-of-the-art agentic and knowledge work at half Fable's price. The right ceiling for the large majority of hard tasks.

Claude Sonnet 5 (Best value)
The deal: $3 / $15 ($2 / $10 intro)
What you get: Near-Opus coding and Opus-parity knowledge work at a fraction of the cost. The sensible default for most workloads.

Claude Haiku 4.5 (Cheapest)
The deal: $1 / $5
What you get: Latency-sensitive, simple, high-volume tasks where flagship intelligence is overkill.

Do the arithmetic on a real task. Say a long agentic run consumes 500K input tokens (large context, re-sent across turns) and 100K output tokens. On Fable 5 that’s $5.00 + $5.00 = $10.00. The identical run on Opus 4.8 is $2.50 + $2.50 = $5.00, and on Sonnet 5 (standard) about $1.50 + $1.50 = $3.00. Same task, 2×–3.3× the cost. Multiply by a fleet of agents and Fable’s premium stops being a rounding error and becomes a budget decision.
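The same arithmetic is easy to script so you can plug in your own token counts. A small sketch, using only the list prices quoted in this article (the Sonnet 5 intro price is left out); the function name is my own.

```python
# Prices in USD per 1M tokens (input, output), as quoted in this article.
PRICES = {
    "Fable 5": (10.00, 50.00),
    "Opus 4.8": (5.00, 25.00),
    "Sonnet 5": (3.00, 15.00),
    "Haiku 4.5": (1.00, 5.00),
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one run at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# The example run from the paragraph above: 500K input, 100K output.
for model in PRICES:
    cost = run_cost(model, input_tokens=500_000, output_tokens=100_000)
    print(f"{model:<10} ${cost:.2f}")
```

For the example run this reproduces the $10.00, $5.00, and $3.00 figures above, plus $1.00 for Haiku 4.5. Multiplying the result by the number of runs per day turns the per-task premium into a fleet-level budget figure.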

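So the premium is honest and predictable relative to Opus: Fable 5 uses the same tokenizer as Opus 4.8, so an identical run produces the same token counts on both and the difference is purely the per-token price. The question is never “is it inflated by tokenization?” It is “is this task hard enough that Fable’s higher ceiling saves me more than the 2× costs?”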

Where Fable 5 Sits vs Opus 4.8 and Sonnet 5

This isn’t a “which is better” question — Fable 5 is the most capable, that’s what the top of the ladder means. It’s a “when is the extra capability worth 2× Opus and 3× Sonnet” question.

PROMPT

"Run an agent unattended for hours to build or migrate a genuinely hard, multi-step system where a wrong turn is expensive to unwind."

Reach for Fable 5 when the ceiling is the whole point. The hardest long-horizon autonomous runs, first-shot builds from a complete spec where you’d rather pay more than review a subtly-wrong result, end-to-end enterprise deliverables, and agent fleets whose correctness on hard tasks — not throughput — is the constraint. If the task is one you’d give a senior engineer days for and want done largely unattended, this is the tier.

The heuristic I use: default to Sonnet 5, escalate hard tasks to Opus 4.8, and reserve Fable 5 for the handful of jobs where you’ve measured that even Opus leaves value on the table. Don’t pay the Fable premium as an insurance policy across the board — pay it where you’ve watched a cheaper model fall short on that specific task. This is the same “measure, don’t assume” discipline that separates teams who ship real production work with AI agents from teams who burn budget guessing.
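Written as code, that heuristic is a short routing function. This is only a sketch of the rule as stated in this article: the `Task` fields and tier labels are hypothetical names I picked for illustration, and the labels are not API model ids.

```python
from dataclasses import dataclass


@dataclass
class Task:
    simple_high_volume: bool = False  # trivial, high-throughput path
    latency_sensitive: bool = False   # a human is waiting on the response
    hard: bool = False                # beyond what the default tier handles well
    opus_fell_short: bool = False     # measured on this task, not assumed


def choose_tier(task: Task) -> str:
    """Pick a tier label following the default-then-escalate heuristic."""
    if task.simple_high_volume:
        return "haiku-4.5"
    # Fable 5 only when Opus was measured to fall short and nobody is
    # waiting on a minutes-long turn.
    if task.hard and task.opus_fell_short and not task.latency_sensitive:
        return "fable-5"
    if task.hard:
        return "opus-4.8"
    return "sonnet-5"


print(choose_tier(Task()))
print(choose_tier(Task(hard=True)))
print(choose_tier(Task(hard=True, opus_fell_short=True)))
```

The important detail is that `opus_fell_short` is an input, not something the router guesses: it should come from an eval you actually ran on that task.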

The Honest Ledger

WHAT YOU'RE ACTUALLY BUYING

The real trade-off when you put a workload on Fable 5 instead of Opus 4.8 or Sonnet 5.

The pros

The highest capability ceiling of any widely released Claude model — built for long-horizon autonomy and first-shot correctness on hard tasks
Sustains multi-hour unattended runs and coordinates async sub-agents where cheaper models lose the thread
Strong end-to-end knowledge-work deliverables — analysis, spreadsheets, slides, docs — not just drafts
High-resolution vision with tool-assisted handling of degraded images
Same tokenizer as Opus 4.8, so the cost premium is predictable — no hidden token-count inflation between the two
File-based memory that compounds context across a long project

The cons

Premium pricing — $10/$50 is ~2x Opus 4.8 and over 3x Sonnet 5, and it adds up fast across an agent fleet
Overkill for routine work: on tasks a mid-tier model handles, you're paying for a ceiling you never touch
Turns can run for minutes — bad fit for latency-sensitive or interactive UX without careful streaming and progress design
Less low-level control: thinking can't be disabled, sampling parameters are rejected, and there's no assistant prefill
Safety classifiers can refuse requests (incl. false positives on benign security/bio-adjacent work) — you must handle the refusal path
Requires 30-day data retention — unavailable to zero-data-retention orgs

Verdict: an exceptional model for a narrow, high-value slice of work. If you can't name the specific hard task that needs it, you don't need it yet — default to Sonnet 5 or Opus 4.8.

When You Should NOT Reach for Fable 5

The most useful thing I can tell you about the most capable model is when to skip it. Fable 5’s premium is only worth it at the top of the difficulty curve. Everywhere else, you’re lighting money on fire for capability you won’t use.

Don’t use Fable 5 for:

Chat, assistants, and RAG backends. These want speed and cost efficiency. Sonnet 5 is the right default; Haiku 4.5 for the simplest, highest-volume paths.
Interactive, latency-sensitive coding. Minutes-long turns are a feature for overnight autonomy and a liability when a human is waiting. Use Opus 4.8 or Sonnet 5 in the loop.
High-volume agent fleets where cost is the constraint. At 2×–3× the token price, Fable turns an affordable experiment into a line item you’ll have to defend. Run the fleet on Sonnet 5 and escalate individual hard tasks.
Anything a mid-tier model already passes your evals on. If Sonnet 5 or Opus 4.8 clears the bar, the extra capability is invisible and the extra cost is not.

How to Actually Get Value From Fable 5

If a task does clear the bar, the way you prompt Fable 5 matters more than with any prior model. It’s more autonomous and more literal, so a few habits change.

GET THE MOST OUT OF FABLE 5

Give the full task specification up front in one well-specified turn. Fable rewards a clear goal far more than progressive, ambiguous back-and-forth. (priority: critical)
Run long-horizon work at high or xhigh effort, and sweep low/medium for routine tasks to control cost. (priority: critical)
Handle the refusal stop reason and opt into a server-side fallback to claude-opus-4-8 by default. (priority: high)
Let it delegate: encourage async sub-agents for independent subtasks instead of suppressing delegation. (priority: high)
Give it a memory file (and a format) so it compounds learnings across a long project. (priority: medium)
Ask it to ground progress claims against actual tool results. This nearly eliminates fabricated status on long runs. (priority: medium)
Strip over-prescriptive, step-by-step scaffolding from prompts written for older models, because it reduces Fable's output quality. (priority: medium)

That last one is counterintuitive and worth repeating: prompts and skills written for previous models are often too prescriptive for Fable 5 and actively lower its output quality. State the goal and the constraints, then get out of the way. If you’ve internalized a house style for writing a good CLAUDE.md or running Claude Code in auto mode, Fable is the model that most rewards trusting it with the what and why instead of dictating every step.
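For the memory-file item in the checklist, "a place to take notes and a format" can be as simple as one dated bullet per learning. The sketch below shows one possible format and the two helpers a tool harness might expose to the model. The file name, line format, and function names are all my own assumptions, not anything Fable 5 requires.

```python
from datetime import date
from pathlib import Path
from typing import Optional
import tempfile


def append_learning(memory_file: Path, topic: str, note: str) -> None:
    """Append one dated learning as a single bullet line."""
    line = f"- [{date.today().isoformat()}] {topic}: {note}\n"
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(line)


def load_learnings(memory_file: Path, topic: Optional[str] = None) -> list[str]:
    """Return stored bullet lines, optionally filtered by topic."""
    if not memory_file.exists():
        return []
    lines = memory_file.read_text(encoding="utf-8").splitlines()
    if topic is None:
        return lines
    return [line for line in lines if f"] {topic}: " in line]


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp) / "MEMORY.md"
    append_learning(memory, "build", "integration tests need the local queue running")
    append_learning(memory, "style", "repo uses tabs in Makefiles only")
    build_notes = load_learnings(memory, "build")
    print(build_notes)
```

A fixed one-line format keeps the file easy to append to and cheap to re-read at the start of the next session.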

FAQ

Questions readers usually have

The questions people keep asking about Claude Fable 5.

The Verdict

Claude Fable 5 is the most capable model Anthropic has put in front of the general public, and it earns that title on exactly the work it was built for: long, unattended, genuinely hard problems where the ceiling is the point. The staged rollout — invitation-only preview, to a restricted program, to general availability — was really a story about access widening, not capability changing hands.

But “most capable” and “the one you should use” are different sentences. At $10 / $50, Fable 5 is a specialist tool. Put it on routine work and you’re paying flagship rates for a ceiling you’ll never touch; put it on the hardest autonomous runs and it does things cheaper models don’t. Default to Sonnet 5, escalate hard tasks to Opus 4.8, and spend Fable 5 tokens only where you’ve measured that even Opus leaves capability on the table.

Want to see what that top-of-the-ladder capability looks like turned loose on a real project? Read how Fable 5 built a full streaming microservice in a day next — the case study behind the capabilities on this page.

Written for umesh-malik.com — no-fluff technical writing on AI, Web Dev, and Engineering.
