What is Claude Fable 5 auto mode?

Auto mode is Claude Code running the full development loop autonomously with Claude Fable 5 — planning, writing files, running commands, deploying infrastructure, and continuing until done — pausing only for decisions that genuinely need a human, like product trade-offs or cost ceilings. Fable 5 is Anthropic's Mythos-class model tier above Opus.

Did the AI really create all the AWS resources itself?

Yes — but through Terraform, not console clicks. Fable wrote infrastructure-as-code for S3, MediaConvert, CloudFront, IVS, Lambda, API Gateway, DynamoDB, EventBridge, IAM, and the monitoring stack, and applied it through a gated CI/CD pipeline using GitHub OIDC. Everything is reviewable, versioned, and reproducible.

How does the realtime streaming part work?

Real-time streams run on AWS IVS: broadcasters push RTMPS to a managed ingest endpoint, and viewers get an HLS-compatible playback URL with 2–5 second latency that plays in the same hls.js-based player as on-demand content. Fable chose IVS over MediaLive in the HLD to avoid unnecessary cost and operational overhead for v1.

How were existing raw videos and audio files migrated?

Fable wrote idempotent migration scripts: an inventory stage that fingerprints every legacy file into a DynamoDB state table, a queue stage that submits MediaConvert jobs under a concurrency and cost cap, a verification stage that checks duration and renditions per asset, and a dual-read cutover so anything unverified kept serving from the legacy path until the failure list hit zero.

Is it safe to let an AI agent manage security?

Safe-ish, with review. Fable applied least-privilege IAM per function, private buckets behind CloudFront Origin Access Control, signed cookies with short TTLs, KMS encryption, and OIDC instead of stored AWS keys — but I still reviewed every IAM policy and ran an independent code-review pass. The agent raises the security floor; the human still owns the sign-off.

How much did I actually have to do?

A one-page brief, 17 design decisions (protocol choice, IVS vs MediaLive, signed cookies vs URLs, Lambda vs Fargate, cutover risk tolerance), and roughly two hours of reviewing documents, Terraform plans, IAM policies, and the final diff. No hand-written application code.

How I Built a Full Audio/Video Streaming Microservice in One Day with Claude Fable 5 Auto Mode

Last week I did something I would have called irresponsible a year ago: I handed an entire production microservice — a full audio/video streaming service with real-time delivery, AWS infrastructure, security, CI/CD, and data migrations — to Claude Fable 5 running in auto mode, and I shipped it the same day.

Not a prototype. Not a demo repo with a TODO: auth comment. A service that today sits in front of every video and audio file my other services used to serve as raw, full-size downloads — now delivered as adaptive HLS streams with signed playback, sub-second startup, and a live-streaming path.

My total contribution was a written brief and 17 design decisions. Fable wrote the PRD, the TRD, the high-level and low-level designs, the Terraform, the backend, the frontend player, the GitHub Actions pipelines, the migration scripts that transcoded my existing media library, and the runbooks my other services now use to integrate. If you searched for how to build a microservice with an AI agent, or you’re wondering whether Claude Fable 5 auto mode is actually different from babysitting an autocomplete — this is the full, honest teardown.

THE DAY IN NUMBERS

~9 hrs wall clock: brief to production
17 human inputs: all design decisions
4 docs before any code: PRD, TRD, HLD, LLD
100% infra as code: Terraform, zero console clicks

TL;DR

Claude Fable 5 in auto mode built a complete HLS audio/video streaming microservice in under a day — AWS resources, security hardening, backend, frontend player, CI/CD, and integration runbooks included.
It wrote PRD → TRD → HLD → LLD before writing a single line of code, and paused only to ask me real design questions: HLS vs DASH, signed cookies vs signed URLs, IVS vs MediaLive for the realtime path.
The architecture it chose is boring in the best way: S3 + MediaConvert + CloudFront signed cookies for on-demand media, AWS IVS for real-time streams, a small Lambda control plane, all provisioned by Terraform.
It also wrote idempotent migration scripts that transcoded my entire back catalog of raw videos and audio files — with a DynamoDB state table, concurrency caps, and verification.
My job changed from typing code to making decisions and reviewing diffs. That’s the actual story of agentic coding in 2026: the bottleneck moved from implementation to judgment.

What “Auto Mode” Means (and What It Doesn’t)

Claude Fable 5 is Anthropic’s newest model tier — the Mythos-class model that sits above Opus — and inside Claude Code, “auto mode” means the agent runs the full loop autonomously: it plans, edits files, runs commands, executes tests, deploys infrastructure, and keeps going until the task is done or it genuinely needs a human decision. You’re not approving every tool call. You’re not pasting code between windows. You set the destination and the constraints; the agent drives.

The crucial nuance: auto mode is not “no human input.” It’s minimal, high-leverage human input. Across nine hours, Fable stopped me 17 times — and every single stop was a question only I could answer: a product trade-off, a cost ceiling, a security posture choice. It never once asked me “how do I configure CloudFront?” It asked me “do you want playback URLs to be shareable, or locked to the session?” Those are very different questions, and the second one is the one that actually deserves my time.

💡 Key insight: The quality bar of an agentic build is set by the quality of the questions the agent asks you. Fable 5 asks product questions, not syntax questions — that’s the generational difference.

The Starting Point: Raw Files Pretending to Be Streaming

Here’s the embarrassing “before” picture, because every good case study needs one.

I run several services that handle user-facing media — course recordings, podcast-style audio, screen captures. The original implementation was the one every team ships first: media files uploaded to S3, served back through the app as raw, full-size files. A GET /files/{id} endpoint, a presigned URL, and an HTML <video> tag pointed at a 400 MB MP4.

It worked, in the way a tent works as a house. Here’s the same media library, before and after the one-day build:

RAW FILE SERVING VS A REAL STREAMING SERVICE

BEFORE — RAW FILES

A presigned URL and a prayer

Seeking was brutal — jumping to minute 40 meant range-requesting through a monolithic file, and mobile networks often re-buffered from scratch
One bitrate for everyone — hotel Wi-Fi got the same 1080p file as fiber, with no audio-only fallback
Bandwidth scaled with file size, not watch time — a 2-minute viewer still pulled half the file
Zero real-time story — live sessions ran on a third-party embed I didn't control

AFTER — HLS MICROSERVICE

Adaptive, signed, instant

Instant seek — short HLS segments mean jumping anywhere starts playback in under a second
Adaptive bitrate ladder — 1080p to audio-only, picked per viewer per second by the player
CDN-served segments, with bandwidth that scales with watch time: viewers download only what they actually watch
Realtime built in — AWS IVS streams play in the exact same player as on-demand content

Same S3 library, same users — the delivery model is the only thing that changed.

The fix is well understood: transcode to HLS (HTTP Live Streaming) with an adaptive bitrate ladder, serve segments from a CDN, sign playback, and use a managed service for live. The reason I hadn’t done it: done properly, it’s a solid two-to-three week project across infra, backend, frontend, and a scary migration of existing content. That estimate is what Fable 5 deleted.
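To make "adaptive bitrate ladder" concrete, here is a minimal sketch of the master playlist a player reads first, plus a deliberately simplified rung picker. The bitrates, resolutions, codec strings, and file names are illustrative assumptions, not the ladder MediaConvert produced for this service, and `buildMasterPlaylist` and `pickRung` are hypothetical helpers.

```typescript
// One rung of the adaptive ladder. Every value below is illustrative.
type LadderRung = {
  uri: string;         // variant playlist, relative to the master playlist
  bandwidth: number;   // peak bits per second advertised to the player
  resolution?: string; // omitted for the audio-only rung
  codecs: string;
};

const exampleLadder: LadderRung[] = [
  { uri: "1080p.m3u8", bandwidth: 5000000, resolution: "1920x1080", codecs: "avc1.640028,mp4a.40.2" },
  { uri: "720p.m3u8", bandwidth: 2800000, resolution: "1280x720", codecs: "avc1.64001f,mp4a.40.2" },
  { uri: "480p.m3u8", bandwidth: 1200000, resolution: "854x480", codecs: "avc1.4d401e,mp4a.40.2" },
  { uri: "audio.m3u8", bandwidth: 128000, codecs: "mp4a.40.2" },
];

// Build the master playlist: one #EXT-X-STREAM-INF entry per rung.
function buildMasterPlaylist(ladder: LadderRung[]): string {
  const lines = ["#EXTM3U", "#EXT-X-VERSION:3"];
  for (const rung of ladder) {
    const attrs = ["BANDWIDTH=" + rung.bandwidth];
    if (rung.resolution) attrs.push("RESOLUTION=" + rung.resolution);
    attrs.push('CODECS="' + rung.codecs + '"');
    lines.push("#EXT-X-STREAM-INF:" + attrs.join(","), rung.uri);
  }
  return lines.join("\n") + "\n";
}

// Simplified selection: the highest rung that fits a fraction of measured throughput.
function pickRung(ladder: LadderRung[], measuredBps: number, safetyFactor = 0.8): LadderRung {
  const sorted = ladder.slice().sort((a, b) => b.bandwidth - a.bandwidth);
  const budget = measuredBps * safetyFactor;
  for (const rung of sorted) {
    if (rung.bandwidth <= budget) return rung;
  }
  return sorted[sorted.length - 1]; // nothing fits: fall back to the lowest rung
}
```

Real players such as hls.js make this choice with a smoothed bandwidth estimate and the current buffer level, so treat `pickRung` as the idea rather than the algorithm.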

The One-Day Timeline

Figure: timeline of Claude Fable 5's one-day build. Morning spent on PRD, TRD, HLD and LLD documents, midday on Terraform infrastructure and backend code, afternoon on frontend, CI/CD and migration scripts, evening on integration runbooks and production cutover.

Here’s how the day actually broke down. Times are approximate; the shape is exact.

BRIEF TO PRODUCTION IN ~9 HOURS

The striking part is the ratio: roughly a quarter of the day went into documents and decisions before any code existed — and that front-loading is why the rest of the day didn't derail.

08:30 – 09:00

The brief

I wrote roughly a page: what the service must do (VOD streaming, realtime streams, drop-in integration for existing services), constraints (AWS, Terraform, no public buckets), and what done looks like.

09:00 – 11:00

PRD, TRD, HLD, LLD

Fable produced all four documents, pausing for design decisions: protocol, live-streaming service, signing strategy, compute model. I answered questions and red-penned the docs. Zero code yet.

11:00 – 13:30

Infrastructure + backend

Terraform for every AWS resource — buckets, MediaConvert, CloudFront, IVS, DynamoDB, IAM — plus the Lambda control plane: upload, transcode orchestration, playback sessions.

13:30 – 15:30

Frontend + CI/CD

A typed player package wrapping hls.js with adaptive quality, audio-only mode, and live support. GitHub Actions with OIDC — no long-lived AWS keys — and a gated terraform plan/apply flow.

15:30 – 17:00

Migration of the back catalog

Idempotent scripts that walked the legacy bucket, queued MediaConvert jobs with a concurrency cap, tracked state in DynamoDB, and verified every output against the source.

17:00 – 18:00

Integration + runbooks

Fable patched my other services to request playback sessions instead of raw URLs, kept a dual-read fallback for safety, and wrote the runbooks any future service needs to onboard.
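The dual-read fallback mentioned in the last step can be sketched as a single decision function. This is an illustration under assumptions, not the patch Fable wrote: `resolvePlaybackSource` is a hypothetical name, the `QUEUED` and `FAILED` statuses are assumed alongside the `PENDING` and `VERIFIED` statuses described in this post, and the manifest path simply mirrors the example URL shape used elsewhere in the article.

```typescript
// PENDING and VERIFIED come from the migration design; QUEUED and FAILED are assumed.
type MigrationStatus = "PENDING" | "QUEUED" | "VERIFIED" | "FAILED";

type PlaybackSource =
  | { kind: "hls"; manifestPath: string }
  | { kind: "legacy"; fileId: string };

// Only VERIFIED assets stream HLS; anything else, including unknown assets,
// keeps using the legacy raw-file path.
function resolvePlaybackSource(
  assetId: string,
  migrationStatus: MigrationStatus | undefined
): PlaybackSource {
  if (migrationStatus === "VERIFIED") {
    return { kind: "hls", manifestPath: "/hls/" + assetId + "/master.m3u8" };
  }
  return { kind: "legacy", fileId: assetId };
}
```

The useful property is that the default branch is the old behavior, so a missing or stale migration record can never route a viewer to a half-transcoded asset.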

Docs Before Code: The Part Everyone Skips

This is the section I most want you to steal, because it’s the highest-leverage behavior in the whole workflow — and it costs nothing.

Before touching code, Fable wrote four documents into the repo, in order, and made me approve each one:

PRD (Product Requirements Document) — what the service does and for whom: VOD playback, live streams, integration contract for sibling services, explicit non-goals (no DRM in v1, no user-generated live streams).
TRD (Technical Requirements Document) — the measurable bar: time-to-first-frame under 1 second on broadband, live glass-to-glass latency under 5 seconds, playback URLs unusable after expiry, migration must be resumable and verifiable.
HLD (High-Level Design) — the architecture diagram, the AWS services chosen and the ones rejected (with reasons), data flow for upload, playback, live, and migration.
LLD (Low-Level Design) — DynamoDB key design, every API route with request/response shapes, IAM policy boundaries per function, error taxonomy, the exact MediaConvert ladder.

If you’ve read my piece on spec-driven development with AI agents, you know I’m already convinced specs are the steering wheel for agentic coding. What Fable 5 adds is that the agent now writes the spec itself and interrogates you against it. The TRD review is where I caught the one thing I’d have regretted: the first draft proposed signed URLs per segment. I pushed back — per-segment signing breaks CDN cache efficiency — and Fable switched the design to CloudFront signed cookies scoped to a playback session, then updated the LLD and the threat notes to match, unprompted.

💡 Key insight: Reviewing a 2-page TRD takes 10 minutes and catches architecture mistakes. Reviewing 4,000 lines of generated code to find the same mistake takes a day. Auto mode works because of the documents, not despite them.

The Architecture Fable Built

Figure: architecture diagram of the streaming microservice. Uploads flow into a private S3 mezzanine bucket, EventBridge triggers MediaConvert to produce HLS renditions into a packaged bucket served by CloudFront with signed cookies; a Lambda control plane issues playback sessions backed by DynamoDB; AWS IVS handles real-time streams; other services integrate through one playback-session API.

The short answer: a serverless control plane around managed media services. Fable’s HLD argued — correctly — that in 2026 you should not be running your own transcoders or packagers for this workload class, and every component it picked is the boring, durable choice:

S3 (two private buckets): a mezzanine bucket for original uploads and a packaged bucket for HLS output. Both have Block Public Access on, KMS encryption at rest, and lifecycle rules that abort incomplete multipart uploads.
AWS Elemental MediaConvert — VOD transcoding. Each video becomes an adaptive ladder (1080p / 720p / 480p / audio-only) of HLS segments; audio files become segmented HLS audio so podcasts get the same instant-seek behavior as video.
EventBridge — glue. ObjectCreated on the mezzanine bucket triggers the transcode orchestrator; MediaConvert job-state changes flow back to update asset status. No polling anywhere.
CloudFront with Origin Access Control + signed cookies — the only public face of media. The packaged bucket is unreachable except through the CDN, and the CDN only serves you with a valid short-lived cookie.
A small Lambda + API Gateway control plane — four routes: request an upload (presigned multipart), check asset status, create a playback session (the integration contract), and create a live channel.
AWS IVS (Interactive Video Service) — the real-time path. Managed RTMPS ingest, 2–5 second latency, an HLS-compatible playback URL that drops into the same player. Fable’s HLD explicitly rejected MediaLive for v1 as cost- and ops-overkill, which matched my instinct exactly.
DynamoDB — asset and session metadata, single-table, with the migration state tracked in the same table under its own key prefix.
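Two of the components above, the EventBridge glue and the single-table layout, are easier to picture with a sketch. The key prefixes below are a hypothetical layout (the LLD's real keys are not reproduced in this post), and the sketch assumes the orchestrator stored an `assetId` in each job's user metadata when it submitted the job. The event fields shown are a subset of the MediaConvert job state change event.

```typescript
// Hypothetical single-table keys: assets, sessions, and migration state share one table.
const assetKey = (assetId: string) => ({ pk: "ASSET#" + assetId, sk: "META" });
const sessionKey = (assetId: string, sessionId: string) => ({
  pk: "ASSET#" + assetId,
  sk: "SESSION#" + sessionId,
});
const migrationKey = (legacyObjectKey: string) => ({
  pk: "MIGRATION#" + legacyObjectKey, // separate prefix keeps migration state apart
  sk: "STATE",
});

// Subset of the EventBridge event emitted on MediaConvert job state changes.
type MediaConvertJobEvent = {
  source: "aws.mediaconvert";
  "detail-type": "MediaConvert Job State Change";
  detail: {
    status: string; // e.g. "PROGRESSING", "COMPLETE", "ERROR", "CANCELED"
    jobId: string;
    userMetadata?: { [name: string]: string };
  };
};

type AssetStatus = "TRANSCODING" | "READY" | "FAILED";
type AssetUpdate = { key: { pk: string; sk: string }; status: AssetStatus };

// Map a job event to the asset update it implies, or null if there is nothing to do.
function toAssetUpdate(evt: MediaConvertJobEvent): AssetUpdate | null {
  const assetId = evt.detail.userMetadata?.assetId; // assumption: set at submit time
  if (!assetId) return null; // job was not submitted by this service
  switch (evt.detail.status) {
    case "PROGRESSING":
      return { key: assetKey(assetId), status: "TRANSCODING" };
    case "COMPLETE":
      return { key: assetKey(assetId), status: "READY" };
    case "ERROR":
    case "CANCELED":
      return { key: assetKey(assetId), status: "FAILED" };
    default:
      return null; // informational states need no status change
  }
}
```

Because the status change is derived from the event alone, the handler stays stateless, which is what makes the "no polling anywhere" claim workable.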

The integration contract is the part my other services care about, and it’s one endpoint:

```typescript
// Before: every service hand-rolled raw file access
const url = await getPresignedUrl(fileId); // 400 MB MP4, good luck

// After: one call, any service, audio or video, VOD or live
const session = await streamSvc.createPlaybackSession({
  assetId,
  viewerId,            // bound to the session, not shareable
  expiresIn: 3600
});
// → { manifestUrl: "https://media.example.com/hls/{assetId}/master.m3u8",
//     cookies: { "CloudFront-Policy": "...", "CloudFront-Signature": "..." } }
```

Raw-file serving didn’t just get faster — it got deleted as a concept. There is no code path left that hands a full original file to a browser.
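For readers who haven't worked with CloudFront signed cookies, here is a rough sketch of what sits behind a playback session: a custom policy scoped to one asset path, encoded with CloudFront's URL-safe base64 variant. It is not the service's code. `buildSessionCookies` and `CookieSigner` are hypothetical, the hostname mirrors the example above, and the actual base64 encoding and RSA signing are injected so the sketch stays dependency-free. Note that CloudFront also expects a `CloudFront-Key-Pair-Id` cookie next to the policy and signature.

```typescript
// Custom policy: valid for one resource pattern until a fixed epoch time.
// JSON.stringify emits no whitespace, which is what CloudFront expects.
function buildCustomPolicy(resourceUrl: string, expiresAtEpochSeconds: number): string {
  return JSON.stringify({
    Statement: [
      {
        Resource: resourceUrl,
        Condition: { DateLessThan: { "AWS:EpochTime": expiresAtEpochSeconds } },
      },
    ],
  });
}

// CloudFront's URL-safe base64 variant: "+" -> "-", "=" -> "_", "/" -> "~".
function toCloudFrontSafeBase64(b64: string): string {
  return b64.replace(/\+/g, "-").replace(/=/g, "_").replace(/\//g, "~");
}

// Injected so the sketch has no runtime dependencies (hypothetical interface).
type CookieSigner = {
  keyPairId: string;                       // ID of the public key registered with CloudFront
  base64Encode: (text: string) => string;  // plain base64 of the policy text
  signToBase64: (text: string) => string;  // RSA-SHA1 signature of the policy, base64
};

function buildSessionCookies(
  assetId: string,
  nowEpochSeconds: number,
  expiresInSeconds: number,
  signer: CookieSigner
): { [cookieName: string]: string } {
  // The wildcard covers the manifest and every segment under this asset only.
  const resource = "https://media.example.com/hls/" + assetId + "/*";
  const policy = buildCustomPolicy(resource, nowEpochSeconds + expiresInSeconds);
  return {
    "CloudFront-Policy": toCloudFrontSafeBase64(signer.base64Encode(policy)),
    "CloudFront-Signature": toCloudFrontSafeBase64(signer.signToBase64(policy)),
    "CloudFront-Key-Pair-Id": signer.keyPairId,
  };
}
```

One policy covers every segment request in the session, so segment URLs stay identical for all viewers and the player never has to rewrite a manifest.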

The security work I didn’t have to ask for

I gave Fable one sentence of security direction: “private by default, no long-lived credentials, signed playback.” Here’s what it derived from that sentence:

SECURITY FABLE SHIPPED UNPROMPTED

IAM

Least-privilege per function

Each Lambda has its own role scoped to exactly its resources — the playback-session function cannot touch the mezzanine bucket, the upload function cannot read DynamoDB sessions.

EDGE

OAC + signed cookies, short TTL

Buckets are unreachable except via CloudFront Origin Access Control. Playback cookies expire with the session and are scoped to one asset path.

CI/CD

OIDC instead of AWS keys

GitHub Actions assumes a role via OpenID Connect. There is no AWS secret stored in the repo or the CI environment at all.

DATA

KMS at rest, TLS in transit

Both buckets and the DynamoDB table are KMS-encrypted; the API enforces TLS and validates upload content types before issuing presigned URLs.

I still ran my own review pass and an automated code review over the diff — trust but verify is doing heavy lifting in this workflow — and the review came back with style nits, not security findings.
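Part of that manual IAM pass can be mechanized. The sketch below is a tiny lint over a policy document that flags wildcard actions and resources on Allow statements. It is an illustration of the kind of check I mean, not the review tooling used here: `findOverbroadStatements` is a hypothetical helper, it ignores conditions and NotAction/NotResource, and it is no substitute for reading the policies.

```typescript
type IamStatement = {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
};

type IamPolicyDoc = { Version: "2012-10-17"; Statement: IamStatement[] };

const asList = (value: string | string[]): string[] =>
  Array.isArray(value) ? value : [value];

// Return one human-readable finding per wildcard on an Allow statement.
function findOverbroadStatements(policy: IamPolicyDoc): string[] {
  const findings: string[] = [];
  policy.Statement.forEach((stmt, index) => {
    if (stmt.Effect !== "Allow") return; // Deny statements only narrow access
    for (const action of asList(stmt.Action)) {
      // Flags "*" and service-wide wildcards such as "s3:*".
      if (action === "*" || action.slice(-2) === ":*") {
        findings.push("Statement " + index + ": wildcard action " + action);
      }
    }
    for (const resource of asList(stmt.Resource)) {
      if (resource === "*") {
        findings.push("Statement " + index + ": wildcard resource");
      }
    }
  });
  return findings;
}
```

Run against each function's policy in CI, an empty result does not prove least privilege, but a non-empty one is a cheap reason to stop and look.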

The Migration: Transcoding an Existing Library Without Fear

New architectures are easy; old data is where projects go to die. I had years of raw media sitting in the legacy bucket, all of it needing transcoding into HLS, none of it allowed to break while users were actively consuming it.

Fable’s migration design treated the transcode of the back catalog as a resumable, verifiable batch job, not a script you run and pray over:

HOW THE MIGRATION SCRIPTS WORK

Every stage is idempotent — the whole pipeline can be killed and re-run at any point and it picks up exactly where it stopped.

STAGE 01

Inventory

Walk the legacy bucket, fingerprint each object (ETag + size), and write one migration record per asset into DynamoDB with status PENDING. Re-running only adds new files.

STAGE 02

Queue with a concurrency cap

Submit MediaConvert jobs from the PENDING set, capped to stay inside account quotas and a cost ceiling I set ($/day). Job IDs land back on the migration record.

STAGE 03

Verify, don’t assume

On job completion, compare output duration against source duration (±1s), check every rendition in the ladder exists, and probe the manifest. Only then: status VERIFIED.

STAGE 04

Cut over with a net

Consuming services dual-read: VERIFIED assets stream HLS, everything else falls back to the legacy raw path. When the failure list hit zero, the fallback was removed.
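Stages 01 to 03 can be sketched in a few pure functions. This is a minimal in-memory model, not the scripts Fable wrote: the state lives in a plain object instead of DynamoDB, the `QUEUED` and `FAILED` statuses and all type and function names are assumptions, and the probe results are passed in rather than measured.

```typescript
type MigrationStatus = "PENDING" | "QUEUED" | "VERIFIED" | "FAILED";
type MigrationRecord = {
  objectKey: string;
  fingerprint: string; // ETag + size, as in stage 01
  status: MigrationStatus;
  jobId?: string;
};
type LegacyObject = { key: string; etag: string; size: number };
type MigrationState = { [objectKey: string]: MigrationRecord };

// Stage 01: only objects never seen before get a record, so re-runs keep progress.
function inventory(state: MigrationState, objects: LegacyObject[]): number {
  let added = 0;
  for (const obj of objects) {
    if (Object.prototype.hasOwnProperty.call(state, obj.key)) continue;
    state[obj.key] = {
      objectKey: obj.key,
      fingerprint: obj.etag + ":" + obj.size,
      status: "PENDING",
    };
    added++;
  }
  return added;
}

// Stage 02: never have more than concurrencyCap jobs in flight.
function selectBatch(state: MigrationState, concurrencyCap: number): MigrationRecord[] {
  const records = Object.keys(state).map((key) => state[key]);
  const inFlight = records.filter((r) => r.status === "QUEUED").length;
  const freeSlots = Math.max(0, concurrencyCap - inFlight);
  return records.filter((r) => r.status === "PENDING").slice(0, freeSlots);
}

type TranscodeOutput = {
  durationSeconds: number;
  renditions: string[];
  manifestReadable: boolean;
};

// Stage 03: duration within one second, every rendition present, manifest readable.
function verifyOutput(
  sourceDurationSeconds: number,
  output: TranscodeOutput,
  expectedRenditions: string[]
): boolean {
  const durationOk = Math.abs(output.durationSeconds - sourceDurationSeconds) <= 1;
  const ladderOk = expectedRenditions.every((r) => output.renditions.indexOf(r) !== -1);
  return durationOk && ladderOk && output.manifestReadable;
}
```

Against a real DynamoDB table, the "skip if already present" check in `inventory` would typically become a conditional write, so two concurrent runs cannot both create the same record.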

The detail that sold me: Fable added a --dry-run flag and a cost estimate to the queue stage before I asked, because the TRD it had written contained a cost ceiling — so it treated “don’t surprise me on the bill” as a requirement to implement, not a vibe. Eleven assets failed verification on the first pass (corrupted sources, one mislabeled codec). They were exactly the kind of thing a hand-rolled for loop over aws s3 ls would have silently butchered.
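The dry-run idea is simple enough to sketch: estimate the cost of each candidate, admit candidates until the daily ceiling is reached, and only submit when not in dry-run mode. Everything here is an assumption for illustration, including the names, the flat per-output-minute rate (real MediaConvert pricing varies by output and tier), and the injected `submit` callback.

```typescript
type QueueCandidate = { objectKey: string; durationMinutes: number };

type QueuePlan = {
  toSubmit: QueueCandidate[];
  deferred: QueueCandidate[]; // stay PENDING for a later run
  estimatedCost: number;
};

type QueueOptions = {
  dryRun: boolean;
  dailyCeiling: number;        // same currency as ratePerOutputMinute
  ratePerOutputMinute: number; // placeholder: look up real pricing for your account
  renditionsPerAsset: number;  // e.g. 4 for 1080p / 720p / 480p / audio-only
};

// Admit candidates in order while the running estimate stays under the ceiling.
function planWithinBudget(candidates: QueueCandidate[], opts: QueueOptions): QueuePlan {
  const plan: QueuePlan = { toSubmit: [], deferred: [], estimatedCost: 0 };
  for (const candidate of candidates) {
    const cost =
      candidate.durationMinutes * opts.renditionsPerAsset * opts.ratePerOutputMinute;
    if (plan.estimatedCost + cost <= opts.dailyCeiling) {
      plan.toSubmit.push(candidate);
      plan.estimatedCost += cost;
    } else {
      plan.deferred.push(candidate);
    }
  }
  return plan;
}

// With dryRun set, the plan is returned but nothing is submitted.
function runQueueStage(
  candidates: QueueCandidate[],
  opts: QueueOptions,
  submit: (candidate: QueueCandidate) => void
): QueuePlan {
  const plan = planWithinBudget(candidates, opts);
  if (!opts.dryRun) plan.toSubmit.forEach(submit);
  return plan;
}
```

Returning the same plan in both modes means the dry-run output is exactly what a real run would do, which is the property that makes a dry run worth trusting.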

CI/CD and Runbooks: The Unsexy 20% That Makes It Real

A microservice without a pipeline is a liability with good intentions. Fable shipped both halves of operability the same afternoon:

GitHub Actions: on PR — lint, type-check, unit tests, terraform plan posted as a PR comment; on merge — gated terraform apply, Lambda deploy, and a canary that creates a real playback session against production and fails the deploy if time-to-first-byte on the manifest regresses.
Runbooks in the repo (/runbooks): Onboarding a new service (the playback-session contract, with copy-paste client code), Live stream operations (create channel, rotate stream key, end-of-stream archive), Transcode failure triage (where MediaConvert errors land, how to requeue one asset), and Cost monitoring (the CloudWatch dashboard + budget alarms it provisioned).
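The canary's pass/fail rule is worth spelling out, because "regresses" needs a definition. Here is one plausible version, assuming the canary collects a few samples of manifest status and time-to-first-byte and compares the median against a stored baseline. The 20% tolerance, the names, and the use of a median are my assumptions, not details from the pipeline Fable wrote.

```typescript
type CanarySample = { manifestStatus: number; ttfbMs: number };
type CanaryVerdict = { pass: boolean; reason: string };

// Median is less sensitive to a single slow sample than the mean.
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Fail the deploy on any failed manifest request or on a TTFB regression
// beyond the allowed fraction of the baseline.
function canaryVerdict(
  samples: CanarySample[],
  baselineTtfbMs: number,
  allowedRegression = 0.2
): CanaryVerdict {
  if (samples.length === 0) return { pass: false, reason: "no samples collected" };
  if (samples.some((s) => s.manifestStatus !== 200)) {
    return { pass: false, reason: "manifest request failed" };
  }
  const observed = median(samples.map((s) => s.ttfbMs));
  const limit = baselineTtfbMs * (1 + allowedRegression);
  return observed <= limit
    ? { pass: true, reason: "median TTFB " + observed + "ms within limit" }
    : { pass: false, reason: "median TTFB " + observed + "ms exceeds limit" };
}
```

Treating "no samples" as a failure matters: a canary that silently collected nothing should block a deploy, not wave it through.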

The runbooks are why integrating my first two services took an hour instead of a week of Slack archaeology. The third service was onboarded by a teammate without talking to me at all — they read the runbook Fable wrote, called one endpoint, and shipped. That’s the real productivity story: the agent didn’t just write code, it wrote down how to use the code, which is the part humans chronically skip.

What I Actually Did All Day

Let’s be precise about the human role, because this is where most coverage of agentic coding gets hand-wavy. Every one of my 17 inputs was a product trade-off, a cost ceiling, or a risk-tolerance call — never a technical how-to. These five carried the most weight:

THE DECISION LOG — WHERE THE HUMAN EARNED THEIR KEEP

Fable framed each question with the trade-offs already researched; my job was to bring the context only I had.

THE QUESTION 01

HLS only, or HLS + DASH?
Protocol

THE CALL

HLS-only for v1, skip DASH entirely.

WHY IT WAS MINE TO MAKE

My audience skews mobile and Safari, where HLS is native. One protocol halves the test matrix — and DASH can be added later without re-transcoding.

THE QUESTION 02

AWS IVS or MediaLive for the realtime path?
Realtime

THE CALL

IVS — managed ingest, 2–5 second latency.

WHY IT WAS MINE TO MAKE

Pure cost/ops trade-off. Sub-second latency wasn't worth 10x the operational surface for my use case. Only I knew the latency my product actually needs.

THE QUESTION 03

Signed URLs or signed cookies for playback?
Security

THE CALL

Signed cookies, scoped per playback session.

WHY IT WAS MINE TO MAKE

Product stance: paid content shouldn't be hot-linkable, and per-segment URL signing wrecks CDN cache efficiency. A judgment call about the business, not the tech.

THE QUESTION 04

Lambda or Fargate for the control plane?
Compute

THE CALL

Lambda, accepting cold starts.

WHY IT WAS MINE TO MAKE

My traffic is spiky with long idle valleys — scale-to-zero wins. Someone with steady traffic should choose the opposite, which is exactly why the agent asked.

THE QUESTION 05

How aggressive should the migration cutover be?
Migration

THE CALL

Dual-read fallback until verified failures hit zero.

WHY IT WAS MINE TO MAKE

Risk tolerance is a business decision. I chose the slow-and-safe path because live users were consuming this media during the migration.

Plus the review passes: each design doc, the Terraform plan before the first apply, the IAM policies line by line, and the final diff. Call it two hours of genuine attention across the nine.

Notice what’s not on the list: I never wrote a handler, never created a resource in the AWS console (my single console visit was confirming the budget alarm fired during a test), never debugged a YAML indentation error. Every hour I spent was on decisions that needed my context — which is exactly the division of labor I’ve been arguing the agent landscape was heading toward. It also confirmed something smaller but important: a well-maintained CLAUDE.md with your conventions is the cheapest force multiplier in this whole setup — Fable followed my repo conventions because they were written down where it looks.

Where It Stumbled (Because Nothing Is Magic)

Honesty section. Three real friction points:

First MediaConvert ladder was over-provisioned. The draft included a 4K rendition my content doesn’t have sources for. Caught in LLD review — a 30-second fix at doc stage, but it would have quietly doubled transcode costs if I’d rubber-stamped it.
One quota assumption. Fable assumed default MediaConvert concurrent-job quotas; my account had a lower legacy limit. The migration’s capped queue absorbed it gracefully (jobs just drained slower), and Fable filed the quota-increase request when the throttling showed up in logs — but it discovered the limit by hitting it, not by checking first.
It’s only as good as your brief. I forgot to mention that some legacy audio was in a deprecated codec. The verification stage caught all of them, but a better inventory in my brief would have saved a requeue cycle.
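The quota surprise in particular has a cheap preflight. The sketch below clamps the requested concurrency to what the account allows before any job is queued. It assumes the quota value has already been looked up (for example through the Service Quotas API) and is passed in, and `planConcurrency` and its fields are hypothetical names.

```typescript
type ConcurrencyPlan = { cap: number; clamped: boolean; note: string };

// Clamp the requested cap to the account quota, minus anything reserved
// for other workloads, and say so when clamping happens.
function planConcurrency(
  requestedCap: number,
  accountQuota: number,
  reservedForOtherWork = 0
): ConcurrencyPlan {
  const available = Math.max(1, accountQuota - reservedForOtherWork);
  if (requestedCap <= available) {
    return { cap: requestedCap, clamped: false, note: "requested cap fits within quota" };
  }
  return {
    cap: available,
    clamped: true,
    note:
      "requested " + requestedCap + " but only " + available +
      " available; consider a quota increase",
  };
}
```

Surfacing the clamp as a note at startup turns "discovered by throttling" into a line in the first log message.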

None of these are “AI wrote bad code” stories. They’re the same integration realities any senior engineer hits — the difference is the system was designed (by the agent, in the TRD) to surface them loudly instead of corrupting silently.

Best Practices: How to Run an Auto-Mode Build

THE AUTO-MODE PLAYBOOK

Write a one-page brief with constraints and non-goals; ambiguity in, ambiguity out (critical)
Demand PRD → TRD → HLD → LLD before code, and actually read them; doc review is where you earn your salary (critical)
Set explicit cost ceilings in the brief; the agent will engineer them in as requirements (high)
Review IAM policies and the first terraform plan line by line, even if you skim everything else (critical)
Require idempotent, resumable, verifying migrations; never one-shot scripts over production data (high)
Make the agent write runbooks while context is hot; docs written same-day are docs that exist (high)
Keep a dual-read fallback through any cutover, and remove it only on measured zero failures (medium)
Run an independent code-review pass over the final diff: trust, then verify (medium)

FAQ

Questions readers usually have

The questions I've been asked since posting the before/after numbers.

Final Take

The headline isn’t “AI wrote a lot of code fast.” Code generation has been cheap for two years. The headline is that Claude Fable 5 ran an engineering process — requirements, design review, infrastructure, security posture, migration safety, operations docs — and the process is what made one-day delivery survivable instead of reckless.

My role didn’t shrink; it concentrated. Seventeen decisions and two hours of review were the entire human footprint, but they were the seventeen decisions that determine whether this service is still standing in two years. That’s the trade every senior engineer should want.

If you’re going to try this, start with the playbook above and a service you actually need — not a toy. The toys don’t force the migration, the security review, or the runbooks, and those are exactly where auto mode earns its keep.

If you found this useful, read spec-driven development with AI agents next — it’s the methodology that makes builds like this one repeatable instead of lucky.

Written for umesh-malik.com — no-fluff technical writing on AI, Web Dev, and Engineering.
