Deploy an MCP Server on Cloudflare Workers (Free, Stateless, at the Edge)

Deploy an MCP server on Cloudflare Workers: wrangler.toml, the run_worker_first model, routing /mcp, local testing, and going live on the free tier.

Deploying a Model Context Protocol server to the edge on Cloudflare Workers

You built an MCP server — a JSON-RPC handler with a few well-described tools. Now it has to live somewhere an agent can reach it, 24/7, without you babysitting a server. Cloudflare Workers is close to the ideal host for this, and most of the reasons come down to one property of a read-only MCP server: it’s stateless.

This is the deployment half of the story. If you haven’t written the server logic yet, start with How to Build a Production MCP Server — this post picks up where that one ends and gets it onto the edge, on the free tier, on your own domain.

I run exactly this setup for my own site. Here’s the whole thing.

TL;DR

  • A read-only MCP server is stateless, which is precisely what edge runtimes do best — so Workers is a natural fit, not a compromise.
  • The entire deploy is a wrangler.toml, one wrangler deploy, and a route check for /mcp.
  • run_worker_first = true is the setting people miss — it lets your Worker intercept /mcp before the static-assets binding serves a file.
  • Wrangler needs Node.js 22+. This is the single most common “it works in CI but not on my machine” gotcha.
  • The free tier (100k requests/day) comfortably covers a personal or documentation MCP server.

Why Workers is the right host

The defining trait of a read-only MCP server — one whose tools only fetch data — is that it holds no state between requests. Every tools/call is self-contained. That single fact knocks out the usual reasons you’d reach for a long-lived Node process:

  • No session store, so nothing to persist between requests.
  • No warm-up, so cold starts don’t hurt — there’s no database connection pool to spin up.
  • Embarrassingly parallel, so horizontal scaling is automatic.

Stateless request/response at global scale is the edge-function sweet spot. Add the practical wins — runs in 300+ locations near your users, scales to zero when idle, and the free tier handles 100,000 requests/day — and Workers stops being a creative choice and becomes the obvious one.

💡 Key insight: Don’t add a database or sessions to an MCP server that only reads. Statelessness isn’t a limitation here — it’s the feature that makes edge hosting trivial.
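
To make "stateless" concrete, here is a minimal sketch of a JSON-RPC dispatcher written as a pure function of the request. The tool name `list_posts` and the `handleRpc` function are hypothetical stand-ins, not the server from the companion post; the point is only that nothing survives between calls.

```typescript
type RpcId = number | string | null;
type JsonRpcRequest = { jsonrpc: "2.0"; id: RpcId; method: string; params?: unknown };
type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: RpcId; result: unknown }
  | { jsonrpc: "2.0"; id: RpcId; error: { code: number; message: string } };

// Hypothetical, read-only tool list. It is a constant, not mutable state.
const TOOLS = [
  { name: "list_posts", description: "List published posts (illustrative tool)" },
];

// Pure function of the request: no session lookup, no module-level mutation.
function handleRpc(req: JsonRpcRequest): JsonRpcResponse {
  switch (req.method) {
    case "tools/list":
      return { jsonrpc: "2.0", id: req.id, result: { tools: TOOLS } };
    default:
      // -32601 is the JSON-RPC 2.0 "Method not found" error code.
      return {
        jsonrpc: "2.0",
        id: req.id,
        error: { code: -32601, message: `Method not found: ${req.method}` },
      };
  }
}
```

Because the same input always yields the same output, any edge location can answer any request without coordinating with another.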

The whole config: wrangler.toml

Here’s the real wrangler.toml running my server. It does three jobs: point at the Worker, bind the static assets, and run the Worker first.

name = "my-site"
compatibility_date = "2024-01-01"
main = "worker/index.ts"

[assets]
directory = "./build"
binding = "ASSETS"
# Run the Worker before serving static assets so our routes (like /mcp)
# are intercepted before the assets binding can short-circuit them.
run_worker_first = true

That’s the core of it. main is your Worker entry. The [assets] block lets the same Worker also serve a static site from ./build — handy if, like me, your MCP server lives alongside a real website. If your server is standalone, you can drop the assets block entirely.

The setting everyone misses: run_worker_first

When you attach a static-assets binding, Cloudflare’s default is to check for a matching file first and only fall through to your Worker if there’s no file. That’s great for a plain static site — and quietly broken for an API route.

Without run_worker_first = true, a request to /mcp can get intercepted by the assets layer before your Worker ever sees it. Set it to true and the order flips: your Worker runs first, handles /mcp, and explicitly serves static files for everything else via env.ASSETS.fetch().

If you ever see your MCP endpoint returning a 404 or an HTML page instead of JSON-RPC, this flag is the first thing to check.
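
If you want that check in a test script, it can be sketched as a small predicate. The function name is hypothetical, and it assumes your server answers successful calls with something other than HTML, which is true of JSON-RPC responses.

```typescript
// Returns true when a response from /mcp looks like it came from the
// static-assets layer (a 404 or an HTML page) rather than from the Worker.
function looksLikeAssetsResponse(status: number, contentType: string | null): boolean {
  const type = (contentType ?? "").toLowerCase();
  return status === 404 || type.includes("text/html");
}
```

A `true` result is only a hint that `run_worker_first` may be missing; a routing bug inside the Worker can produce the same symptoms.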

Routing the endpoint

With the Worker running first, routing is a path check at the top of fetch, before the asset fallback:

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    // MCP endpoints — handled before anything else
    if (url.pathname === '/mcp') {
      return handleMcp(request, env.ASSETS);
    }
    if (url.pathname === '/.well-known/mcp/server-card.json') {
      return mcpServerCard(request);
    }

    // Everything else: serve the static site
    return env.ASSETS.fetch(request);
  }
} satisfies ExportedHandler<Env>;

Notice handleMcp receives env.ASSETS. That’s deliberate: my tools are backed by files the site already publishes (a JSON feed, Markdown pages), and the Worker reads them through the same assets binding. One source of truth, zero duplicated data — the deployment story and the data story are the same story.

// inside a tool: read an asset the site already serves
const res = await assets.fetch(new URL('/feed.json', origin));
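
Expanded into a full helper, that one-liner might look like the sketch below. `AssetsBinding` is a minimal stand-in for the binding's type so the example compiles on its own, and `readFeed` is a hypothetical name; only the `/feed.json` path and the `assets.fetch(new URL(...))` call come from the snippet above.

```typescript
// Minimal stand-in for the type of the Workers assets binding.
interface AssetsBinding {
  fetch(input: Request | URL | string): Promise<Response>;
}

// Read the feed the site already publishes, using the incoming request's
// origin so the same code works on localhost, previews, and production.
async function readFeed(request: Request, assets: AssetsBinding): Promise<unknown> {
  const origin = new URL(request.url).origin;
  const res = await assets.fetch(new URL("/feed.json", origin));
  if (!res.ok) {
    throw new Error(`feed.json not available: ${res.status}`);
  }
  return res.json();
}
```

Taking the binding as a parameter also makes the tool easy to test: pass in a stub object with a `fetch` method and no Worker runtime is needed.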

Local development

Test before you ship. wrangler dev runs the Worker and serves the static assets locally:

npx wrangler dev
# Ready on http://localhost:8787

Then exercise it with curl — no special client needed:

curl -s -X POST http://localhost:8787/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
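
If you would rather script the check than retype curl commands, the same idea can be sketched in TypeScript. Everything here is an assumption for illustration: the helper names, the `2025-03-26` protocol version string, the client name, and the empty tool arguments. The fetch function is injected so the script can be exercised without a running server.

```typescript
type RpcCall = { jsonrpc: "2.0"; id: number; method: string; params?: unknown };

// The subset of fetch this helper needs; the global fetch satisfies it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number }>;

// The three lifecycle requests, in the order a client would send them.
function lifecycleRequests(toolName: string): RpcCall[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2025-03-26", // assumed; match your server
        capabilities: {},
        clientInfo: { name: "smoke-test", version: "0.0.0" },
      },
    },
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
    { jsonrpc: "2.0", id: 3, method: "tools/call", params: { name: toolName, arguments: {} } },
  ];
}

// POST each request in turn and collect the HTTP status codes.
async function smokeTest(endpoint: string, toolName: string, doFetch: FetchLike): Promise<number[]> {
  const statuses: number[] = [];
  for (const body of lifecycleRequests(toolName)) {
    const res = await doFetch(endpoint, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify(body),
    });
    statuses.push(res.status);
  }
  return statuses;
}
```

Against a local dev server you would call something like `smokeTest("http://localhost:8787/mcp", "your_tool", fetch)`. A stricter client would also send the `notifications/initialized` notification after `initialize`; a stateless server may not need it.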

⚠️ The Node version gotcha: recent Wrangler (v4+) requires Node.js 22 or newer. If wrangler dev or wrangler deploy errors with a version complaint, you’re on an older Node. Switch with nvm use 22 (or fnm). This is the number-one reason a deploy works in CI but fails locally.

Going live

Two ways, pick one:

Manual deploy — one command:

npx wrangler deploy

It bundles the Worker (esbuild, no config needed), uploads your ./build assets, and your server is live globally in seconds.

Git integration (what I use) — connect the repo in the Cloudflare dashboard and every push to main builds and deploys automatically. The build command runs your site build, and the Worker deploys alongside it. After that, publishing is just git push.

Either way, your MCP endpoint is live at https://yourdomain.com/mcp — on your own domain, because the Worker is serving that domain. No separate subdomain, no extra DNS.

Common mistakes

  • Forgetting run_worker_first. Your /mcp route returns HTML or 404 because the assets binding ate the request. The fix is one line.
  • Running an old Node. Wrangler v4 needs Node 22+. The error is clear once you read it, but easy to miss in CI logs.
  • Adding state you don’t need. Durable Objects and KV are great tools — and overkill for a read-only server. Stay stateless until a tool genuinely requires continuity.
  • Not handling OPTIONS/CORS. Browser-based MCP clients send a preflight. Return CORS headers and handle OPTIONS, or those clients silently fail.
  • Hardcoding the origin. Build asset URLs from the incoming request’s origin so the same code works on localhost, preview deploys, and production.
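
The OPTIONS/CORS item is the one most likely to need code, so here is a minimal sketch. The helper names are hypothetical, and the wide-open `*` origin and the short header list are assumptions that suit a public, read-only server; tighten or extend them for your own clients.

```typescript
// Assumed policy: any origin may POST JSON to the endpoint.
const CORS_HEADERS: Record<string, string> = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "POST, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type",
};

// Answer a preflight directly; return null for any other method so the
// caller can continue with normal request handling.
function preflight(request: Request): Response | null {
  if (request.method !== "OPTIONS") return null;
  return new Response(null, { status: 204, headers: CORS_HEADERS });
}

// Copy a response and add the CORS headers to it.
function withCors(response: Response): Response {
  const headers = new Headers(response.headers);
  for (const [name, value] of Object.entries(CORS_HEADERS)) {
    headers.set(name, value);
  }
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

In the `/mcp` branch, one way to wire it up is to return `preflight(request)` when it is non-null and otherwise wrap the handler's response in `withCors(...)`.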

Best practices

  1. Stay stateless. It’s the whole reason Workers fits. Earn your way into KV/Durable Objects only when a tool needs memory.
  2. Reuse the assets binding for data. If your server sits alongside a site, read the files it already publishes instead of duplicating content.
  3. Cache where you can. Read-only tool data is cacheable — set Cache-Control on responses backed by static assets.
  4. Pin your Node version. Document Node 22+ in your README and CI so “works on my machine” stays true everywhere.
  5. Test the lifecycle locally. initialize → tools/list → tools/call, plus the edges, against wrangler dev before every deploy.
  6. Use your own domain. Serving /mcp from your primary domain is a stronger trust and discovery signal than a throwaway subdomain.
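
For the caching item, a small response helper is usually enough. This is a sketch under assumptions: `jsonWithCache` is a hypothetical name and the 300-second default is an arbitrary illustrative value.

```typescript
// Build a JSON response that caches are allowed to store for a while.
function jsonWithCache(body: unknown, maxAgeSeconds = 300): Response {
  return new Response(JSON.stringify(body), {
    headers: {
      "Content-Type": "application/json",
      "Cache-Control": `public, max-age=${maxAgeSeconds}`,
    },
  });
}
```

This pays off mostly on GET routes such as the server card, since shared caches generally do not store responses to POST requests like those sent to `/mcp`.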

Conclusion

Hosting an MCP server sounds like infrastructure work and turns out to be a config file. The reason it’s that easy is the reason worth internalizing: a read-only MCP server is stateless, and stateless request/response at global scale is exactly what the edge is for. wrangler.toml, run_worker_first, one deploy, your own domain. That’s it.

Build the server logic in How to Build a Production MCP Server, then ship it with this. For where MCP fits in the bigger agent picture, see AI Coding Agents — Agentic AI for Developers and LLM Engineering, or read the Cloudflare Workers docs for the platform details.
