MCP cost — what a tool call actually costs (2026 guide)

The cost equation

Every MCP tool call has two cost dimensions: the input tokens the model sends to the server (the tool arguments + context) and the output tokens the server returns. Both are billed by the LLM provider. The formula:

cost_usd = (input_tokens / 1_000_000) * input_rate
         + (output_tokens / 1_000_000) * output_rate

With Claude Sonnet 4.6, the most common model in agentic workflows, rates are $3 per million input tokens and $15 per million output tokens. A single Playwright browser_snapshot call that returns a 15K-token DOM payload therefore costs about $0.225 for the output alone (15,000 / 1,000,000 × $15), plus a couple more cents for the round-trip context.
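To make the arithmetic easy to rerun with your own numbers, here is a minimal Python sketch of the cost equation. The function and constant names are illustrative, not part of any MCPSpend API; the rates are the Sonnet figures quoted above.

```python
# Rates in USD per million tokens, as quoted above for Claude Sonnet 4.6.
SONNET_INPUT_RATE = 3.00
SONNET_OUTPUT_RATE = 15.00


def tool_call_cost_usd(input_tokens, output_tokens,
                       input_rate=SONNET_INPUT_RATE,
                       output_rate=SONNET_OUTPUT_RATE):
    """Apply the cost equation: (tokens / 1M) * rate, summed over both sides."""
    input_cost = (input_tokens / 1_000_000) * input_rate
    output_cost = (output_tokens / 1_000_000) * output_rate
    return input_cost + output_cost


# The browser_snapshot example: a 15K-token payload, counting the output only.
snapshot_output_cost = tool_call_cost_usd(0, 15_000)
print(f"${snapshot_output_cost:.3f}")  # $0.225
```

Passing different rates as keyword arguments lets you compare models without changing the function.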

Per-server averages (real telemetry)

These averages come from anonymized aggregated data across MCPSpend users. Heavy users see more, lighter users less — but the relative ranking is stable across every account we've looked at.

Server	Avg input tokens / call	Avg output tokens / call	Typical % of monthly spend
playwright (browser tools)	2,000	12,000	40-60%
fetch / read-url	300	8,000	15-25%
filesystem (read_file)	300	4,000	10-15%
github (diff/issues)	500	6,000	5-10%
brave-search / web-search	300	3,000	5-10%
postgres / database	400	2,500	3-8%
slack / linear / notion	500	1,500	2-5%
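As a rough illustration, the table's token averages can be turned into a per-call dollar figure with the cost equation and the Sonnet rates quoted earlier. The dictionary keys below are shortened labels chosen for this sketch, and the result covers a single call only; share of monthly spend also depends on how often each server is called, which this does not model.

```python
# Sonnet rates quoted in this guide, in USD per million tokens.
INPUT_RATE, OUTPUT_RATE = 3.00, 15.00

# (avg input tokens, avg output tokens) per call, copied from the table above.
SERVER_AVERAGES = {
    "playwright": (2_000, 12_000),
    "fetch": (300, 8_000),
    "filesystem": (300, 4_000),
    "github": (500, 6_000),
    "web-search": (300, 3_000),
    "postgres": (400, 2_500),
    "slack/linear/notion": (500, 1_500),
}


def cost_per_call(input_tokens, output_tokens):
    """Cost of one call in USD under the guide's cost equation."""
    return ((input_tokens / 1_000_000) * INPUT_RATE
            + (output_tokens / 1_000_000) * OUTPUT_RATE)


per_call = {name: cost_per_call(*tokens)
            for name, tokens in SERVER_AVERAGES.items()}

# Print servers from most to least expensive per call.
for name, usd in sorted(per_call.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22} ${usd:.4f}")
```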

Why browser tools dominate

Playwright and Windsurf's cascade-browser return full DOM snapshots of every page the agent visits. Modern web pages are dense — a single React-rendered SaaS dashboard can push 20-30K tokens into the context window per snapshot.

Multiply by the 5-10 page navigations a typical research task involves, and one task can run $0.50-$2.00 just on browser tools. Most users have no idea this is happening because provider invoices show only the aggregate.
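A quick sanity check of that range, assuming each navigation costs one snapshot at the Playwright averages from the table (2,000 input and 12,000 output tokens) and the Sonnet rates quoted earlier. The function name and defaults are illustrative.

```python
# Sonnet rates quoted in this guide, in USD per million tokens.
INPUT_RATE, OUTPUT_RATE = 3.00, 15.00


def browser_task_cost(navigations, input_tokens=2_000, output_tokens=12_000):
    """Cost of a task that takes one snapshot per page navigation."""
    per_snapshot = ((input_tokens / 1_000_000) * INPUT_RATE
                    + (output_tokens / 1_000_000) * OUTPUT_RATE)
    return navigations * per_snapshot


low = browser_task_cost(5)
high = browser_task_cost(10)
print(f"${low:.2f} to ${high:.2f}")  # $0.93 to $1.86
```

Denser pages scale the figure proportionally: pass a larger output_tokens value to model 20-30K-token snapshots.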

What to optimize, in order

Cap Playwright DOM payloads. Most Playwright MCP servers accept a --max-bytes flag (or equivalent). 8KB is usually enough for navigation decisions; the agent rarely needs the whole DOM.
Disable browser MCPs for non-browsing tasks. If the agent isn't actively scraping or interacting with a webpage, the browser server shouldn't be loaded. Toggle it on per-task.
Cache fetch responses. Reading the same URL twice in a conversation is wasteful. A simple in-memory LRU layer on top of fetch cuts repeat costs by 30-40%.
Switch to Haiku for routine work. Claude Haiku 4.5 is $1 / $5 per M input / output tokens, roughly 3× cheaper than Sonnet. For file-reading, search, and routing tasks, Haiku is more than enough.
Set per-agent budgets. Background agents (CI jobs, scheduled scrapers) can run away — a budget alert at 80% saves you from a surprise bill.
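Two of the items above, caching fetch responses and alerting at 80% of a budget, are simple enough to sketch. This is one possible shape under stated assumptions, not how any particular MCP server or MCPSpend implements them: LRUFetchCache, BudgetTracker, and fake_fetch are hypothetical names, and the fetch function is a stand-in you would replace with a real fetcher.

```python
from collections import OrderedDict


class LRUFetchCache:
    """Tiny in-memory LRU cache keyed by URL (hypothetical helper)."""

    def __init__(self, fetch_fn, max_entries=128):
        self._fetch_fn = fetch_fn          # callable(url) -> str, supplied by caller
        self._max_entries = max_entries
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._entries:
            self._entries.move_to_end(url)     # mark as most recently used
            self.hits += 1
            return self._entries[url]
        self.misses += 1
        body = self._fetch_fn(url)
        self._entries[url] = body
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return body


class BudgetTracker:
    """Accumulates spend and signals once when the alert threshold is crossed."""

    def __init__(self, budget_usd, alert_fraction=0.8):
        self.threshold_usd = budget_usd * alert_fraction
        self.spent_usd = 0.0
        self.alerted = False

    def record(self, cost_usd):
        """Add a cost; return True only the first time spend reaches the threshold."""
        self.spent_usd += cost_usd
        if not self.alerted and self.spent_usd >= self.threshold_usd:
            self.alerted = True
            return True
        return False


# Demo with a stand-in fetcher that records every real (uncached) fetch.
real_fetches = []

def fake_fetch(url):
    real_fetches.append(url)
    return f"<html>{url}</html>"

cache = LRUFetchCache(fake_fetch, max_entries=2)
cache.get("https://example.com/a")
cache.get("https://example.com/a")   # second read is served from the cache

budget = BudgetTracker(budget_usd=10.0)
first = budget.record(7.0)    # below the 80% threshold
second = budget.record(1.5)   # crosses $8.00, so this call signals the alert
print(len(real_fetches), cache.hits, first, second)
```

The cache never expires entries by age, so it suits a single conversation rather than long-lived processes where pages may change.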

How MCPSpend measures this

We're a transparent stdio proxy. Your MCP client (Claude Desktop, Cursor, Windsurf, VS Code, Claude Code) launches us as a subprocess; we launch your real MCP server as our subprocess, and pipe stdin/stdout between them unchanged. Every JSON-RPC frame is observed as it flows. Each tool-call request is timestamped; the matching response gives us the latency and the byte counts. Cost is computed using public provider rates.

We never read tool arguments or tool responses — only the envelope (server name, tool name, timing, byte counts). Source is MIT-licensed at github.com/andreisirbu91-lab/MCPSpend and you can audit the ingest payload yourself.
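The envelope-only measurement described above can be sketched roughly as follows. This is not MCPSpend's actual code: EnvelopeObserver and estimate_tokens are hypothetical names, and the 4-bytes-per-token ratio is a crude assumption used only to show how byte counts could feed a cost estimate. The sketch parses each frame to find the JSON-RPC id, method, and tool name, but keeps only metadata and discards the parsed message.

```python
import json
import time


def estimate_tokens(byte_count, bytes_per_token=4):
    """Very rough byte-to-token conversion (assumed ratio, not a real tokenizer)."""
    return byte_count // bytes_per_token


class EnvelopeObserver:
    """Records metadata about tools/call traffic; never stores arguments or results."""

    def __init__(self, server_name, clock=time.monotonic):
        self.server_name = server_name
        self._clock = clock
        self._pending = {}   # JSON-RPC id -> (tool name, start time, request bytes)
        self.records = []

    def on_request(self, raw):
        msg = json.loads(raw)
        if isinstance(msg, dict) and msg.get("method") == "tools/call" and "id" in msg:
            tool = msg.get("params", {}).get("name", "unknown")
            self._pending[msg["id"]] = (tool, self._clock(), len(raw))

    def on_response(self, raw):
        msg = json.loads(raw)
        if not isinstance(msg, dict):
            return
        pending = self._pending.pop(msg.get("id"), None)
        if pending is None:
            return           # not a response to a tracked tool call
        tool, started, request_bytes = pending
        self.records.append({
            "server": self.server_name,
            "tool": tool,
            "latency_ms": (self._clock() - started) * 1000,
            "request_bytes": request_bytes,
            "response_bytes": len(raw),
        })


# Demo with a fake clock so the latency is deterministic.
ticks = iter([1.0, 1.25])
observer = EnvelopeObserver("playwright", clock=lambda: next(ticks))

request = json.dumps({"jsonrpc": "2.0", "id": 7, "method": "tools/call",
                      "params": {"name": "browser_snapshot", "arguments": {}}}).encode()
response = json.dumps({"jsonrpc": "2.0", "id": 7,
                       "result": {"content": [{"type": "text", "text": "x" * 400}]}}).encode()

observer.on_request(request)
observer.on_response(response)
record = observer.records[0]
print(record["tool"], record["latency_ms"], estimate_tokens(record["response_bytes"]))
```

Matching on the JSON-RPC id is what ties a response back to its request, since the response frame itself carries no tool name.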

Estimate your bill in 30 seconds

The free calculator takes your active agent hours and the MCP servers you use, then projects monthly spend using these real averages. No signup, no email.

Or install the proxy for exact numbers from your actual traffic:

npx --yes @mcpspend/proxy@latest init --key mcps_live_xxx

25,000 tool calls per month on the free tier. No card.
