Delv

MCP Token Ledger

Every MCP server you connect injects its tools/list response into the model context on every turn. This page measures that cost so you can budget your window before the model starts drifting. Heavy servers are often the reason a stack feels sluggish: a single 7,000-token MCP server can consume more context than a typical short conversation.

Entries measured: 18
Heaviest: 7,400 tok (github-official)
Combined cost: 55,951 tok (if you installed all of these)
Tokenizer: est. 3.5 ch/tok (harness + editorial seeds)

| Server | Category | Tools | Tokens | Cost |
|---|---|---|---|---|
| GitHub (Official) | DEV_TOOLS | 27 | 7,400 | Heavy |
| Playwright (Microsoft) | BROWSER | 32 | 6,800 | Heavy |
| Stripe | INTEGRATION | 24 | 5,100 | Medium |
| Jira (Atlassian) | DEV_TOOLS | 24 | 4,800 | Medium |
| Linear | PRODUCTIVITY | 22 | 4,200 | Medium |
| Slack | COMMUNICATIONS | 18 | 3,800 | Medium |
| Filesystem | FILESYSTEM | 14 | 3,493 | Medium |
| Notion | PRODUCTIVITY | 16 | 3,400 | Medium |
| Google Drive | FILESYSTEM | 14 | 3,100 | Medium |
| Memory | MEMORY | 9 | 2,802 | Light |
| Gmail | COMMUNICATIONS | 11 | 2,700 | Light |
| Sentry | DEV_TOOLS | 12 | 2,400 | Light |
| Supabase | DATABASE | 10 | 2,200 | Light |
| sequentialthinking | | 1 | 1,287 | Light |
| Postgres | DATABASE | 6 | 1,100 | Light |
| Puppeteer | BROWSER | 7 | 699 | Trivial |
| Brave Search | SEARCH | 2 | 415 | Trivial |
| everart | | 1 | 255 | Trivial |

Methodology

Harness measurements spawn each MCP server over stdio, send the initialize and tools/list JSON-RPC messages, and count tokens in the serialised tools/list response. Tokens are estimated at roughly 3.5 characters per token. The estimate is applied consistently across servers, stays within 15% of exact tokenizer output, and is good enough for ranking.
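
The harness steps described above can be sketched roughly as follows. This is a minimal illustration, not the ledger's actual harness: the function names (`estimate_tokens`, `tools_response_tokens`, `measure_server`), the client name, the choice to round up, the choice to count only the `result` object of the reply, and the protocol version string are all assumptions. It also assumes the server speaks newline-delimited JSON-RPC on stdin/stdout.

```python
import json
import math
import subprocess

CHARS_PER_TOKEN = 3.5  # the page's stated estimate; not an exact tokenizer


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Estimate tokens from character count (rounding up is an assumption)."""
    return math.ceil(len(text) / chars_per_token)


def tools_response_tokens(tools_result):
    """Serialise a tools/list result compactly and estimate its token cost."""
    serialised = json.dumps(tools_result, separators=(",", ":"))
    return estimate_tokens(serialised)


def _send(proc, message):
    """Write one newline-delimited JSON-RPC message to the server's stdin."""
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()


def _read_reply(proc, request_id):
    """Read stdout lines until the response with the matching id arrives."""
    while True:
        line = proc.stdout.readline()
        if not line:
            raise RuntimeError("server closed stdout before replying")
        try:
            reply = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank lines or non-JSON noise
        if isinstance(reply, dict) and reply.get("id") == request_id:
            return reply


def measure_server(command):
    """Spawn a server over stdio and estimate its tools/list token cost.

    No timeout handling here: a hung server would block forever.
    """
    with subprocess.Popen(command, stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE, text=True) as proc:
        try:
            _send(proc, {
                "jsonrpc": "2.0", "id": 1, "method": "initialize",
                "params": {
                    "protocolVersion": "2024-11-05",  # assumed version string
                    "capabilities": {},
                    "clientInfo": {"name": "ledger-sketch", "version": "0.0.0"},
                },
            })
            _read_reply(proc, 1)
            _send(proc, {"jsonrpc": "2.0", "method": "notifications/initialized"})
            _send(proc, {"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
            reply = _read_reply(proc, 2)
            if "error" in reply:
                raise RuntimeError(f"tools/list failed: {reply['error']}")
            return tools_response_tokens(reply["result"])
        finally:
            proc.terminate()
```

To try it, pass whatever command launches your server as a list, for example `measure_server(["my-mcp-server"])`, where `my-mcp-server` is a hypothetical placeholder. The pure helpers can be used on their own if you already have a captured tools/list result.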

Editorial estimates are used for servers we can't spawn locally (commercial SaaS, auth-required, Python-only). They get replaced the moment we have real harness data.

Why this matters: context is a zero-sum resource. A 6,000-token MCP server on a 200K-context model costs 3% of your window every turn, every conversation, forever. Stacking the three heaviest servers in this ledger (19,300 tokens combined) burns nearly 10% of that window before the model has read a single user message.
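
The budgeting arithmetic can be checked with a few lines of Python. This is a small sketch using the token figures from the table on this page; the names `LEDGER`, `stack_cost`, and `window_share`, and the 200,000-token default window, are illustrative assumptions rather than part of any tool.

```python
# Token costs copied from the ledger table on this page.
LEDGER = {
    "GitHub (Official)": 7400,
    "Playwright (Microsoft)": 6800,
    "Stripe": 5100,
    "Jira (Atlassian)": 4800,
    "Linear": 4200,
    "Slack": 3800,
    "Filesystem": 3493,
    "Notion": 3400,
    "Google Drive": 3100,
    "Memory": 2802,
    "Gmail": 2700,
    "Sentry": 2400,
    "Supabase": 2200,
    "sequentialthinking": 1287,
    "Postgres": 1100,
    "Puppeteer": 699,
    "Brave Search": 415,
    "everart": 255,
}


def stack_cost(servers, ledger=LEDGER):
    """Total tokens injected per turn by the given set of servers."""
    return sum(ledger[name] for name in servers)


def window_share(tokens, context_window=200_000):
    """Percentage of the context window consumed by `tokens`."""
    return 100 * tokens / context_window


if __name__ == "__main__":
    heaviest_three = ["GitHub (Official)", "Playwright (Microsoft)", "Stripe"]
    print(f"6,000-token server: {window_share(6000):.2f}%")
    print(f"three heaviest:     {window_share(stack_cost(heaviest_three)):.2f}%")
    print(f"all 18 entries:     {window_share(stack_cost(LEDGER)):.2f}%")
```

Swap in your own model's window size via `context_window` to see how the same stack weighs on a smaller or larger context.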