Give Each Feature a Token Budget and Enforce It with max_tokens

20-50% on output-heavy features Intermediate 1 min read

Set an explicit per-feature output ceiling instead of leaving max_tokens at a huge default, and reserve big budgets only for features that truly need them.

🔒 Pro tip · Intermediate

Unlock this tip — and 105 more

This is one of 106 advanced, fact-checked tactics reserved for Pro. Get the full 128-tip library, a searchable archive, and a new tip every morning. Free for 7 days, then $9/mo.

Start your 7-day free trial Already Pro? Sign in

Prefer to browse? The 22 Beginner tips are free forever.

More in Measurement & Budgeting

📊Measurement & Budgeting 10-30% on prompts you would have sent bloated

Count Tokens Before You Hit Send, Not After the Bill Arrives

Measure a prompt's token count before sending it, so you catch oversized context while trimming it is still free.

Beginner Read →

📊Measurement & Budgeting Prevents runaway-loop and leaked-key blowups; bounds worst-case spend rather than reducing normal usage

Set Hard Spend Caps in the Provider Console

Configure provider-side usage limits, budgets, and alerts so a bug, a retry storm, or a leaked key cannot quietly run your bill far past a ceiling you set in advance.

Beginner Read →

📊Measurement & Budgeting Caps the tail: kills 10-50x blowouts on runaway agent sessions

Cap the loops by count, not tokens: spawn and search limits in Claude Code

Claude Code 2.1.212 added session caps on subagent spawns and WebSearch calls, both defaulting to a non-binding 200. Lower them to match your actual workload so a mis-planned agent hits a hard stop at 12 searches instead of quietly burning a full context window and the spend behind it.

Intermediate Read →