60 ways to spend fewer tokens
The 22 Beginner tips are free to read. The 38 advanced tactics unlock with Pro — plus a fresh tip in your inbox every morning.
Stop Paying for 'Please' and 'I Was Wondering If'
Conversational filler and apologetic framing get tokenized and billed like any other text. Strip the social padding and lead with the instruction.
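As a rough illustration of the tip above, here is a minimal sketch that strips a few filler phrases before a prompt is sent. The phrase list and the function names `strip_filler` and `rough_token_count` are made up for this example, and the word count is only a crude stand-in for a real tokenizer.

```python
import re

# Hypothetical filler phrases; extend the list for your own prompts.
FILLER = [
    r"\bplease\b",
    r"\bi was wondering if\b",
    r"\bsorry to bother you,? but\b",
    r"\bif it'?s not too much trouble\b",
]


def strip_filler(prompt: str) -> str:
    """Remove filler phrases, then collapse the whitespace left behind."""
    for pattern in FILLER:
        prompt = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    prompt = re.sub(r"\s+", " ", prompt)
    return prompt.strip(" ,")


def rough_token_count(text: str) -> int:
    """Crude proxy for token count: whitespace-separated words."""
    return len(text.split())


before = "Sorry to bother you, but I was wondering if you could please summarize this report"
after = strip_filler(before)
print(rough_token_count(before), "->", rough_token_count(after))
```

A blunt regex pass like this can mangle prompts where "please" is part of the content, so treat it as a way to spot padding rather than something to run unattended.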
Fence the Output Before It Wanders
State up front the constraints that usually trigger a do-over when they are missing, such as format, length, and "no preamble", so you don't pay for a second generation just to strip the preamble.
Ask for the Diff, Not the Director's Cut
When revising a long artifact, request only the changed lines as a patch instead of having the model reprint the whole thing.
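To see why a patch is cheaper than a reprint, the sketch below uses Python's standard `difflib` module to compare the size of a unified diff with the size of the full revised text. The helper name `as_patch` and the sample document are assumptions for illustration; a model asked for a patch would produce something of similar shape.

```python
import difflib


def as_patch(old: str, new: str) -> str:
    """Return a unified diff with one line of context around each change."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
        n=1,
    )
    return "\n".join(lines)


# A 50-line document where a single line is revised.
old = "\n".join(f"line {i}" for i in range(50))
new = old.replace("line 25", "line 25 changed")

patch = as_patch(old, new)
print(patch)
print(len(patch), "characters in the patch vs", len(new), "in the full reprint")
```

The saving grows with the length of the artifact: the patch stays roughly the size of the change, while a reprint scales with the whole document.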
Two Sharp Examples Beat Eight Bloated Ones
Few-shot examples are usually the heaviest part of a prompt. Trim each one to the minimum that demonstrates the pattern, and use the fewest that hold accuracy.
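One way to find "the fewest that hold accuracy" is to start at zero examples and add one at a time until a small evaluation set passes. The sketch below assumes a hypothetical `ask_model` callable that takes a prompt string and returns the model's text; the prompt layout, the exact-match scoring, and the names `build_prompt` and `fewest_shots` are likewise illustrative choices, not a prescribed method.

```python
from typing import Callable, Optional, Sequence, Tuple

Example = Tuple[str, str]  # (input text, expected output)


def build_prompt(instruction: str, shots: Sequence[Example], query: str) -> str:
    """Assemble instruction, few-shot examples, and the query into one prompt."""
    parts = [instruction]
    for text, label in shots:
        parts.append(f"Input: {text}\nOutput: {label}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


def fewest_shots(
    instruction: str,
    shots: Sequence[Example],
    eval_set: Sequence[Example],
    ask_model: Callable[[str], str],  # hypothetical: prompt in, model text out
    min_accuracy: float,
) -> Optional[int]:
    """Return the smallest number of examples (starting from zero) that
    reaches min_accuracy on eval_set, or None if no prefix of shots does."""
    for k in range(len(shots) + 1):
        correct = sum(
            ask_model(build_prompt(instruction, shots[:k], query)).strip() == expected
            for query, expected in eval_set
        )
        if correct / len(eval_set) >= min_accuracy:
            return k
    return None
```

Because the loop begins at `k = 0`, it also checks whether the instruction alone is enough before any examples are paid for.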
Reference Your Data, Don't Re-Paste It Every Turn
Re-pasting the same document or spec into every message silently multiplies your input cost: chat UIs and stateless APIs resend the conversation history on each turn, so every extra copy is billed again on every later call. Send it once and refer back.
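The arithmetic behind this tip can be sketched with a small simulation. It assumes the whole conversation history is sent as input on every call and ignores model replies for simplicity; the function name `total_input_tokens` and the token figures are invented for illustration.

```python
def total_input_tokens(
    doc_tokens: int, question_tokens: int, turns: int, repaste: bool
) -> int:
    """Sum the input tokens billed across a conversation.

    Assumes the full history is resent on each call (replies are ignored).
    """
    total = 0
    history = 0
    for turn in range(turns):
        include_doc = repaste or turn == 0  # the document always goes in turn 0
        history += question_tokens + (doc_tokens if include_doc else 0)
        total += history  # the whole history is input for this call
    return total


# Hypothetical sizes: a 2,000-token spec, 20-token questions, 5 turns.
sent_once = total_input_tokens(2000, 20, turns=5, repaste=False)
repasted = total_input_tokens(2000, 20, turns=5, repaste=True)
print(sent_once, repasted)
```

Under these assumptions the re-pasted conversation bills roughly three times the input of the send-once version after five turns, and the gap widens with every additional turn.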
Factor Your System Prompt and Cap the Output
Move stable rules into a reusable, cacheable system prompt once, and constrain the response so the model can't ramble — output tokens usually cost more per token than input.
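To make the output-cap half of this tip concrete, here is a back-of-the-envelope cost calculation. The prices are hypothetical placeholders (3 dollars per million input tokens, 15 per million output tokens), chosen only to reflect the pattern that output is priced higher; check your provider's actual rates. The function name `call_cost` is also made up for this sketch, and it does not model cache discounts.

```python
def call_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
) -> float:
    """Cost of one call in dollars; prices are per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical prices, not taken from any real price list.
INPUT_PRICE = 3.0
OUTPUT_PRICE = 15.0

rambling = call_cost(1000, 800, INPUT_PRICE, OUTPUT_PRICE)  # uncapped answer
capped = call_cost(1000, 200, INPUT_PRICE, OUTPUT_PRICE)    # answer capped at 200 tokens
print(f"uncapped: ${rambling:.4f}  capped: ${capped:.4f}")
```

With these placeholder numbers, trimming 600 output tokens saves more than the entire 1,000-token input costs, which is why an output limit is often the first thing worth setting.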
Try Zero-Shot Before You Pay for Examples
Examples are recurring input tokens on every call — test whether a crisp instruction does the job before you attach them by default.