Design a Compact Output Schema (and Skip the Pretty-Printing)

Often 20-50% fewer tokens on structured outputs vs. verbose JSON or prose tables · Advanced · 1 min read

When you need structured data, the shape you ask for directly determines token count. Short keys, no markdown scaffolding, and minified output cut tokens on every response, and again on the input side if you feed those responses back into later calls.
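As a rough illustration, the sketch below serializes the same data two ways: pretty-printed with long keys, and minified with short keys. The records, field names, and `KEY_MAP` are hypothetical, and character count stands in for token count only as a loose proxy; actual savings depend on the model's tokenizer.

```python
import json

# Hypothetical records; the field names are illustrative only.
records = [
    {"customer_name": "Ada", "order_total": 42.5, "is_priority": True},
    {"customer_name": "Lin", "order_total": 17.0, "is_priority": False},
]

# Assumed short-key mapping. The prompt would need to describe these
# keys to the model so it emits them consistently.
KEY_MAP = {"customer_name": "n", "order_total": "t", "is_priority": "p"}


def verbose(rows):
    """Pretty-printed JSON with the original long keys."""
    return json.dumps(rows, indent=2)


def compact(rows):
    """Minified JSON with short keys and no whitespace between tokens."""
    short = [{KEY_MAP[k]: v for k, v in row.items()} for row in rows]
    return json.dumps(short, separators=(",", ":"))


def expand(payload):
    """Restore the long keys on the application side after parsing."""
    inverse = {short: long for long, short in KEY_MAP.items()}
    return [
        {inverse[k]: v for k, v in row.items()}
        for row in json.loads(payload)
    ]


if __name__ == "__main__":
    v, c = verbose(records), compact(records)
    print(f"verbose: {len(v)} chars, compact: {len(c)} chars")
    print(c)
```

Expanding the short keys in your own code keeps the rest of the application readable while the model only ever produces the compact form.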


More in Output Control

📐 Output Control · Shrinks classification and routing outputs substantially, frequently 5-15x fewer output tokens per call

Return IDs and Enums, Not Sentences

For classification, routing, and selection tasks, have the model emit a short code, ID, or enum value instead of a polite sentence. The downstream code only needs the token, not the prose around it.

Beginner

📐 Output Control · Often 30-60% fewer output tokens on short tasks

Strip the Preamble: Ask for the Answer Only

Chat models love to restate your question, add caveats, and offer follow-ups. On high-volume tasks those wrapper tokens dominate the bill. Tell the model to return only the payload.

Beginner

📐 Output Control · Caps runaway costs; output tokens are typically 3-5x the input price

Set max_tokens as a Hard Cost Ceiling, Not an Afterthought

Output tokens are the expensive half of most API bills. Setting an explicit max_tokens on every API call turns an open-ended cost into a known maximum.

Beginner