Stop Paying Twice for Your Cache When a Refusal Forces a Model Switch

Re-prices the retried prefix from the ~1.25x cache-write rate to the ~0.1x read rate (no saving unless a cross-model refusal-retry actually fires) Advanced 2 min read

Prompt caches are per-model, so retrying a refused request on a fallback model normally re-writes the whole cached prefix at the write rate. Anthropic's fallback credit re-bills it as a cache read instead.

🔒 Pro tip · Advanced

Unlock this tip — and 50 more

This is one of 51 advanced, fact-checked tactics reserved for Pro. Get the full 73-tip library, a searchable archive, and a new tip every morning. Free for 7 days, then $9/mo.

Prefer to browse? The 22 Beginner tips are free forever.