Start a New Chat When the Topic Changes

Chat interfaces feel free-flowing, but under the hood they are stateless: most send the entire prior conversation back to the model on every new message. There is no server-side memory of your turns in a standard API or chat call — each request re-transmits the accumulated history as input tokens. So a 30-message thread means your latest question rides on top of 29 previous exchanges that get re-sent every time.

Before (wasteful):

(One 50-message thread) You debug a Python script, then brainstorm blog titles, then ask for a SQL query — all in the same chat. By message 50, every request re-sends 40+ irrelevant turns about Python and blog titles.

After (lean):

Finish the Python debugging in chat A. Move to blog titles? Open chat B. SQL query? Chat C. Each request carries only the history that is actually relevant.

Why it saves tokens

Input tokens scale with conversation length. If your thread is ~20k tokens, every follow-up re-sends roughly 20k input tokens before your new question is even processed. A fresh chat resets that baseline to near zero.

One important nuance: most major tools (ChatGPT, Claude, Gemini) apply prompt caching automatically, so a repeated history prefix is usually billed at a steep discount (on the order of one-tenth of normal input price) rather than full price. That softens the hit — but it doesn't erase it:

Cache reads still cost money, every turn, for content you no longer need.
Caches expire (commonly a 5-minute default), so an idle thread you return to pays full price again on the next message.
Bloated, off-topic context can also blur the model's focus and degrade answer quality.

So the win is real; it's just smaller per turn than the raw token count implies.

When to start fresh

The subject genuinely changed (new task, new file, new domain).
The model keeps "remembering" outdated decisions you've since reversed.
The thread is long enough that the tool warns you or responses slow down.

When NOT to

If the new task truly builds on earlier context, keep the thread — re-pasting that context into a new chat just forces a fresh (slightly pricier) cache write and re-sends it anyway. The win comes specifically from discarding history you no longer need. A good habit: one thread per coherent task. It keeps cost roughly proportional to relevance and tends to sharpen the model's focus too.

Start a New Chat When the Topic Changes

Why it saves tokens

When to start fresh

When NOT to

Get a fresh tip every morning

More in Context Management

Drop the Screenshot Once the Model Has Read It

Paste the Function, Not the Whole File

Run /context and Strip MCP Tool Schemas You Never Call