Usage & Billing
Tokens and context
Token usage and context limits determine both model performance and cost over the course of a session.
Token overview
Tokens are the billing and processing unit for model inputs and outputs. Every request consumes input tokens, and each response adds output tokens.
- A large context means more input tokens per turn, since the accumulated history is resent with each request.
- Long responses increase output token usage.
- Cached input, where available, bills the repeated portion of the input at a reduced rate.
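The billing breakdown above can be sketched as a small cost estimator. The rates and the cache discount below are placeholder assumptions for illustration, not any vendor's actual pricing.

```python
# Hypothetical pricing sketch: estimates request cost from token counts.
# input_rate / output_rate are dollars per million tokens; the values
# here are made-up placeholders, not real prices.

def estimate_cost(input_tokens, output_tokens, cached_tokens=0,
                  input_rate=3.00, output_rate=15.00, cache_discount=0.5):
    """Return estimated cost in dollars.

    cached_tokens is the portion of input_tokens billed at the
    discounted cached-input rate.
    """
    uncached = input_tokens - cached_tokens
    cost = (uncached * input_rate
            + cached_tokens * input_rate * cache_discount
            + output_tokens * output_rate) / 1_000_000
    return cost

# Example: 10k input tokens (8k served from cache), 1k output tokens.
print(f"${estimate_cost(10_000, 1_000, cached_tokens=8_000):.4f}")  # $0.0330
```

Note how the cached portion only discounts input cost; output tokens are billed in full regardless of caching.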
How to manage usage
- Scope each session tightly; avoid mixing unrelated topics in one conversation, since every turn resends the full history.
- Compact or summarize context as it approaches the limit, rather than letting it grow unchecked.
- Finish a task and start a fresh session for the next one, so stale context is not carried over and re-billed on every turn.
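The compaction step above can be sketched as trimming message history to a token budget. This is a minimal illustration with an assumed message format and a crude words-based token heuristic; real tokenizers count differently.

```python
# Sketch of context compaction: keep the system message plus the most
# recent messages that fit within a token budget. estimate_tokens is a
# rough heuristic, not a real tokenizer.

def estimate_tokens(text):
    # Rough approximation: ~1.3 tokens per whitespace-separated word.
    return int(len(text.split()) * 1.3) + 1

def compact(messages, budget):
    """Drop the oldest non-system messages until the history fits."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for m in reversed(rest):  # walk newest-first, keep what fits
        t = estimate_tokens(m["content"])
        if total + t > budget:
            break
        kept.append(m)
        total += t
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are concise."},
    {"role": "user", "content": "old question " * 50},
    {"role": "assistant", "content": "old answer " * 50},
    {"role": "user", "content": "current question"},
]
trimmed = compact(history, budget=60)
print([m["role"] for m in trimmed])  # → ['system', 'user']
```

A common refinement is to summarize the dropped turns into a single short message instead of discarding them outright, trading a small summarization cost for preserved context.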