Tokens and context

Token usage and context limits determine both model performance and cost over the course of a session.

Token overview

Tokens are the unit of both billing and processing for model inputs and outputs. Every request consumes input tokens, and every response produces output tokens.

  • A large context carried into each turn means a larger input token count per request.
  • Long responses increase output token usage.
  • Cached input, where available, bills repeated input tokens at a discounted rate.
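The cost relationship above can be sketched with a small estimator. The prices and the `estimate_cost` helper below are hypothetical, assumed for illustration; real per-token rates and caching discounts vary by model and provider.

```python
# Hypothetical prices in dollars per million tokens (not real rates).
PRICE_PER_M = {"input": 3.00, "cached_input": 0.30, "output": 15.00}

def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate a single request's cost in dollars from token counts.

    Cached tokens are billed at the discounted cached-input rate;
    the remaining input tokens are billed at the full input rate.
    """
    uncached = input_tokens - cached_tokens
    cost = (
        uncached * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cached_input"]
        + output_tokens * PRICE_PER_M["output"]
    ) / 1_000_000
    return round(cost, 6)
```

Under these assumed rates, a 10,000-token prompt with 8,000 cached tokens and a 1,000-token response would cost far less than the same prompt uncached, since only 2,000 input tokens are billed at the full rate.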

How to manage usage

  • Scope sessions tightly; avoid mixing unrelated topics in a single prompt.
  • Compact or summarize context as it approaches the limit.
  • Finish a task and start a fresh session for the next one, so earlier context does not carry over into every subsequent turn.
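The compaction step above can be sketched as a simple eviction loop. Both helpers are hypothetical illustrations: `approx_tokens` uses a rough characters-per-token heuristic (a real tokenizer should be used where available), and `compact` drops the oldest messages first, one possible policy among several (summarization is another).

```python
def approx_tokens(text):
    # Rough heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def compact(messages, budget):
    """Drop the oldest messages until the estimated total fits the budget.

    `messages` is a list of strings, oldest first. The most recent
    messages are kept, since they carry the active task's context;
    at least one message is always retained.
    """
    kept = list(messages)
    while len(kept) > 1 and sum(approx_tokens(m) for m in kept) > budget:
        kept.pop(0)  # evict the oldest message first
    return kept
```

A design note on the choice here: evicting oldest-first is cheap and predictable, but it discards early instructions along with stale chatter; summarizing evicted messages into a short digest preserves more intent at the cost of an extra model call.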