Core Concepts
Models and capabilities
Choose a model based on task complexity, latency tolerance, required context size, and modality requirements such as vision.
Model selection strategy
- Use stronger models for architectural changes, complex refactors, and ambiguous tasks.
- Use faster/cheaper models for short iterations, triage, and lightweight edits.
- For visual tasks, select models that support vision capabilities.
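The strategy above can be sketched as a small routing function. This is a minimal illustration, not an API of any tool: the `Task` shape and the routing rules are assumptions made for the example; only the model names come from this page, and the mapping from task traits to models is one reasonable reading of the guidance.

```python
from dataclasses import dataclass

@dataclass
class Task:
    complexity: str        # "high" (architecture, refactors, ambiguity) or "low"
    needs_vision: bool     # screenshots, mockups, diagrams
    latency_sensitive: bool

def pick_model(task: Task) -> str:
    """Map task traits to a model per the selection strategy (illustrative)."""
    if task.complexity == "high":
        # Architectural changes, complex refactors, ambiguous tasks -> stronger model.
        return "Claude Sonnet 4.5"
    if task.needs_vision:
        # Visual work needs a vision-capable model.
        return "Gemini 3 Flash"
    if task.latency_sensitive:
        # Short iterations and triage favor speed over depth.
        return "Grok Code Fast 1"
    # Cost-conscious default for everything else.
    return "Kimi K2.5"
```

In practice you would refine the conditions (and their order) to match your own workload; the point is that routing by task type is a simple, explicit decision, not a per-request judgment call.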
Current curated model guidance
- Claude Sonnet 4.5: balanced reliability for everyday production work.
- GPT-5.2 Codex: strong coding model for deep technical tasks.
- Gemini 3 Flash: very fast; well suited to design/UI iteration loops.
- Kimi K2.5 and MiniMax M2.5: strong value picks for cost-conscious throughput.
- Grok Code Fast 1: low-cost, fast iteration when precision requirements are lower.
Defaults
Current defaults are Kimi K2.5 for free tier and Claude Sonnet 4.5 for paid tier.
Capabilities and costs (per 1M tokens)
- Claude Sonnet 4.5: input $3, output $15, cached input $0.30, context 200k, vision support.
- GPT-5.2 Codex: input $1.75, output $14, cached input $0.175, context 400k, vision support.
- Gemini 3 Flash: input $0.50, output $3, cached input $0.05, context 1M, vision support.
- Kimi K2.5: input $0.60, output $2.50, cached input $0.10, context 128k, vision support.
- MiniMax M2.5: input $0.30, output $1.20, cached input $0.03, cache-creation input $0.38, context 205k.
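A quick way to turn these per-1M-token rates into a per-request estimate is to price cached and uncached input separately. The sketch below is illustrative: the `estimate_cost` helper and its billing model (cached input tokens billed at the cached rate, the remainder at the full input rate) are assumptions for the example, and only two models are included; the prices are copied from the table above.

```python
# USD per 1M tokens, from the table above (subset of models for illustration).
PRICES = {
    "Claude Sonnet 4.5": {"input": 3.00, "output": 15.00, "cached_input": 0.30},
    "Gemini 3 Flash":    {"input": 0.50, "output": 3.00,  "cached_input": 0.05},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Rough per-request cost: cached input billed at the cached rate,
    remaining input at the full rate, plus output. Illustrative only."""
    p = PRICES[model]
    uncached = input_tokens - cached_input_tokens
    return (uncached * p["input"]
            + cached_input_tokens * p["cached_input"]
            + output_tokens * p["output"]) / 1_000_000
```

For example, a 100k-token prompt with a 10k-token reply on Claude Sonnet 4.5 comes to about $0.45 uncached; if half the prompt hits the cache, the estimate drops to about $0.32.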
Cost data changes over time
Treat these figures as practical guidance, and confirm current values in-app before making strict budget decisions.