Core Concepts
Models and capabilities
Choose models based on task type, latency tolerance, context-window size, and modality requirements such as vision.
Model selection strategy
- Use stronger models for architectural changes, complex refactors, and ambiguous tasks.
- Use faster/cheaper models for short iterations, triage, and lightweight edits.
- For visual tasks, select models that support vision capabilities.
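The strategy above can be sketched as a simple routing function. This is an illustrative sketch, not a real API: the task attributes, thresholds, and model-identifier strings are assumptions layered on the guidance in this section.

```python
from dataclasses import dataclass

@dataclass
class Task:
    complex: bool            # architectural change, ambiguous scope, large refactor
    needs_vision: bool       # screenshots, mockups, UI review
    latency_sensitive: bool  # short iteration loops, triage

def pick_model(task: Task) -> str:
    """Illustrative routing; identifiers and rules are assumptions, not a fixed API."""
    if task.needs_vision:
        return "gemini-3-flash"     # vision-capable and fast
    if task.complex:
        return "claude-sonnet-4.6"  # stronger model for ambiguous/architectural work
    if task.latency_sensitive:
        return "grok-code-fast-1"   # low-cost fast iteration
    return "kimi-k2.5"              # cost-conscious default
```

The order of the checks encodes the priorities above: modality requirements are hard constraints, so they are tested first; everything else trades capability against speed and cost.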
Current curated model guidance
- Gemini 3 Flash: fast, and well suited to design/UI iteration loops.
- Kimi K2.5 and MiniMax M2.5: good value picks when throughput per dollar matters.
- Grok Code Fast 1: low-cost, fast iteration when precision matters less.
- Claude Sonnet 4.6 and GPT-5.4 remain available in the desktop app through managed local runners rather than the server runtime.
Defaults
Current server-runtime defaults are Kimi K2.5 for free accounts and Gemini 3 Flash for paid accounts.
Capabilities and costs (per 1M tokens)
- Gemini 3 Flash: input $0.50, output $3.00, cached input $0.05, context 1M, vision supported.
- Kimi K2.5: input $0.60, output $2.50, cached input $0.10, context 128k, vision supported.
- MiniMax M2.5: input $0.30, output $1.20, cached input $0.03, cache-creation input $0.38, context 205k.
- Managed local-runner models such as Claude Sonnet 4.6 and GPT-5.4 keep their own pricing metadata in the desktop app.
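As a worked example of how the per-1M-token prices combine, here is a minimal cost calculator. The formula (fresh input + cached input + output, each at its own rate) and the sample request sizes are illustrative assumptions; the price figures are the Gemini 3 Flash values from the list above.

```python
def request_cost(input_toks: int, cached_toks: int, output_toks: int,
                 in_price: float, cached_price: float, out_price: float) -> float:
    """Cost in dollars; prices are per 1M tokens. cached_toks is the portion
    of input_toks served from cache at the discounted rate."""
    fresh = input_toks - cached_toks
    return (fresh * in_price
            + cached_toks * cached_price
            + output_toks * out_price) / 1_000_000

# Gemini 3 Flash prices from the list above: $0.50 input, $0.05 cached input, $3.00 output.
# Hypothetical request: 200k input tokens (150k cache hits), 4k output tokens.
cost = request_cost(200_000, 150_000, 4_000, 0.50, 0.05, 3.00)
# fresh 50k at $0.50/M = $0.025; cached 150k at $0.05/M = $0.0075; output 4k at $3.00/M = $0.012
print(f"${cost:.4f}")  # $0.0445
```

Note how the cached-input discount dominates on long, mostly-repeated prompts: three quarters of the input here costs less than the 4k output tokens.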
Cost data changes over time
Treat these figures as practical guidance and confirm current values in-app before making strict budget decisions.