Models
ostk is model-agnostic. Six production providers, one local runtime, one on-device path. Every provider speaks through the same CpuDriver trait — but they don't all support the same features. This page tells you what works where.
PROVIDERS
| Provider | Models | API key | Source |
|---|---|---|---|
| Anthropic | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5 | ANTHROPIC_API_KEY | src/cpu/anthropic.rs:257–297 |
| Gemini | gemini-2.5-pro, gemini-2.0-flash, gemini-3.1-pro, gemini-3-flash | GEMINI_API_KEY (or GOOGLE_API_KEY) | src/cpu/gemini.rs:32–70, 299–319 |
| Mistral | mistral-large-latest, codestral-latest, devstral-small, devstral-medium-2507, magistral-* | MISTRAL_API_KEY | src/cpu/mistral.rs:24–45, 145–220 |
| OpenAI | gpt-4o, o1-*, o3-pro, o4-* | OPENAI_API_KEY | src/cpu/openrouter.rs (shared driver), model_registry.rs:84–117 |
| OpenRouter | Any provider/model (meta-llama/*, deepseek/*, etc.) | OPENROUTER_API_KEY | src/cpu/openrouter.rs:12–51 |
| Ollama (local) | local/codestral:22b, local/qwen2.5-coder:32b, local/llama3.3:70b, local/deepseek-r1:70b, local/gemma3:27b | None required (OLLAMA_HOST to override endpoint) | src/cpu/mod.rs:437–454, providers.rs:62–68 |
| Apple (on-device) | apple/default | None (APPLE_MODEL_HOST to override) | src/cpu/mod.rs:439–446, model_registry.rs:196–201 |

How FROM auto Picks a Model
When an Agentfile says FROM auto (or no FROM at all), ostk scores available models based on which API keys are present. Source: src/commands/run.rs:89–197.
OSTK_MODEL env var overrides everything — if set, it's used regardless of HUMANFILE or FROM. Source: src/cpu/context.rs:116.
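The key-detection idea can be sketched in a few lines. This is a minimal illustration, not the logic in src/commands/run.rs: the function name pick_model, the candidate order, and the one-model-per-key defaults are assumptions made for the example. Only the environment variable names and model IDs come from this page.

```rust
/// Hypothetical candidate table: (API key variable, model used if that key is set).
/// The order is an assumption for illustration; ostk's real scoring may differ.
const CANDIDATES: &[(&str, &str)] = &[
    ("ANTHROPIC_API_KEY", "claude-sonnet-4-6"),
    ("GEMINI_API_KEY", "gemini-2.5-pro"),
    ("GOOGLE_API_KEY", "gemini-2.5-pro"),
    ("MISTRAL_API_KEY", "mistral-large-latest"),
    ("OPENAI_API_KEY", "gpt-4o"),
];

/// Pick a model given a lookup function for environment variables.
/// The lookup is injected so the function can be tested without
/// touching the real process environment.
fn pick_model<F>(get: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // OSTK_MODEL overrides everything when set to a non-empty value.
    if let Some(model) = get("OSTK_MODEL").filter(|m| !m.is_empty()) {
        return Some(model);
    }
    // Otherwise take the first candidate whose API key is present and non-empty.
    CANDIDATES
        .iter()
        .find(|&&(key, _)| get(key).map_or(false, |v| !v.is_empty()))
        .map(|&(_, model)| model.to_string())
}

fn main() {
    // Read from the real process environment.
    match pick_model(|name| std::env::var(name).ok()) {
        Some(model) => println!("selected model: {model}"),
        None => println!("no API key found"),
    }
}
```

Passing the lookup as a closure keeps the selection rule a pure function, which makes the override behaviour easy to check in isolation.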
CONFIGURATION
Four places to set a model, in priority order:
1. Environment variable: OSTK_MODEL=gemini-2.5-pro ostk run agents/worker.af
2. Agentfile: FROM claude-sonnet-4-6
3. HUMANFILE: MODEL claude-opus-4-6
4. Spawn flag: ostk kernel spawn worker --model gemini-2.5-pro "task"

FEATURE_MATRIX
Not every provider supports every feature. The CpuDriver trait provides a common surface, but the underlying APIs vary. This matrix shows what actually works per provider at the driver level.
| Feature | Anthropic | Gemini | Mistral | OpenAI | Ollama |
|---|---|---|---|---|---|
| Tool use | ✓ | ✓ | ✓ | ✓ | ✓ |
| Streaming | ✓ | ✓ | ✓ | ✓ | ✓ |
| Extended thinking | ✓ | ✓ (3.x) | — | built-in* | — |
| Prompt caching | ✓ | server | — | — | — |
| Vision/images | ✓ | ✓ | ✓ | ✓ | model |
| File upload API | ✓ | — | — | — | — |
| Token counting | ✓ | — | — | — | — |
| Batch API | ✓ | — | — | — | — |
| Speed mode | ✓ | — | — | — | — |
| Citations | ✓ | — | — | — | — |
✓ = ostk driver implements it. — = not available in the driver. server = handled server-side, no client config. built-in* = reasoning is a model behavior, not a driver feature (o1/o3/o4 always think). model = depends on the specific model loaded in Ollama.
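One way to make the matrix checkable in code is a small lookup function. The sketch below only transcribes the table above; the enum and function names are hypothetical and are not part of the CpuDriver trait or any ostk API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Provider { Anthropic, Gemini, Mistral, OpenAi, Ollama }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Feature {
    ToolUse, Streaming, ExtendedThinking, PromptCaching, Vision,
    FileUpload, TokenCounting, BatchApi, SpeedMode, Citations,
}

/// Mirrors the legend: Driver = "✓", Gemini3Only = "✓ (3.x)", ServerSide = "server",
/// ModelBehavior = "built-in*", ModelDependent = "model", Unavailable = not in the driver.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Support { Driver, Gemini3Only, ServerSide, ModelBehavior, ModelDependent, Unavailable }

/// Hypothetical lookup that transcribes the feature matrix row by row.
fn support(provider: Provider, feature: Feature) -> Support {
    use Feature::*;
    use Provider::*;
    use Support::*;
    match (feature, provider) {
        (ToolUse, _) | (Streaming, _) => Driver,
        (ExtendedThinking, Anthropic) => Driver,
        (ExtendedThinking, Gemini) => Gemini3Only,
        (ExtendedThinking, OpenAi) => ModelBehavior,
        (PromptCaching, Anthropic) => Driver,
        (PromptCaching, Gemini) => ServerSide,
        (Vision, Ollama) => ModelDependent,
        (Vision, _) => Driver,
        // The remaining rows (file upload, token counting, batch,
        // speed mode, citations) are Anthropic-only in the table.
        (_, Anthropic) => Driver,
        _ => Unavailable,
    }
}

fn main() {
    let providers = [
        Provider::Anthropic, Provider::Gemini, Provider::Mistral,
        Provider::OpenAi, Provider::Ollama,
    ];
    for p in providers {
        println!("{:?}: {:?}", p, support(p, Feature::ExtendedThinking));
    }
}
```

Keeping the distinction between ServerSide, ModelBehavior, and ModelDependent as separate variants, rather than collapsing to a boolean, preserves the caveats the legend spells out.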
MODEL_SPECIFIC_NOTES
Gemini 3.x models get thinkingConfig.includeThoughts: true and thinkingLevel: "HIGH". Gemini 2.x models get standard config only. Source: gemini.rs:105–121.
toolu_ prefixed IDs that are longer. ostk remaps them transparently. Source: mistral.rs:191–220.
A bare model tag (e.g. codestral:22b) is auto-routed to Ollama; the local/ prefix is not needed. Source: model_registry.rs:251–253.
If OPENAI_API_KEY is not set but OPENROUTER_API_KEY is, OpenAI models (gpt-4o, o1-*, o3-*) route through OpenRouter automatically. Source: providers.rs:76–81.
model_registry.rs:391–398. Handles the reasoning_content field in responses.
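The OpenRouter fallback for OpenAI models described in these notes amounts to a small routing rule. The sketch below is an illustration under assumptions, not the code in providers.rs: the Route enum, the function names, and the prefix-based model check are all hypothetical, with the prefixes taken from the model patterns listed on this page.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    OpenAi,      // call OpenAI directly
    OpenRouter,  // fall back through OpenRouter
    Unavailable, // neither key is present
}

/// Hypothetical check based on the model patterns on this page
/// (gpt-4o, o1-*, o3-*, o4-*); the real registry may match differently.
fn is_openai_model(model: &str) -> bool {
    model.starts_with("gpt-")
        || ["o1-", "o3-", "o4-"].iter().any(|prefix| model.starts_with(*prefix))
}

/// Returns None when the model is not an OpenAI model, so the rule does not apply.
fn route_openai(model: &str, has_openai_key: bool, has_openrouter_key: bool) -> Option<Route> {
    if !is_openai_model(model) {
        return None;
    }
    Some(match (has_openai_key, has_openrouter_key) {
        (true, _) => Route::OpenAi,
        (false, true) => Route::OpenRouter,
        (false, false) => Route::Unavailable,
    })
}

fn main() {
    // Only the OpenRouter key is present, so gpt-4o is routed through OpenRouter.
    println!("{:?}", route_openai("gpt-4o", false, true));
}
```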
QUICK_EXAMPLE
HUMANFILE:
  MODEL claude-sonnet-4-6
  FALLBACK gemini-2.5-pro

Lint agent Agentfile:
  FROM local/codestral:22b
  PROMPT Run eslint, fix warnings.
  TOOL shell
  LIMIT budget_usd 0
The HUMANFILE sets the project default (Sonnet) with a fallback (Gemini). The lint agent pins a local model via FROM — free, fast, and sufficient for the task. The budget is $0 because local models have no API cost. Model mixing is the normal case, not an edge case.
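How the lint agent ends up on the local model can be traced with a small resolution function that follows the first three entries of the priority order under CONFIGURATION. This is a sketch under assumptions: the struct and function names are hypothetical, the spawn flag is left out, and FALLBACK is not modelled because this page does not say when it triggers.

```rust
/// Hypothetical bundle of the places a model name can come from,
/// in the priority order listed under CONFIGURATION.
struct ModelSources<'a> {
    env_ostk_model: Option<&'a str>,  // OSTK_MODEL environment variable
    agentfile_from: Option<&'a str>,  // FROM in the Agentfile
    humanfile_model: Option<&'a str>, // MODEL in the HUMANFILE
}

/// Return the first explicit model name, or None if nothing pins one.
fn resolve<'a>(sources: &ModelSources<'a>) -> Option<&'a str> {
    // `FROM auto` is not an explicit pin, so treat it like a missing FROM.
    let pinned = sources.agentfile_from.filter(|m| *m != "auto");
    sources.env_ostk_model.or(pinned).or(sources.humanfile_model)
}

fn main() {
    // The lint agent from the example: FROM pins a local model,
    // while the HUMANFILE default is Sonnet.
    let lint = ModelSources {
        env_ostk_model: None,
        agentfile_from: Some("local/codestral:22b"),
        humanfile_model: Some("claude-sonnet-4-6"),
    };
    println!("{:?}", resolve(&lint)); // prints Some("local/codestral:22b")
}
```

A result of None corresponds to the case where nothing names a model explicitly, which is where the FROM auto scoring described earlier would take over.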