Auto Router
The Auto Router (openrouter/auto) automatically selects the best model for your prompt, powered by NotDiamond.
Overview
Instead of manually choosing a model, let the Auto Router analyze your prompt and select the optimal model from a curated set of high-quality options. The router considers factors like prompt complexity, task type, and model capabilities.
Usage
Set your model to openrouter/auto:
Response
The response includes the model field showing which model was actually used:
How It Works
- Prompt Analysis: Your prompt is analyzed by NotDiamond’s routing system
- Model Selection: The optimal model is selected based on the task requirements
- Request Forwarding: Your request is forwarded to the selected model
- Response Tracking: The response includes metadata showing which model was used
Session Stickiness
The Auto Router pins both the selected model and provider so that subsequent requests in the same conversation route to the same place. This ensures consistent behavior within a conversation and maximizes prompt cache hits.
Stickiness applies at two levels:
- Implicit (automatic): OpenRouter derives a conversation fingerprint from your messages (hashing the first system message and first user message). Once the provider reports prompt cache usage, the model and provider are pinned for that conversation. No configuration needed.
- Explicit (
session_id): When you include asession_id, stickiness kicks in on the first successful response — even before cache usage is observed. This is recommended for multi-turn conversations and agent workflows where you want consistent routing from the start.
In both cases, the cache expires after 5 minutes of inactivity. Each successful request resets the timer. If the cached provider returns an error, the cache is not updated, allowing the next request to be re-routed.
For full details on how sticky routing works, cache key granularity, and the x-session-id header, see Provider Sticky Routing.
Example with session_id
Why It Matters for the Auto Router
Unlike using a fixed model, the Auto Router selects a different model each time based on your prompt. Session stickiness is especially important here because it also pins the model selection — not just the provider. Without it, you could get different models on each turn of a conversation, leading to inconsistent behavior and wasted prompt cache.
Supported Models
The Auto Router selects from a curated set of high-quality models including:
Model slugs change as new versions are released. The examples below are current as of December 4, 2025. Check the models page for the latest available models.
- Claude Sonnet 4.5 (
anthropic/claude-sonnet-4.5) - Claude Opus 4.5 (
anthropic/claude-opus-4.5) - GPT-5.1 (
openai/gpt-5.1) - Gemini 3.1 Pro (
google/gemini-3.1-pro-preview) - DeepSeek 3.2 (
deepseek/deepseek-v3.2) - And other top-performing models
The exact model pool may be updated as new models become available.
Configuring Allowed Models
You can restrict which models the Auto Router can select from using the plugins parameter. This is useful when you want to limit routing to specific providers or model families.
Via API Request
Use wildcard patterns to filter models. For example, anthropic/* matches all Anthropic models:
Via Settings UI
You can also configure default allowed models in your Plugin Settings:
- Navigate to Settings > Plugins
- Find Auto Router and click the configure button
- Enter model patterns (one per line)
- Save your settings
These defaults apply to all your API requests unless overridden per-request.
Pattern Syntax
When no patterns are configured, the Auto Router uses all supported models.
Cost / Quality Tradeoff
Control how aggressively the Auto Router optimizes for cost vs. quality using the cost_quality_tradeoff parameter (integer, 0–10):
- 0 = pure quality — always picks the most capable model regardless of cost
- 10 = maximize for cost — cheapest model wins
- Intermediate values blend quality and cost signals continuously
The default is 7, which balances cost savings with strong output quality.
Via API Request
Via Settings UI
You can also set a default tradeoff in your Plugin Settings under Auto Router. The per-request value overrides this default.
Pricing
You pay the standard rate for whichever model is selected. There is no additional fee for using the Auto Router.
Use Cases
- General-purpose applications: When you don’t know what types of prompts users will send
- Cost optimization: Let the router choose efficient models for simpler tasks
- Quality optimization: Ensure complex prompts get routed to capable models
- Experimentation: Discover which models work best for your use case
Limitations
- The router requires
messagesformat (notprompt) - Streaming is supported
- All standard OpenRouter features (tool calling, etc.) work with the selected model
Related
- Body Builder - Generate multiple parallel API requests
- Latest Model Resolution - Always target the newest version of a model family
- Model Fallbacks - Configure fallback models
- Provider Selection - Control which providers are used