Auto Router | Smart AI Model Selection | OpenRouter

The Auto Router (openrouter/auto) automatically selects the best model for your prompt, powered by NotDiamond.

Overview

Instead of manually choosing a model, let the Auto Router analyze your prompt and select the optimal model from a curated set of high-quality options. The router considers factors like prompt complexity, task type, and model capabilities.

Usage

Set your model to openrouter/auto:

import { OpenRouter } from '@openrouter/sdk';

const openRouter = new OpenRouter({
  apiKey: '<OPENROUTER_API_KEY>',
});

const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement in simple terms',
    },
  ],
});

console.log(completion.choices[0].message.content);
// Check which model was selected
console.log('Model used:', completion.model);

Response

The response includes the model field showing which model was actually used:

{
  "id": "gen-...",
  "model": "anthropic/claude-sonnet-4.5",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 150,
    "total_tokens": 165
  }
}
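
If you log routing decisions, a small helper can pull the selected model and token usage out of the response. The sketch below is only an illustration: the `ChatResponse` interface and the `summarizeRouting` function are hypothetical names, and the interface covers just the fields shown in the sample response, not the full response schema.

```typescript
// Minimal shape of the fields this sketch reads; the real response has more.
interface ChatResponse {
  model: string;
  choices: { message: { role: string; content: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Summarize which model the Auto Router picked and how many tokens were used.
function summarizeRouting(response: ChatResponse): string {
  const tokens = response.usage
    ? `${response.usage.total_tokens} tokens`
    : 'usage not reported';
  return `routed to ${response.model} (${tokens})`;
}

// Same values as the sample response.
const sample: ChatResponse = {
  model: 'anthropic/claude-sonnet-4.5',
  choices: [{ message: { role: 'assistant', content: '...' } }],
  usage: { prompt_tokens: 15, completion_tokens: 150, total_tokens: 165 },
};

console.log(summarizeRouting(sample));
// routed to anthropic/claude-sonnet-4.5 (165 tokens)
```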

How It Works

Prompt Analysis: Your prompt is analyzed by NotDiamond’s routing system
Model Selection: The optimal model is selected based on the task requirements
Request Forwarding: Your request is forwarded to the selected model
Response Tracking: The response includes metadata showing which model was used

Session Stickiness

The Auto Router pins both the selected model and provider so that subsequent requests in the same conversation route to the same place. This ensures consistent behavior within a conversation and maximizes prompt cache hits.

Stickiness applies at two levels:

Implicit (automatic): OpenRouter derives a conversation fingerprint from your messages (hashing the first system message and first user message). Once the provider reports prompt cache usage, the model and provider are pinned for that conversation. No configuration needed.
Explicit (session_id): When you include a session_id, stickiness kicks in on the first successful response — even before cache usage is observed. This is recommended for multi-turn conversations and agent workflows where you want consistent routing from the start.

In both cases, the cache expires after 5 minutes of inactivity. Each successful request resets the timer. If the cached provider returns an error, the cache is not updated, allowing the next request to be re-routed.
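
To make the pinning and expiry rules concrete, here is a minimal client-side model of the behavior described above. It is a sketch under stated assumptions, not OpenRouter's implementation: the hash function (an FNV-1a style stand-in over UTF-16 code units), the separator, and the names `fingerprint` and `StickyCache` are all hypothetical, and the provider name in the usage lines is a placeholder.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };
type Pin = { model: string; provider: string; expiresAt: number };

const TTL_MS = 5 * 60 * 1000; // 5 minutes of inactivity, per the text above

// Stand-in hash (FNV-1a style over UTF-16 code units); the real hash is not documented here.
function hashText(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Fingerprint built from the first system message and the first user message.
function fingerprint(messages: Message[]): string {
  const firstSystem = messages.find((m) => m.role === 'system')?.content ?? '';
  const firstUser = messages.find((m) => m.role === 'user')?.content ?? '';
  return hashText(`${firstSystem}\u0000${firstUser}`);
}

class StickyCache {
  private pins = new Map<string, Pin>();

  // Return the pinned route, or undefined if it is absent or expired.
  get(key: string, now: number): Pin | undefined {
    const pin = this.pins.get(key);
    if (!pin || pin.expiresAt <= now) {
      this.pins.delete(key);
      return undefined;
    }
    return pin;
  }

  // Call only after a successful response; on errors, leave the cache untouched.
  recordSuccess(key: string, model: string, provider: string, now: number): void {
    this.pins.set(key, { model, provider, expiresAt: now + TTL_MS });
  }
}

const turn1: Message[] = [{ role: 'user', content: 'Explain quantum entanglement' }];
const turn2: Message[] = [
  ...turn1,
  { role: 'assistant', content: '...' },
  { role: 'user', content: 'Now explain it to a 5-year-old' },
];

const cache = new StickyCache();
cache.recordSuccess(fingerprint(turn1), 'anthropic/claude-sonnet-4.5', 'example-provider', 0);
```

Because the fingerprint depends only on the first system and first user message, `turn2` maps to the same key as `turn1`, so a lookup within five minutes returns the pinned model and provider.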

For full details on how sticky routing works, cache key granularity, and the x-session-id header, see Provider Sticky Routing.

Example with `session_id`

const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  session_id: 'my-conversation-123',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement',
    },
  ],
});

// Subsequent requests with the same session_id will use the same model and provider
const followUp = await openRouter.chat.send({
  model: 'openrouter/auto',
  session_id: 'my-conversation-123',
  messages: [
    { role: 'user', content: 'Explain quantum entanglement' },
    { role: 'assistant', content: completion.choices[0].message.content ?? '' },
    { role: 'user', content: 'Now explain it to a 5-year-old' },
  ],
});
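
For longer conversations, you may prefer to keep the session_id and message history in one place rather than rebuilding the array by hand. The helper below is one possible sketch; `StickyConversation` and `AutoRouterRequest` are hypothetical names, not part of the SDK. It only builds request payloads, which you would then pass to `openRouter.chat.send`.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface AutoRouterRequest {
  model: 'openrouter/auto';
  session_id: string;
  messages: ChatMessage[];
}

// Keeps one session_id and the running history for a conversation.
class StickyConversation {
  private readonly sessionId: string;
  private history: ChatMessage[] = [];

  constructor(sessionId: string) {
    this.sessionId = sessionId;
  }

  // Build the payload for the next user turn, including all prior turns.
  nextRequest(userContent: string): AutoRouterRequest {
    this.history.push({ role: 'user', content: userContent });
    return {
      model: 'openrouter/auto',
      session_id: this.sessionId,
      messages: [...this.history], // copy so earlier payloads are not mutated
    };
  }

  // Record the assistant reply so the next request carries it.
  addReply(content: string | null | undefined): void {
    this.history.push({ role: 'assistant', content: content ?? '' });
  }
}

const conversation = new StickyConversation('my-conversation-123');
const first = conversation.nextRequest('Explain quantum entanglement');
conversation.addReply('...'); // in practice: completion.choices[0].message.content
const second = conversation.nextRequest('Now explain it to a 5-year-old');
```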

Why It Matters for the Auto Router

Unlike a fixed model, the Auto Router chooses a model per request based on your prompt, so the choice can change from one request to the next. Session stickiness is especially important here because it pins the model selection as well as the provider. Without it, you could get a different model on each turn of a conversation, leading to inconsistent behavior and wasted prompt cache.

Supported Models

Model slugs change as new versions are released. The examples below are current as of December 4, 2025. Check the models page for the latest available models.

The Auto Router selects from a curated set of high-quality models, including:

Claude Sonnet 4.5 (anthropic/claude-sonnet-4.5)
Claude Opus 4.5 (anthropic/claude-opus-4.5)
GPT-5.1 (openai/gpt-5.1)
Gemini 3.1 Pro (google/gemini-3.1-pro-preview)
DeepSeek 3.2 (deepseek/deepseek-v3.2)
And other top-performing models

The exact model pool may be updated as new models become available.

Configuring Allowed Models

You can restrict which models the Auto Router can select from using the plugins parameter. This is useful when you want to limit routing to specific providers or model families.

Via API Request

Use wildcard patterns to filter models. For example, anthropic/* matches all Anthropic models:

const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement',
    },
  ],
  plugins: [
    {
      id: 'auto-router',
      allowed_models: ['anthropic/*', 'openai/gpt-5.1'],
    },
  ],
});

Via Settings UI

You can also configure default allowed models in your Plugin Settings:

Navigate to Settings > Plugins
Find Auto Router and click the configure button
Enter model patterns (one per line)
Save your settings

These defaults apply to all your API requests unless overridden per-request.

Pattern Syntax

Pattern	Matches
`anthropic/*`	All Anthropic models
`openai/gpt-5*`	All GPT-5 variants
`google/*`	All Google models
`openai/gpt-5.1`	Exact match only
`*/claude-*`	Any provider with claude in the model name

When no patterns are configured, the Auto Router uses all supported models.
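
To check locally which slugs a pattern list would admit, you can approximate the matching with a small glob-to-regex conversion. This is a sketch, assuming `*` matches any run of characters and every other character is literal; the exact matching rules used by the router are not specified here, and `patternToRegExp` and `isAllowed` are hypothetical helper names.

```typescript
// Convert a pattern such as 'anthropic/*' into an anchored regular expression.
// Assumption: '*' matches any run of characters and everything else is literal.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// True if the slug matches at least one pattern; an empty list allows everything,
// mirroring the "no patterns configured" behavior.
function isAllowed(slug: string, patterns: string[]): boolean {
  if (patterns.length === 0) return true;
  return patterns.some((pattern) => patternToRegExp(pattern).test(slug));
}

const allowed = ['anthropic/*', 'openai/gpt-5.1'];
console.log(isAllowed('anthropic/claude-opus-4.5', allowed)); // true
console.log(isAllowed('google/gemini-3.1-pro-preview', allowed)); // false
```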

Cost / Quality Tradeoff

Control how aggressively the Auto Router optimizes for cost vs. quality using the cost_quality_tradeoff parameter (integer, 0–10):

0 = pure quality: always picks the most capable model regardless of cost
10 = pure cost: always picks the cheapest model
Intermediate values blend quality and cost signals continuously

The default is 7, which balances cost savings with strong output quality.

Via API Request

const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Summarize this paragraph',
    },
  ],
  plugins: [
    {
      id: 'auto-router',
      cost_quality_tradeoff: 3, // Favor quality over cost
    },
  ],
});

Via Settings UI

You can also set a default tradeoff in your Plugin Settings under Auto Router. The per-request value overrides this default.
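
The precedence can be modeled in a few lines: the per-request value wins, then your Plugin Settings default, then the built-in default of 7. The sketch below is illustrative only. `resolveTradeoff` is a hypothetical helper, and in practice this resolution happens on OpenRouter's side rather than in your code.

```typescript
const BUILT_IN_DEFAULT = 7; // default stated in the text above

// Per-request value wins, then the account default, then the built-in default.
function resolveTradeoff(perRequest?: number, accountDefault?: number): number {
  const value = perRequest ?? accountDefault ?? BUILT_IN_DEFAULT;
  if (!Number.isInteger(value) || value < 0 || value > 10) {
    throw new RangeError(`cost_quality_tradeoff must be an integer from 0 to 10, got ${value}`);
  }
  return value;
}

console.log(resolveTradeoff(3, 8)); // 3
console.log(resolveTradeoff(undefined, 8)); // 8
console.log(resolveTradeoff()); // 7
```

The sketch uses `??` rather than `||` so that an explicit 0 (pure quality) is respected instead of falling through to a default.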

Pricing

You pay the standard rate for whichever model is selected. There is no additional fee for using the Auto Router.
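
Because the selected model can vary, a per-request cost estimate has to start from the model and usage fields of each response. The following sketch shows the arithmetic only. The rates are hypothetical placeholders, not real prices, and `estimateCostUsd` is an illustrative helper; look up actual rates on the models page.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Hypothetical per-million-token rates in USD; real rates vary by model.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Cost = prompt tokens at the input rate plus completion tokens at the output rate.
function estimateCostUsd(usage: Usage, rates: Rates): number {
  const inputCost = usage.prompt_tokens * rates.inputPerMillion;
  const outputCost = usage.completion_tokens * rates.outputPerMillion;
  return (inputCost + outputCost) / 1e6;
}

// Token counts from the sample response; the rates are placeholders.
const cost = estimateCostUsd(
  { prompt_tokens: 15, completion_tokens: 150 },
  { inputPerMillion: 3, outputPerMillion: 15 },
);
// (15 * 3 + 150 * 15) / 1,000,000 = 2295 / 1,000,000, about 0.002295 USD
console.log(cost.toFixed(6));
```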

Use Cases

General-purpose applications: When you don’t know what types of prompts users will send
Cost optimization: Let the router choose efficient models for simpler tasks
Quality optimization: Ensure complex prompts get routed to capable models
Experimentation: Discover which models work best for your use case

Limitations

The router requires the messages format (not prompt)
Streaming is supported
All standard OpenRouter features (tool calling, etc.) work with the selected model

Related

Body Builder - Generate multiple parallel API requests
Latest Model Resolution - Always target the newest version of a model family
Model Fallbacks - Configure fallback models
Provider Selection - Control which providers are used
