Fallback Policies automatically retry your requests with different models if one fails, ensuring your application stays reliable even when individual providers have issues.

How It Works

  1. Your request goes to the primary model first
  2. If it fails (timeout, rate limit, error, etc.), the router immediately tries the next model
  3. This continues down the chain until a model successfully responds
  4. Your application receives the successful response without knowing about the failures
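The chain-walking behavior above can be sketched as a simple loop. This is an illustrative sketch only, not Requesty's actual implementation; `call_model` is a hypothetical stand-in for a single provider call:

```python
# Illustrative sketch of the fallback loop (not Requesty's actual code).
def route_with_fallback(chain, request, call_model):
    """Try each model in order; return the first successful response."""
    errors = []
    for model in chain:
        try:
            return call_model(model, request)  # success ends the chain
        except Exception as exc:  # timeout, rate limit, provider error, ...
            errors.append((model, exc))
    # Every model failed: surface the collected failures to the caller.
    raise RuntimeError(f"All models failed: {errors}")
```

The caller only ever sees the first successful response or, if the whole chain is exhausted, a single error carrying the accumulated failures.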

Benefits

  • Higher success rates - No more failed requests due to provider issues
  • Zero downtime - Automatic failover without code changes
  • Cost optimization - Start with cheaper models, fall back to premium ones only when needed
  • No stalled workflows - Your users never see “model unavailable” errors

Creating a Fallback Policy

Step 1: Create the Policy

  1. Go to Routing Policies
  2. Click “Create Policy”
  3. Select “Fallback Chain” as the policy type

Step 2: Configure Your Fallback Chain

Example Setup:
  • Policy Name: sonnet
  • Fallback Chain:
    • anthropic/claude-sonnet-4-5 (1 retry)
    • bedrock/claude-sonnet-4-5@eu-central-1 (1 retry)
Each model can have multiple retries. The router will:
  1. Try anthropic/claude-sonnet-4-5 once
  2. If it fails, retry anthropic/claude-sonnet-4-5 one more time
  3. If still failing, move to bedrock/claude-sonnet-4-5@eu-central-1 and try twice
  4. Continue down the chain until success
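The per-model retry counts expand the chain into one ordered list of attempts. A minimal sketch of that expansion (the `(model, retries)` tuple representation is an assumption for illustration, not Requesty's internal format):

```python
def expand_attempts(chain):
    """Expand (model, retries) pairs into the ordered list of attempts.

    Each model gets its initial try plus its configured retries before
    the router moves on to the next entry in the chain.
    """
    attempts = []
    for model, retries in chain:
        attempts.extend([model] * (1 + retries))  # initial try + retries
    return attempts

# The example policy above: two models, one retry each.
chain = [
    ("anthropic/claude-sonnet-4-5", 1),
    ("bedrock/claude-sonnet-4-5@eu-central-1", 1),
]
```

With one retry each, `expand_attempts(chain)` lists each model twice, in order, matching steps 1-4 above.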

Step 3: Use the Policy in Your Code

This is the critical step: You need to change your model parameter to reference your policy. After creating a policy named sonnet, you’ll see it in your models list as:
policy/sonnet
Update your code to use this model identifier:
from openai import OpenAI

client = OpenAI(
    base_url="https://router.requesty.ai/v1",
    api_key="your-requesty-api-key"
)

response = client.chat.completions.create(
    model="policy/sonnet",  # ← Use your policy name here
    messages=[{"role": "user", "content": "Hello!"}]
)
How to find your policy reference:
  1. Go to your Routing Policies
  2. Click the copy button next to your policy name
  3. Paste it directly into your model parameter

Use Cases

Cost-Effective GPT Chain

Start with cheaper models, only use expensive ones if needed:
Policy: cost-effective-gpt
├─ openai/gpt-4o-mini (2 retries)
├─ openai/gpt-4o (1 retry)
└─ openai/gpt-5.2 (1 retry)

Multi-Provider Reliability

Distribute across providers for maximum uptime:
Policy: multi-provider-safe
├─ openai/gpt-5.2 (1 retry)
├─ anthropic/claude-sonnet-4-5 (1 retry)
└─ google/gemini-2.5-pro (1 retry)

Regional Failover

Try regional endpoints before falling back to global:
Policy: regional-claude
├─ bedrock/claude-sonnet-4-5@eu-central-1 (2 retries)
└─ anthropic/claude-sonnet-4-5 (2 retries)

How Retries Work

Each model in the chain can have 0-10 retries. The router uses:
  • Exponential backoff - Wait time increases between retries (500ms → 1s → 2s → 4s)
  • Jitter - Random variation (±10%) to prevent thundering herd
  • Immediate failover - On non-retryable errors (invalid request, auth failure)
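The documented schedule can be reproduced with a small helper. This is a sketch based on the numbers above; the 500ms base and ±10% jitter come from the documentation, while everything else is an assumption:

```python
import random

def backoff_delay(attempt, base=0.5, jitter=0.10):
    """Delay in seconds before retry number `attempt` (0-indexed).

    Doubles each time (0.5s, 1s, 2s, 4s, ...) with +/-10% random
    jitter to avoid synchronized retry storms ("thundering herd").
    """
    delay = base * (2 ** attempt)
    return delay * random.uniform(1 - jitter, 1 + jitter)
```

For example, the fourth attempt waits roughly 4 seconds, landing anywhere between 3.6s and 4.4s once jitter is applied.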
Model Compatibility: Make sure all models in your fallback chain support your request parameters (context length, features like streaming, tool calling, etc.). If a model can’t handle the request, the policy will skip to the next model without warning.

Key Selection (BYOK)

For each model, you can choose which API key to use:
  • Requesty provided key - Use Requesty’s managed keys (default)
  • My own key - Use your Bring-Your-Own-Key (BYOK) credentials
  • Try Requesty provided key first, then use my own - Fallback to BYOK if Requesty key fails
  • Try my own key first, then Requesty’s - Prefer BYOK, fallback to Requesty

Monitoring & Debugging

Track your fallback policy performance:
  1. Go to Analytics
  2. Filter by your policy name
  3. See which models succeeded, failed, and how often fallback occurred

FAQ

What happens if every model in the chain fails?
The request returns an error with details about the last model attempted. You’ll see all the failures in your request logs.

Can a policy reference another policy?
Yes! A fallback policy can reference another policy as one of its fallback options. For example:
Policy A (fallback):
├─ openai/gpt-4
└─ policy/multi-provider-backup  ← Another policy

Am I charged for failed attempts?
No. You only pay for successful requests that return tokens. Failed attempts don’t incur costs.

How do I edit an existing policy?
Click the edit icon next to your policy in the Routing Policies page. Changes take effect immediately - no code deployment needed.
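A policy that references another policy effectively flattens into one longer chain. A hedged sketch of that resolution (the dict representation and the "policy/" prefix convention here are illustrative assumptions, not Requesty's internal format):

```python
def flatten_policy(name, policies):
    """Recursively expand nested policy references into one flat chain.

    `policies` maps a policy name to its ordered fallback entries; an
    entry beginning with "policy/" is treated as a reference to another
    policy and expanded in place.
    """
    flat = []
    for entry in policies[name]:
        if entry.startswith("policy/"):
            flat.extend(flatten_policy(entry[len("policy/"):], policies))
        else:
            flat.append(entry)
    return flat

# The "Policy A" example above, with a hypothetical backup chain:
policies = {
    "policy-a": ["openai/gpt-4", "policy/multi-provider-backup"],
    "multi-provider-backup": ["anthropic/claude-sonnet-4-5",
                              "google/gemini-2.5-pro"],
}
```

Here `flatten_policy("policy-a", policies)` tries openai/gpt-4 first, then works through the backup policy's models in their configured order.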