How It Works
- Your request goes to the primary model first
- If it fails (timeout, rate limit, error, etc.), the router immediately tries the next model
- This continues down the chain until a model successfully responds
- Your application receives the successful response without knowing about the failures
Benefits
- Higher success rates - No more failed requests due to provider issues
- Zero downtime - Automatic failover without code changes
- Cost optimization - Start with cheaper models, fall back to premium ones only when needed
- No stalled workflows - Your users never see “model unavailable” errors
Creating a Fallback Policy
Step 1: Create the Policy
- Go to Routing Policies
- Click “Create Policy”
- Select “Fallback Chain” as the policy type

Step 2: Configure Your Fallback Chain
Example Setup:- Policy Name:
sonnet - Fallback Chain:
anthropic/claude-sonnet-4-5(1 retry)bedrock/claude-sonnet-4-5@eu-central-1(1 retry)
- Try
anthropic/claude-sonnet-4-5once - If it fails, retry
anthropic/claude-sonnet-4-5one more time - If still failing, move to
bedrock/claude-sonnet-4-5@eu-central-1and try twice - Continue down the chain until success
Step 3: Use the Policy in Your Code
This is the critical step: You need to change yourmodel parameter to reference your policy.
After creating a policy named sonnet, you’ll see it in your models list as:
How to find your policy reference:
- Go to your Routing Policies
- Click the copy button next to your policy name
- Paste it directly into your
modelparameter
Use Cases
Cost-Effective GPT Chain
Start with cheaper models, only use expensive ones if needed:Multi-Provider Reliability
Distribute across providers for maximum uptime:Regional Failover
Try regional endpoints before falling back to global:How Retries Work
Each model in the chain can have 0-10 retries. The router uses:- Exponential backoff - Wait time increases between retries (500ms → 1s → 2s → 4s)
- Jitter - Random variation (±10%) to prevent thundering herd
- Immediate failover - On non-retryable errors (invalid request, auth failure)
Key Selection (BYOK)
For each model, you can choose which API key to use:- Requesty provided key - Use Requesty’s managed keys (default)
- My own key - Use your Bring-Your-Own-Key (BYOK) credentials
- Try Requesty provided key first, then use my own - Fallback to BYOK if Requesty key fails
- Try my own key first, then Requesty’s - Prefer BYOK, fallback to Requesty
Monitoring & Debugging
Track your fallback policy performance:- Go to Analytics
- Filter by your policy name
- See which models succeeded, failed, and how often fallback occurred
FAQ
What happens if all models in the chain fail?
What happens if all models in the chain fail?
The request returns an error with details about the last model attempted. You’ll see all the failures in your request logs.
Can I nest policies?
Can I nest policies?
Yes! A fallback policy can reference another policy as one of its fallback options. For example:
Do I get charged for failed attempts?
Do I get charged for failed attempts?
No. You only pay for successful requests that return tokens. Failed attempts don’t incur costs.
How do I update a policy?
How do I update a policy?
Click the edit icon next to your policy in the Routing Policies page. Changes take effect immediately - no code deployment needed.