auto_cache
The `auto_cache` flag allows you to explicitly control the caching behavior for your requests on supported providers. This gives you finer-grained control over when a request's response should be cached or retrieved from the cache.
How Auto Cache Works
The `auto_cache` flag is a boolean parameter that can be sent within a custom `requesty` field in your request payload.
"auto_cache": true
: This will instruct the router to attempt to cache the response from the provider. If a similar request has been cached previously, it might be served from the cache (depending on the provider’s caching strategy and TTL)."auto_cache": false
: This will instruct the router to bypass any automatic caching logic for this specific request and always fetch a fresh response from the provider.- If
auto_cache
is not provided: The router falls back to a default caching behavior which can depend on the origin of the request (e.g., calls from Cline or Roo Code default to caching).
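As a sketch, the three cases above correspond to request bodies like the following (built here as Python dicts, with the surrounding request fields omitted for brevity):

```python
# Explicitly ask the router to cache this response (provider permitting).
cache_on = {"requesty": {"auto_cache": True}}

# Explicitly bypass caching and always fetch a fresh response.
cache_off = {"requesty": {"auto_cache": False}}

# Omit the field entirely to fall back to the origin-dependent default.
default_behavior = {}

for body in (cache_on, cache_off, default_behavior):
    # None means the flag was not sent, so the router's default applies.
    print(body.get("requesty", {}).get("auto_cache"))
```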
How to Use Auto Cache
To use the `auto_cache` flag, include it within the `requesty` object in your request.
Example with Auto Cache
This example demonstrates how to set the `auto_cache` flag using the OpenAI Python client. The `requesty` field is passed as an additional parameter.
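The original code sample did not survive extraction, so the following is a minimal Python sketch of what such a request can look like. The model name and message are placeholders, and the commented client call assumes the OpenAI Python client's `extra_body` parameter for passing the `requesty` field:

```python
import json

# Chat-completion payload carrying the custom `requesty` field.
# The model name is illustrative; use one your router actually serves.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "requesty": {"auto_cache": True},  # explicitly request caching
}

# With the OpenAI Python client, the same field travels in `extra_body`
# (client construction with your router's base URL and API key omitted):
#
#   response = client.chat.completions.create(
#       model=payload["model"],
#       messages=payload["messages"],
#       extra_body={"requesty": {"auto_cache": True}},
#   )

print(json.dumps(payload, indent=2))
```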
Important Notes
- Explicit Control: `auto_cache` provides explicit control. `true` attempts to cache the response; `false` prevents caching for providers where cache writes incur extra costs.
- Default Behavior: If `auto_cache` is not specified in the `requesty` field, the caching behavior reverts to defaults.
- Provider Support: This flag is respected by providers/models where cache writes incur extra costs, e.g. Anthropic and Gemini.