Base URL
Authentication
Include your Requesty API key in the request headers using Anthropic's standard format:

Headers
| Header | Required | Description |
|---|---|---|
| `x-api-key` | Yes | Your Requesty API key (Anthropic format) |
| `Content-Type` | Yes | Must be `application/json` |
| `anthropic-version` | No | API version (defaults to `2023-06-01`) |
Example Request
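For illustration, here is a minimal request sketched in Python with the `requests` library. The endpoint URL shown is an assumption; verify the exact base URL in your Requesty dashboard.

```python
import requests

# Assumed endpoint; confirm the base URL in your Requesty dashboard.
API_URL = "https://router.requesty.ai/v1/messages"

headers = {
    "x-api-key": "YOUR_REQUESTY_API_KEY",
    "Content-Type": "application/json",
    "anthropic-version": "2023-06-01",  # optional; this is the default
}

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,  # required, unlike OpenAI's API
    "messages": [
        {"role": "user", "content": "Hello, how are you?"},
    ],
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json()["content"][0]["text"])
```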
Model Selection
You can use any model available in the Model Library. Examples:

- Anthropic Models: `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-7-sonnet`
- OpenAI Models: `openai/gpt-4o`, `openai/gpt-4o-mini`
- Google Models: `google/gemini-2.0-flash-exp`
- Other Providers: `mistral/mistral-large-2411`, `meta/llama-3.3-70b-instruct`
Streaming
Enable streaming responses by setting `stream: true`:
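A sketch of consuming the resulting server-sent event stream with `requests` (same assumed endpoint as above); in Anthropic's streaming format, text arrives incrementally in `content_block_delta` events:

```python
import json
import requests

API_URL = "https://router.requesty.ai/v1/messages"  # assumed endpoint
headers = {"x-api-key": "YOUR_REQUESTY_API_KEY", "Content-Type": "application/json"}

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "stream": True,
    "messages": [{"role": "user", "content": "Tell me a short story."}],
}

with requests.post(API_URL, headers=headers, json=payload, stream=True) as response:
    for line in response.iter_lines():
        # SSE frames look like "event: ..." and "data: {...}"; keep only data lines.
        if not line.startswith(b"data: "):
            continue
        event = json.loads(line[len(b"data: "):])
        if event.get("type") == "content_block_delta":
            print(event["delta"].get("text", ""), end="", flush=True)
```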
Vision Support
Send images using the content blocks format:
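For example, a base64-encoded image can be passed as an `image` content block alongside a text block (same assumed endpoint as above):

```python
import base64
import requests

API_URL = "https://router.requesty.ai/v1/messages"  # assumed endpoint
headers = {"x-api-key": "YOUR_REQUESTY_API_KEY", "Content-Type": "application/json"}

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": "image/jpeg", "data": image_b64},
            },
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json()["content"][0]["text"])
```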
PDF Support
You can send PDFs, encoded in base64 format:
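A sketch using Anthropic's `document` content block for a base64-encoded PDF (assumed endpoint as above):

```python
import base64
import requests

API_URL = "https://router.requesty.ai/v1/messages"  # assumed endpoint
headers = {"x-api-key": "YOUR_REQUESTY_API_KEY", "Content-Type": "application/json"}

with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "base64", "media_type": "application/pdf", "data": pdf_b64},
            },
            {"type": "text", "text": "Summarize this document."},
        ],
    }],
}

response = requests.post(API_URL, headers=headers, json=payload)
```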
Tool Use
Define tools that the model can call:
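A sketch with a single hypothetical `get_weather` tool; when the model decides to call it, the response carries `stop_reason: "tool_use"` and a `tool_use` content block:

```python
import requests

API_URL = "https://router.requesty.ai/v1/messages"  # assumed endpoint
headers = {"x-api-key": "YOUR_REQUESTY_API_KEY", "Content-Type": "application/json"}

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "tools": [{
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a given location",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "City name"}},
            "required": ["location"],
        },
    }],
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
}

result = requests.post(API_URL, headers=headers, json=payload).json()
if result.get("stop_reason") == "tool_use":
    call = next(b for b in result["content"] if b["type"] == "tool_use")
    print(call["name"], call["input"])  # e.g. get_weather {'location': 'Paris'}
```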
System Prompts
Include system instructions using the `system` parameter:
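For example, the payload below sets the system instruction as a top-level field (sent with `requests.post` exactly as in the earlier examples):

```python
payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    # System prompts live in a top-level field, not in the messages array.
    "system": "You are a helpful assistant that answers concisely.",
    "messages": [{"role": "user", "content": "Explain HTTP caching."}],
}
```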
Error Handling
The API returns standard HTTP status codes:

- `200` - Success
- `400` - Bad Request (invalid parameters)
- `401` - Unauthorized (invalid API key)
- `403` - Forbidden (insufficient permissions)
- `429` - Rate Limited
- `500` - Internal Server Error
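A minimal handling sketch: back off and retry on `429`, and surface the other error codes as exceptions (assumed endpoint as above):

```python
import time
import requests

API_URL = "https://router.requesty.ai/v1/messages"  # assumed endpoint
headers = {"x-api-key": "YOUR_REQUESTY_API_KEY", "Content-Type": "application/json"}
payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}

for attempt in range(3):
    response = requests.post(API_URL, headers=headers, json=payload)
    if response.status_code == 429:
        time.sleep(2 ** attempt)  # rate limited: exponential backoff, then retry
        continue
    response.raise_for_status()  # raises on 400/401/403/500
    break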
Response Format
Successful responses follow the Anthropic Messages format:
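An illustrative response shape, shown as a Python dict (the field values here are made up):

```python
{
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",  # illustrative ID
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! I'm doing well, thanks."}],
    "model": "anthropic/claude-sonnet-4-20250514",
    "stop_reason": "end_turn",
    "stop_sequence": None,
}
```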
Key Differences from OpenAI Chat Completions

- Authentication: Uses the `x-api-key` header instead of `Authorization: Bearer`
- Required `max_tokens`: Unlike OpenAI's API, the `max_tokens` parameter is required
- Content Blocks: Messages use content blocks for rich content (text, images, tool calls)
- System Parameter: System prompts are specified as a separate `system` parameter, not as a message
- Role Restrictions: Only `user` and `assistant` roles are supported in messages (no `system` role); see the sketch below
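As a quick illustration, here is the same request sketched in both styles:

```python
# OpenAI Chat Completions style (for contrast): the system prompt is a
# message, and max_tokens is optional.
openai_payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Hi"},
    ],
}

# Anthropic Messages style: the Authorization header is replaced by
# x-api-key, system is a top-level field, max_tokens is required, and
# messages contain only user/assistant turns.
anthropic_payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": "You are terse.",
    "messages": [{"role": "user", "content": "Hi"}],
}
```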
Headers

| Header | Type | Description |
|---|---|---|
| `x-api-key` | string (required) | Your Requesty API key |
| `anthropic-version` | string | The version of the Anthropic API to use. Default: `"2023-06-01"` |
Body

| Parameter | Type | Description |
|---|---|---|
| `model` | string (required) | The model to use for the completion. Example: `"anthropic/claude-sonnet-4-20250514"` |
| `max_tokens` | integer (required) | The maximum number of tokens to generate before stopping. Constraint: `x >= 1`. Example: `1024` |
| `messages` | array (required) | Input messages |
| `system` | string | System prompt to be used for the completion |
| `temperature` | number | Amount of randomness injected into the response. Constraint: `0 <= x <= 2` |
| `top_p` | number | Use nucleus sampling. Constraint: `0 <= x <= 1` |
| `top_k` | integer | Only sample from the top K options for each subsequent token. Constraint: `x >= 0` |
| `stream` | boolean | Whether to incrementally stream the response using server-sent events |
| `stop_sequences` | array | Custom text sequences that will cause the model to stop generating |
| `tools` | array | Definitions of tools that the model may use |
| `tool_choice` | object | How the model should use the provided tools. One of `auto`, `any` |

Response
Message response

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique object identifier |
| `type` | string | Object type. Always `message` |
| `role` | string | Conversational role of the generated message. Always `assistant` |
| `content` | array | Content generated by the model, as a list of content blocks (several block variants are possible, e.g. text and tool-use blocks) |
| `model` | string | The model that handled the request |
| `stop_reason` | string | The reason that we stopped. One of `end_turn`, `max_tokens`, `stop_sequence`, `tool_use` |
| `stop_sequence` | string or null | Which custom stop sequence was generated |