Endpoints
Requesty provides two main endpoints:
- Chat Completions (/v1/chat/completions): For generating text completions and conversations with AI models.
- Embeddings (/v1/embeddings): For creating vector embeddings from text, which can be used for semantic search, similarity matching, and other AI applications.
Chat Completions Request Structure
Your request body to /v1/chat/completions closely follows the OpenAI Chat Completion schema:
- Required Fields:
  - messages: An array of message objects with role and content. Roles can be user, assistant, system, or tool.
  - model: The model name. If omitted, defaults to the user's or payer's default model. Here is a full list of the supported models.
- Optional Fields:
  - prompt: Alternative to messages for some providers.
  - stream: A boolean to enable Server-Sent Events (SSE) streaming responses.
  - max_tokens, temperature, top_p, etc.: Standard language model parameters.
  - tools / functions: Allows function calling with a schema defined. See OpenAI's function calling documentation for the structure of these requests.
  - tool_choice: Specifies how tool calling should be handled.
  - response_format: For structured responses (some models only).
Example Request Body
The example below includes a single tool (get_current_weather) that the model can call if it decides the user request involves weather data.
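A sketch of such a request body (the model name, message content, and weather-tool schema are illustrative placeholders, not required values):

```python
# Illustrative body for POST /v1/chat/completions; all values are placeholders.
request_body = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather like in Amsterdam today?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a given city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"},
                    },
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
    "temperature": 0.7,
    "max_tokens": 512,
}
```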
Some request fields require a different client method. For example, if you use response_format you'll need to switch the request to client.beta.chat.completions.parse, and you may want to define your structure as a Pydantic (Python) or Zod (TypeScript) model.
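A minimal sketch with the Python OpenAI SDK (the base URL, API key placeholder, model name, and WeatherReport schema are illustrative assumptions):

```python
from openai import OpenAI
from pydantic import BaseModel

# Base URL and API key are illustrative; point the client at your Requesty router.
client = OpenAI(api_key="<REQUESTY_API_KEY>", base_url="https://router.requesty.ai/v1")

class WeatherReport(BaseModel):
    city: str
    temperature_celsius: float
    summary: str

completion = client.beta.chat.completions.parse(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Describe today's weather in Amsterdam."}],
    response_format=WeatherReport,  # the Pydantic model doubles as the output schema
)
report = completion.choices[0].message.parsed  # a WeatherReport instance
```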
Response Structure
The response is normalized to an OpenAI-style ChatCompletion object:- Streaming: If
stream: true
, responses arrive incrementally as SSE events withdata: lines
. See Streaming for documentation on streaming. - Function Calls (Tool Calls): If the model decides to call a tool, it will return a
function_call
in the assistant message. You then execute the tool, append the tool’s result as arole: "tool"
message, and send a follow-up request. The LLM will then integrate the tool output into its final answer.
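A sketch of that round trip, continuing the earlier examples (client, request_body, and the get_current_weather stub are assumptions; note that the current Python SDK surfaces the invocation under tool_calls):

```python
import json

messages = request_body["messages"]
tools = request_body["tools"]

def get_current_weather(city: str) -> dict:
    # Stand-in for your real weather lookup.
    return {"city": city, "temperature_celsius": 18, "summary": "sunny"}

first = client.chat.completions.create(model="openai/gpt-4o", messages=messages, tools=tools)
msg = first.choices[0].message

if msg.tool_calls:  # the model chose to call a tool
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_current_weather(**args)

    messages.append(msg)  # the assistant message containing the tool call
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })

    # Follow-up request: the model integrates the tool output into its final answer.
    final = client.chat.completions.create(model="openai/gpt-4o", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```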
Non-Streaming Response Example
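A non-streaming response has the shape of an OpenAI ChatCompletion object; the IDs, token counts, and content below are placeholders:

```python
# Shape of a non-streaming response; every value here is a placeholder.
example_response = {
    "id": "chatcmpl-...",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "openai/gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "It is 18°C and sunny in Amsterdam."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 42, "completion_tokens": 15, "total_tokens": 57},
}
```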
Embeddings Request Structure
Your request body to /v1/embeddings follows the OpenAI Embeddings schema:
- Required Fields:
  - input: The text to embed. Can be a string, array of strings, array of tokens, or array of token arrays.
  - model: The model name to use for embedding generation (e.g., openai/text-embedding-3-small).
- Optional Fields:
  - dimensions: The number of dimensions for the output embeddings (only supported in text-embedding-3 and later models).
  - encoding_format: The format to return embeddings in (float or base64, defaults to float).
  - user: A unique identifier representing your end-user.
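A minimal embeddings request sketch with the Python SDK (reusing the client configuration assumed earlier; the model choice, input text, and dimensions value are illustrative):

```python
embedding = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input=["Requesty routes requests to many AI providers."],
    dimensions=256,           # optional; text-embedding-3 and later models only
    encoding_format="float",  # optional; defaults to "float"
)
vector = embedding.data[0].embedding  # list of floats
```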