Get guaranteed, schema-valid JSON from 300+ LLMs with a single API
Requesty makes every supported model speak structured JSON — from simple json_object mode to strict, schema-enforced json_schema mode. One API, consistent behavior, regardless of the underlying provider.
Supported providers include OpenAI, Anthropic, Google, and more. To confirm support for a specific model, check its supports_json_schema field in List Models.
Use json_schema whenever you need type-safe, parseable output. It eliminates the need for retry loops and manual validation — the model is constrained at the decoding level to only produce tokens that satisfy your schema.
JSON Schema mode gives you guaranteed structured output. You define the exact shape of the response, and the model is constrained to produce only valid output conforming to that schema.
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

# Define your schema as a Pydantic model
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

# Use the SDK's built-in parsing: it sends the schema
# and parses the response in one step
completion = client.beta.chat.completions.parse(
    model="openai/gpt-4.1",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
print(event.name)          # "Science Fair"
print(event.date)          # "Friday"
print(event.participants)  # ["Alice", "Bob"]
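For clients without a Pydantic-aware parsing helper, the same request can be expressed as a plain response_format payload. The sketch below builds that payload by hand for the CalendarEvent shape, using the envelope that appears in the framework examples on this page; the helper name json_schema_format is hypothetical and not part of any SDK.

```python
import json


def json_schema_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema in the response_format envelope for json_schema mode.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


# Hand-written schema mirroring the CalendarEvent model above.
calendar_event_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "date", "participants"],
    "additionalProperties": False,
}

response_format = json_schema_format("CalendarEvent", calendar_event_schema)
print(json.dumps(response_format, indent=2))
```

The resulting dictionary would be passed as the response_format argument of a chat completion request, and the reply content would then be parsed with json.loads.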
For models that don’t support json_schema, or when you just need valid JSON without strict schema enforcement, use json_object mode. The model will return valid JSON, but you’re responsible for instructing it on the desired structure via your prompt.
Chat Completions

import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

completion = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {
            "role": "system",
            "content": (
                "Extract entities from the text. Return JSON with this structure: "
                '{"people": [...], "places": [...], "dates": [...]}'
            ),
        },
        {
            "role": "user",
            "content": "John met Sarah in Paris on January 5th.",
        },
    ],
    response_format={"type": "json_object"},
)

data = json.loads(completion.choices[0].message.content)
print(data)
# {"people": ["John", "Sarah"], "places": ["Paris"], "dates": ["January 5th"]}

Responses API

import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

response = client.responses.create(
    model="openai/gpt-4.1",
    instructions=(
        "Extract entities from the text. Return JSON with this structure: "
        '{"people": [...], "places": [...], "dates": [...]}'
    ),
    input="John met Sarah in Paris on January 5th.",
    text={"format": {"type": "json_object"}},
)

data = json.loads(response.output_text)
print(data)
# {"people": ["John", "Sarah"], "places": ["Paris"], "dates": ["January 5th"]}
With json_object mode, you must instruct the model to produce JSON in your system or user message. The model is only guaranteed to return valid JSON — not any particular structure. Use json_schema mode for guaranteed structure.
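Because json_object mode does not enforce a structure, it can help to check the parsed result before using it. The following is a minimal sketch using only the standard library, assuming the people/places/dates shape requested in the prompt above; parse_entities is a hypothetical helper name.

```python
import json


def parse_entities(raw: str) -> dict:
    """Parse a json_object reply and check the shape the prompt asked for."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key in ("people", "places", "dates"):
        value = data.get(key)
        # Each expected key must be present and hold a list of strings.
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
    return data


# A sample reply standing in for completion.choices[0].message.content.
sample = '{"people": ["John", "Sarah"], "places": ["Paris"], "dates": ["January 5th"]}'
entities = parse_entities(sample)
print(entities["people"])  # ['John', 'Sarah']
```

A ValueError here signals that the model returned valid JSON in an unexpected shape, which is a reasonable point to retry or to switch to json_schema mode.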
# Example: sentiment analysis with a schema-constrained response
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

class SentimentAnalysis(BaseModel):
    sentiment: str  # "positive", "negative", "neutral"
    confidence: float
    key_phrases: list[str]
    summary: str

completion = client.beta.chat.completions.parse(
    model="anthropic/claude-sonnet-4-5",
    messages=[
        {
            "role": "system",
            "content": "Analyze the sentiment of the given text. Be precise with confidence scores.",
        },
        {
            "role": "user",
            "content": "The new product launch exceeded all expectations. Customer feedback has been overwhelmingly positive, though some users reported minor issues with the onboarding flow.",
        },
    ],
    response_format=SentimentAnalysis,
)

result = completion.choices[0].message.parsed
print(f"Sentiment: {result.sentiment} ({result.confidence:.0%})")
print(f"Key phrases: {result.key_phrases}")

# Example: step-by-step math solution with a schema-constrained response
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

class MathSolution(BaseModel):
    steps: list[str]
    final_answer: float

completion = client.beta.chat.completions.parse(
    model="openai/gpt-4.1",
    messages=[
        {
            "role": "system",
            "content": "Solve the math problem step by step. Show your work in the steps array.",
        },
        {
            "role": "user",
            "content": "A store has a 25% off sale. If a jacket originally costs $80, and there's an additional 10% member discount applied after the sale price, what's the final price?",
        },
    ],
    response_format=MathSolution,
)

solution = completion.choices[0].message.parsed
for i, step in enumerate(solution.steps, 1):
    print(f"Step {i}: {step}")
print(f"Answer: ${solution.final_answer}")
Requesty’s structured outputs work seamlessly with popular frameworks. Since Requesty is OpenAI-compatible, just point the framework’s OpenAI client at https://router.requesty.ai/v1.
LangChain

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
from pydantic import BaseModel

class Answer(BaseModel):
    championships: int
    summary: str

llm = ChatOpenAI(
    model="anthropic/claude-sonnet-4-5",
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)

# .with_structured_output() sends json_schema under the hood
structured_llm = llm.with_structured_output(Answer)

result = structured_llm.invoke([
    SystemMessage(content="Answer questions about NBA players."),
    HumanMessage(content="How many championships has LeBron James won?"),
])
print(f"{result.championships} championships")

Instructor

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(
    OpenAI(
        api_key="YOUR_REQUESTY_API_KEY",
        base_url="https://router.requesty.ai/v1",
    )
)

class UserInfo(BaseModel):
    name: str
    age: int
    email: str

user = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {"role": "user", "content": "John Doe is 30 years old. His email is john@example.com"},
    ],
    response_model=UserInfo,
)
print(f"{user.name}, {user.age}, {user.email}")

Pydantic AI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic import BaseModel

class CityInfo(BaseModel):
    name: str
    country: str
    population: int
    known_for: list[str]

provider = OpenAIProvider(
    api_key="YOUR_REQUESTY_API_KEY",
    base_url="https://router.requesty.ai/v1",
)
model = OpenAIChatModel("anthropic/claude-sonnet-4-5", provider=provider)

agent = Agent(
    model,
    system_prompt="Provide information about cities.",
    output_type=CityInfo,
)

result = agent.run_sync("Tell me about Tokyo")
print(f"{result.output.name}, {result.output.country}")
print(f"Known for: {', '.join(result.output.known_for)}")

LiteLLM

import json

import litellm

response = litellm.completion(
    model="openai/anthropic/claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": "Extract the key entities."},
        {"role": "user", "content": "Apple Inc. was founded by Steve Jobs in Cupertino."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "Entities",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "companies": {"type": "array", "items": {"type": "string"}},
                    "people": {"type": "array", "items": {"type": "string"}},
                    "locations": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["companies", "people", "locations"],
                "additionalProperties": False,
            },
        },
    },
    api_key="YOUR_REQUESTY_API_KEY",
    api_base="https://router.requesty.ai/v1",
)
print(json.loads(response.choices[0].message.content))

DSPy

import dspy

lm = dspy.LM(
    model="openai/anthropic/claude-sonnet-4-5",
    api_key="YOUR_REQUESTY_API_KEY",
    api_base="https://router.requesty.ai/v1",
)

class ExtractInfo(dspy.Signature):
    """Extract structured information from text."""

    text: str = dspy.InputField()
    companies: list[str] = dspy.OutputField(desc="Company names mentioned")
    people: list[str] = dspy.OutputField(desc="People mentioned")
    locations: list[str] = dspy.OutputField(desc="Locations mentioned")

with dspy.context(lm=lm):
    predict = dspy.Predict(ExtractInfo)
    result = predict(text="Sundar Pichai leads Google from their Mountain View office.")
    print(result.companies)  # ["Google"]
    print(result.people)     # ["Sundar Pichai"]

Haystack

import json

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    model="vertex/gemini-2.5-flash",
    api_key=Secret.from_token("YOUR_REQUESTY_API_KEY"),
    api_base_url="https://router.requesty.ai/v1",
    generation_kwargs={
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "Analysis",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "topic": {"type": "string"},
                        "sentiment": {"type": "string"},
                        "confidence": {"type": "number"},
                    },
                    "required": ["topic", "sentiment", "confidence"],
                    "additionalProperties": False,
                },
            },
        }
    },
)
generator.warm_up()

result = generator.run(messages=[
    ChatMessage.from_system("Analyze the topic and sentiment of the text."),
    ChatMessage.from_user("I absolutely love the new Python 3.13 release!"),
])
analysis = json.loads(result["replies"][0].text)
print(f"Topic: {analysis['topic']}, Sentiment: {analysis['sentiment']}")
Both json_object and json_schema work with streaming. Tokens are delivered incrementally, and you can parse the complete JSON once the stream finishes.
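The parse-at-the-end pattern can be sketched as follows. This is an illustration under stated assumptions: collect_json is a hypothetical helper, and the list of fragments is simulated so the example runs without a network call.

```python
import json
from typing import Iterable, Optional


def collect_json(deltas: Iterable[Optional[str]]) -> dict:
    """Join streamed text fragments and parse them once the stream ends."""
    buffer = []
    for delta in deltas:
        if delta:  # skip None or empty fragments
            buffer.append(delta)
    # A partial buffer is usually not valid JSON, so parse only at the end.
    return json.loads("".join(buffer))


# Simulated fragments standing in for the text deltas of a streamed reply.
fake_stream = ['{"people": ["Jo', 'hn"], ', None, '"places": ["Paris"]}']
result = collect_json(fake_stream)
print(result)  # {'people': ['John'], 'places': ['Paris']}
```

With the OpenAI Python SDK, the fragments would typically come from a chat completion created with stream=True, reading chunk.choices[0].delta.content from each chunk that carries a choice.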
Simpler schemas produce more reliable outputs. Flatten deeply nested structures where possible and limit arrays to 10–20 items in your schema descriptions.
Use descriptions for guidance
Add description fields to your schema properties to guide the model on what to extract:
{
  "type": "object",
  "properties": {
    "confidence": {
      "type": "number",
      "description": "A confidence score between 0.0 and 1.0"
    }
  }
}
Handle optional fields with null unions
Since strict mode requires all properties in required, use null unions for optional values:
from pydantic import BaseModel

class UserProfile(BaseModel):
    name: str
    email: str
    phone: str | None  # Will be null if not found
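When a schema is written by hand rather than generated from a Pydantic model, the same rule can be applied to the raw JSON Schema. The sketch below is one possible approach, not an official utility: nullable and to_strict are hypothetical helpers, and the type-array form of a null union is assumed to be accepted by the target model.

```python
import copy


def nullable(type_name: str) -> dict:
    """Return a property schema that accepts the given type or null."""
    return {"type": [type_name, "null"]}


def to_strict(schema: dict) -> dict:
    """Return a copy in which every object lists all of its properties
    as required and rejects additional properties."""
    strict = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["required"] = list(node["properties"])
                node["additionalProperties"] = False
            for child in node.values():
                visit(child)
        elif isinstance(node, list):
            for child in node:
                visit(child)

    visit(strict)
    return strict


# Hand-written equivalent of the UserProfile model above.
user_profile = to_strict({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "phone": nullable("string"),  # optional value expressed as a null union
    },
})
print(user_profile["required"])  # ['name', 'email', 'phone']
```

Every field stays in required, and the optional one is distinguished only by allowing null as a value.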
Combine with system prompts
The schema enforces structure, but your system prompt controls the content quality. Be specific about what you want extracted and how.
Use fallback policies for reliability
Combine structured outputs with fallback policies to automatically retry with a different model if the primary one fails: