Crun Responses API
OpenAI-compatible /responses endpoint for CRUN language models, with flexible input, stateful follow-ups, structured outputs, streaming, and tool-ready response items.
API Endpoint
Authentication
You can authenticate with either:Authorization: Bearer YOUR_API_KEYX-API-KEY: YOUR_API_KEY
When to Use Responses
Use/responses when you want:
- Top-level
instructionsinstead of a system message. - A flexible
inputfield that can be a string or a list of message-like items. - Stateful follow-ups with
previous_response_idwhen the upstream model supports stored responses. - Structured outputs through
text.format. - Tool-ready response items and
tools/tool_choicefields.
/chat/completions when you need the classic messages -> choices[] -> message shape for an existing chat
integration.
Basic Text Generation
Message-Style Input
Theinput field can also contain a list of message-like items, which makes migration from Chat Completions
straightforward.
Image Input
When the selected model supports vision, pass images in thecontent array with type: "input_image". You can use a
public image URL, a Base64 data URL for vision.
File Input
For PDFs and other uploaded files supported by the upstream model, upload the file first and pass it as aninput_file
content part.
Stateful Follow-Up
When the upstream model supports stored responses, setstore=true and pass previous_response_id to continue from a
previous response without resending all context.
Structured Output
For Responses, structured output usestext.format. This differs from Chat Completions, where the equivalent field is
response_format.
Tool-Ready Request
OpenAI’s Responses API supports built-in tools and function-style tools. CRUN accepts compatibletools and
tool_choice fields and forwards them when the selected upstream model supports them.
Streaming
Setstream=true to receive Server-Sent Events from /responses. Each event includes an event: line followed by a
JSON data: payload.
Response Examples
Notes
max_output_tokensis capped by the selected model’s output token limit.- Additional OpenAI-compatible fields are accepted and passed through when supported by the upstream model.
- Structured outputs use
text.format; Chat Completions usesresponse_format. - Tool availability depends on the selected model and upstream provider.
- Unknown model IDs return an OpenAI-style error body with
code: "model_not_found".
Related Resources
LLM Quickstart
Chat Completions API
Pricing
Authorizations
Use your CRUN API key as a Bearer token for OpenAI-compatible SDKs.
Body
OpenAI-compatible Responses request. Additional compatible fields are accepted and passed through when supported by the upstream model.
Public model ID returned by GET /api/v1/models.
1 - 128"gpt-5.4"
Model input. Can be a string or a list of response input items; supports text, image, and file content parts depending on the selected model and upstream provider.
"Write a one-sentence launch caption for a noise-canceling headset."
System or developer instructions for the model.
"You are a concise product copywriter."
Whether to return a Server-Sent Events stream.
false
Whether the upstream provider should store the response for stateful follow-ups, when supported.
true
ID of a previous stored response to continue from, when supported.
"resp_abc123"
Developer-defined metadata.
Maximum output tokens. Capped by the selected model.
x >= 1512
Sampling temperature.
0 <= x <= 20.7
Nucleus sampling value.
0 <= x <= 11
End-user identifier passed for observability.
"user_123"
Prompt cache key passed to the upstream provider when supported.
Prompt cache retention policy when supported.
in-memory, 24h Reasoning model configuration.
{ "effort": "medium" }Text output configuration. Use text.format for structured outputs in the Responses API.
{ "format": { "type": "json_object" } }Tool definitions in OpenAI-compatible Responses format.
Tool selection strategy.
"auto"
Whether to allow parallel tool calls when supported.
true
Context truncation strategy when supported.
"auto"
Response
Successful response. Returns JSON when stream=false, or SSE when stream=true.
"resp_abc123"
"response"
Unix timestamp in seconds.
1772294400
"completed"
Public model ID requested by the client.
"gpt-5.4"
Convenience text field when provided by the upstream provider.
"Launch focus anywhere with a headset that quiets distractions and keeps calls crisp."
