Overview
CRUN provides an OpenAI-compatible language model gateway underhttps://api.crun.ai/api/v1.
You can keep using the official OpenAI SDKs, Anthropic SDKs, or any client already compatible with OpenAI-style chat APIs, by only changing the base URL and API key.
Supported compatibility endpoints:
GET /models: list currently available public model IDsPOST /responses: OpenAI’s latest API, recommended endpoint for new stateful, multimodal, and tool-using LLM workflowsPOST /messages: Anthropic-compatible endpoint for Claude-style message clientsPOST /chat/completions: OpenAI-compatible chat generation endpoint for existing chat clientsPOST /completions: legacy text completions endpoint, without stream support in this compatibility layer
Base URL
Authentication
You can authenticate in either of these ways:Authorization: Bearer YOUR_API_KEYX-API-KEY: YOUR_API_KEY
Authorization: Bearer YOUR_API_KEY.
Step 1: List Available Models
The exact model catalog may change over time. Always call/models to retrieve the latest public model IDs before integrating or switching models.
Example Response
Step 2: Make Your First Chat Completion
Use/chat/completions when you already have an OpenAI Chat Completions client or want the classic messages and choices response shape. It follows the OpenAI Chat Completions request and response structure, so existing SDK integrations usually only need a base URL change.
Example Response
Step 3: Use the Responses API
Use/responses for new integrations that benefit from OpenAI’s newer Responses shape: top-level instructions, flexible input, previous_response_id for stateful follow-ups, text.format for structured outputs, and tools when supported by the selected upstream model.
Example Response
Managing Conversation History
CRUN does not store or manage conversation state for you. If you want the model to remember previous turns, your application must keep the relevant message history and send it again in each new/chat/completions request.
Streaming
Setstream=true to receive Server-Sent Events from /chat/completions.
If stream_options.include_usage is not provided, CRUN enables it automatically so the stream can include usage information near the end.
Example Response
stream response
Notes
max_tokensandmax_completion_tokensare both supported. The actual value is capped by the selected model’s output limit.max_output_tokensis supported on/responsesand is capped by the selected model’s output limit.- Additional OpenAI-compatible fields are accepted and passed through when supported by the upstream model.
- Billing is based on input and output token usage, with a model-specific minimum charge. The minimum value is usually 0.01.
- For new agentic or stateful work, prefer
/responses. For existing chat clients,/chat/completionsremains supported. The legacy/completionsendpoint is only provided for compatibility and does not supportstream=true.
Related Resources
Responses API
Build with flexible input, stateful follow-ups, structured outputs, and tool-ready response items.
Messages API
Use Anthropic-compatible messages, system prompts, tools, thinking, and streaming events.
Chat Completions API
Full API reference for request parameters, streaming, and OpenAI-style error responses.
Account Credits
Check your remaining credits before running large-volume language model workloads.
Pricing
Jump to the pricing page to compare billing across different models.
