POST /api/v1/chat/completions
curl --request POST \
  --url https://api.crun.ai/api/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Write a short tagline for a coffee brand."
    }
  ],
  "temperature": 0.7
}
'
{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1772294400,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Bright flavor, smooth finish, and roasted for everyday focus."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 13,
    "total_tokens": 31
  }
}

API Endpoint

POST https://api.crun.ai/api/v1/chat/completions
This endpoint follows the OpenAI Chat Completions request and response format. It is designed for direct use with the official OpenAI SDKs and other OpenAI-compatible clients.
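
For clients that do not use an SDK, the request can be assembled with the Python standard library alone. The sketch below is a minimal illustration, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and only the URL, headers, and body fields come from this page. Building the request is kept separate from sending it so the payload can be inspected without network access.

```python
import json
import urllib.request

API_URL = "https://api.crun.ai/api/v1/chat/completions"


def build_chat_request(api_key, model, messages, **params):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    # Extra keyword arguments (temperature, top_p, ...) are merged into the body.
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Build a request equivalent to the curl example; nothing is sent here.
req = build_chat_request(
    "YOUR_API_KEY",
    "gpt-4o-mini",
    [{"role": "user", "content": "Write a short tagline for a coffee brand."}],
    temperature=0.7,
)
```

Calling `send(req)` with a real API key would perform the request and return the decoded response body.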

Authentication

You can authenticate with either:
  • Authorization: Bearer YOUR_API_KEY
  • X-API-KEY: YOUR_API_KEY
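
The two header forms above could be produced by a small helper like the one below. This is a sketch under assumptions: the function name `auth_headers` and the `scheme` values are hypothetical and exist only for illustration; the header names themselves are the ones listed above.

```python
def auth_headers(api_key, scheme="bearer"):
    """Return the authentication header for one of the two accepted schemes."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if scheme == "bearer":
        # Form used by OpenAI-compatible SDKs.
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"X-API-KEY": api_key}
    raise ValueError(f"unsupported scheme: {scheme!r}")


bearer = auth_headers("YOUR_API_KEY")
api_key_header = auth_headers("YOUR_API_KEY", scheme="x-api-key")
```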

Request Examples

curl -X POST "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Summarize the benefits of serverless functions in 3 bullet points."
      }
    ],
    "temperature": 0.6
  }'

Conversation History Example

CRUN does not store conversation context between requests. To continue a conversation, keep the message history in your application and include the relevant prior messages in each new request.
curl -X POST "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful support assistant."
      },
      {
        "role": "user",
        "content": "My package has not arrived yet. What should I check first?"
      },
      {
        "role": "assistant",
        "content": "First, check the tracking link, delivery address, and any carrier delay notices."
      },
      {
        "role": "user",
        "content": "The tracking page says delayed by weather. Draft a short reply to the customer."
      }
    ],
    "temperature": 0.6
  }'
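
One way to keep that history on the client side is a small container that records each turn and rebuilds the `messages` array for every request. The sketch below is illustrative only: the `Conversation` class and its `max_turn_messages` trimming rule are assumptions, not part of the API, and a real application may prefer token-based trimming.

```python
class Conversation:
    """Client-side message history for a multi-turn chat."""

    def __init__(self, system_prompt, max_turn_messages=20):
        if max_turn_messages < 1:
            raise ValueError("max_turn_messages must be at least 1")
        self.system = {"role": "system", "content": system_prompt}
        self.turns = []
        self.max_turn_messages = max_turn_messages

    def add_user(self, text):
        self.turns.append({"role": "user", "content": text})

    def add_assistant_from_response(self, response):
        # Store the assistant reply from a completed (non-streaming) response.
        message = response["choices"][0]["message"]
        self.turns.append({"role": "assistant", "content": message["content"]})

    def messages(self):
        # Always keep the system prompt; keep only the most recent turns.
        return [self.system] + self.turns[-self.max_turn_messages:]


convo = Conversation("You are a helpful support assistant.")
convo.add_user("My package has not arrived yet. What should I check first?")
convo.add_assistant_from_response(
    {"choices": [{"message": {"role": "assistant",
                              "content": "First, check the tracking link."}}]}
)
convo.add_user("The tracking page says delayed by weather. Draft a short reply.")
next_request_messages = convo.messages()
```

`next_request_messages` is what would be sent as the `messages` field of the next request.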

Structured Output Example

Use response_format when you need the model to return JSON. For schema-enforced structured outputs, use a model and upstream provider that support json_schema.
curl -X POST "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "system",
        "content": "Extract tasks as JSON only."
      },
      {
        "role": "user",
        "content": "Maya should send the launch brief by Friday."
      }
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "task",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "owner": { "type": "string" },
            "task": { "type": "string" },
            "due": { "type": "string" }
          },
          "required": ["owner", "task", "due"],
          "additionalProperties": false
        }
      }
    }
  }'
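
With a schema like the one above, the assistant message content is expected to be a JSON string, so the client still has to decode it. The sketch below assumes the standard Chat Completions response shape shown elsewhere on this page; the function name `parse_task` and the sample response are hypothetical. It re-checks the required keys defensively, since schema enforcement depends on the model and upstream provider.

```python
import json

REQUIRED_KEYS = ("owner", "task", "due")


def parse_task(response):
    """Decode the JSON content of the first choice and check required keys."""
    content = response["choices"][0]["message"]["content"]
    task = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    missing = [key for key in REQUIRED_KEYS if key not in task]
    if missing:
        raise ValueError(f"missing keys in structured output: {missing}")
    return task


# Hypothetical response body, for illustration only.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": json.dumps(
                    {"owner": "Maya", "task": "Send the launch brief", "due": "Friday"}
                ),
            },
            "finish_reason": "stop",
        }
    ]
}
task = parse_task(sample_response)
```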

Tool Calling Example

Chat Completions accepts OpenAI-compatible tools and tool_choice fields. Your application is responsible for executing tool calls and sending tool results back in a follow-up request.
request body
{
  "model": "gpt-5.4",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Paris?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string"
            }
          },
          "required": ["city"],
          "additionalProperties": false
        }
      }
    }
  ],
  "tool_choice": "auto"
}
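
The follow-up step can be sketched as below. This assumes the response carries tool calls in the OpenAI Chat Completions shape (`message.tool_calls[*].function.name` and a JSON-encoded `arguments` string) and that results are returned as `role: "tool"` messages keyed by `tool_call_id`; this page does not show that response, so treat the shape as an assumption. The local `get_weather` implementation and the sample response are hypothetical.

```python
import json


def get_weather(city):
    # Hypothetical local implementation; a real one would query a weather service.
    return {"city": city, "forecast": "sunny", "temp_c": 21}


TOOLS = {"get_weather": get_weather}


def tool_followup_messages(messages, response):
    """Run requested tools and return the messages for the follow-up request."""
    assistant = response["choices"][0]["message"]
    followup = list(messages) + [assistant]
    for call in assistant.get("tool_calls") or []:
        name = call["function"]["name"]
        handler = TOOLS.get(name)
        if handler is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            result = handler(**json.loads(call["function"]["arguments"]))
        followup.append(
            {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
        )
    return followup


original = [{"role": "user", "content": "What is the weather like in Paris?"}]
# Hypothetical response in the assumed OpenAI-compatible shape.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_1",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": json.dumps({"city": "Paris"}),
                        },
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ]
}
followup = tool_followup_messages(original, sample_response)
```

The `followup` list would be sent as `messages` in the next request, together with the same `tools` definitions.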

Vision Input Example

When the selected model supports image input, pass multimodal content parts in the user message.
request body
{
  "model": "gpt-5.4",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Describe this chart in one paragraph."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/chart.png"
          }
        }
      ]
    }
  ],
  "max_completion_tokens": 300
}
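
The multimodal message above can be assembled programmatically. In the sketch below, `vision_message` mirrors the content-part structure from the example; `image_data_url` encodes local image bytes as a base64 data URL, which is a common OpenAI-compatible convention but is an assumption here, since this page only shows an `https` URL. Both function names are hypothetical.

```python
import base64


def image_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URL (support for this is an assumption)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def vision_message(prompt, image_urls):
    """Build a user message with one text part followed by image parts."""
    parts = [{"type": "text", "text": prompt}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": parts}


message = vision_message(
    "Describe this chart in one paragraph.",
    ["https://example.com/chart.png"],
)
```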

Streaming Example

Set stream=true to receive Server-Sent Events. CRUN enables usage reporting for streaming requests when possible.
curl -N "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Write a three-step checklist for reviewing an API integration."
      }
    ],
    "stream_options": {
      "include_usage": true
    }
  }'
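
A client consuming the stream needs to parse the Server-Sent Events lines. The sketch below assumes the OpenAI-compatible chunk format: each event is a `data: {...}` line whose `choices[*].delta.content` carries incremental text, usage arrives in a chunk with a `usage` object, and the stream ends with `data: [DONE]`. This page does not show a streamed response, so those details and the sample lines are assumptions. The function `collect_stream` is hypothetical and only follows choice index 0.

```python
import json


def collect_stream(lines):
    """Accumulate streamed text and usage from an iterable of SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices") or []:
            if choice.get("index", 0) != 0:
                continue
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
    return "".join(text_parts), usage


# Hypothetical stream lines in the assumed format.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " world"}}]}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 3, "completion_tokens": 2, "total_tokens": 5}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample_lines)
```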

Response Examples

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1772294400,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "- No server management.\n- Automatic scaling for bursty traffic.\n- Pay only for actual execution time."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 19,
    "total_tokens": 41
  }
}
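
Reading such a response usually comes down to the first choice and the usage block. The sketch below uses the example response above as input; the helper name `summarize_completion` is hypothetical. It treats `usage` as optional because the response schema on this page does not mark it as required, and it relies on the OpenAI convention that a `finish_reason` of `"length"` means the output hit the token limit.

```python
def summarize_completion(response):
    """Extract the reply text, finish reason, and token total from a response."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    finish_reason = choice.get("finish_reason")
    return {
        "content": choice["message"]["content"],
        "finish_reason": finish_reason,
        # "length" indicates the reply was cut off by the output token limit.
        "truncated": finish_reason == "length",
        "total_tokens": usage.get("total_tokens"),
    }


example_response = {
    "id": "chatcmpl_abc123",
    "object": "chat.completion",
    "created": 1772294400,
    "model": "gpt-5.4",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "- No server management.\n- Automatic scaling for bursty traffic.\n- Pay only for actual execution time.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 22, "completion_tokens": 19, "total_tokens": 41},
}
summary = summarize_completion(example_response)
```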

Notes

  • Set stream=true to receive a text/event-stream response.
  • If you omit stream_options.include_usage, CRUN enables it automatically for streaming requests.
  • max_tokens and max_completion_tokens are both accepted and are capped by the selected model’s output token limit.
  • Unknown model IDs return an OpenAI-style error body with code: "model_not_found".
  • For new stateful, tool-heavy, or structured-output workflows, also consider the /responses endpoint. Responses uses input, instructions, and text.format instead of messages and response_format.
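
The `model_not_found` case can be detected by inspecting the error body. The sketch below assumes the usual OpenAI error envelope, `{"error": {"message": ..., "code": ...}}`; this page only names the `code` value, so the surrounding structure and the sample message text are assumptions. The helper name `parse_error` is hypothetical.

```python
import json


def parse_error(body_text):
    """Return (code, message) from an OpenAI-style error body, else (None, None)."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return None, None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None, None
    return error.get("code"), error.get("message")


# Hypothetical error body; only the code value is documented above.
sample_body = json.dumps(
    {"error": {"message": "Unknown model.", "code": "model_not_found"}}
)
code, message = parse_error(sample_body)
is_unknown_model = code == "model_not_found"
```

With `urllib`, the body text would typically come from reading the `urllib.error.HTTPError` raised for a non-2xx response.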

Authorizations

Authorization
string
header
required

Use your CRUN API key as a Bearer token for OpenAI-compatible SDKs.

Body

application/json

OpenAI-compatible Chat Completions request. Additional compatible fields are accepted and passed through when supported by the upstream model.
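
The field constraints documented below can be checked on the client before sending a request. The sketch that follows is a hypothetical pre-flight validator, not server behavior: the function name `validate_request` and the error strings are illustrative, and only the ranges themselves are taken from the field descriptions on this page.

```python
# (low, high) bounds per field; None means no documented upper bound.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "n": (1, 8),
    "max_tokens": (1, None),
    "max_completion_tokens": (1, None),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}
INTEGER_FIELDS = {"n", "max_tokens", "max_completion_tokens"}


def validate_request(body):
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    model = body.get("model")
    if not isinstance(model, str) or not 1 <= len(model) <= 128:
        errors.append("model: must be a string of length 1 to 128")
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages: must be a non-empty array")
    for name, (low, high) in RANGES.items():
        if name not in body:
            continue
        value = body[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: must be a number")
            continue
        if name in INTEGER_FIELDS and not isinstance(value, int):
            errors.append(f"{name}: must be an integer")
            continue
        if value < low or (high is not None and value > high):
            errors.append(f"{name}: out of range")
    return errors


ok = validate_request(
    {"model": "gpt-4o-mini",
     "messages": [{"role": "user", "content": "Hi"}],
     "temperature": 0.7}
)
problems = validate_request({"model": "", "messages": [], "temperature": 3, "n": 9})
```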

model
string
required

Public model ID returned by GET /api/v1/models.

Required string length: 1 - 128
Example:

"gpt-4o-mini"

messages
object[]
required

Conversation messages in OpenAI format.

Minimum array length: 1

temperature
number

Sampling temperature.

Required range: 0 <= x <= 2
Example:

0.7

top_p
number

Nucleus sampling value.

Required range: 0 <= x <= 1
Example:

1

n
integer
default: 1

Number of completion choices to generate.

Required range: 1 <= x <= 8
Example:

1

stream
boolean
default: false

Whether to return a Server-Sent Events stream.

Example:

false

stop

Stop sequence or list of stop sequences.

Example:

"###"

max_tokens
integer

Maximum output tokens. Capped by the selected model.

Required range: x >= 1
Example:

512

max_completion_tokens
integer

Maximum completion tokens. Capped by the selected model.

Required range: x >= 1
Example:

512

presence_penalty
number

Presence penalty value.

Required range: -2 <= x <= 2
Example:

0

frequency_penalty
number

Frequency penalty value.

Required range: -2 <= x <= 2
Example:

0

user
string

End-user identifier passed for observability.

Example:

"user_123"

stream_options
object

Streaming options. When stream=true, CRUN enables include_usage=true by default if omitted.

Example:

{ "include_usage": true }

response_format
object

Structured output option in OpenAI-compatible format. Chat Completions uses response_format; Responses uses text.format.

Example:

{ "type": "json_object" }

tools
object[]

Tool definitions in OpenAI-compatible format. Your application executes tool calls and returns tool results in a follow-up request.

tool_choice

Tool selection strategy.

Example:

"auto"

Response

Successful completion response. Returns JSON when stream=false, or SSE when stream=true.

id
string
required

Example:

"chatcmpl_abc123"

object
string
required

Example:

"chat.completion"

created
integer
required

Unix timestamp in seconds.

Example:

1772294400

model
string
required

Public model ID requested by the client.

Example:

"gpt-4o-mini"

choices
object[]
required

usage
object