OpenAI Compatible LLM

Overview

CRUN provides an OpenAI-compatible language model gateway at https://api.crun.ai/api/v1. You can keep using the official OpenAI SDKs, the Anthropic SDKs, or any client that already speaks OpenAI-style chat APIs; only the base URL and API key need to change.

Supported compatibility endpoints:

GET /models: list currently available public model IDs
POST /responses: OpenAI’s newest API shape and the recommended endpoint for new stateful, multimodal, and tool-using LLM workflows
POST /messages: Anthropic-compatible endpoint for Claude-style message clients
POST /chat/completions: OpenAI-compatible chat generation endpoint for existing chat clients
POST /completions: legacy text completions endpoint, without stream support in this compatibility layer

Base URL

https://api.crun.ai/api/v1

Authentication

You can authenticate in either of these ways:

Authorization: Bearer YOUR_API_KEY
X-API-KEY: YOUR_API_KEY

For official OpenAI SDKs, we recommend using Authorization: Bearer YOUR_API_KEY.

Keep your API key secure. Never expose it in client-side code or public repositories.
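
As a minimal sketch of keeping the key out of source code, the helper below reads it from an environment variable and builds either of the two header forms listed above. The variable name CRUN_API_KEY and the function name build_auth_headers are illustrative assumptions, not part of the CRUN API.

```python
import os


def build_auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Return request headers for one of the two documented auth styles."""
    if not api_key:
        raise ValueError("API key is empty")
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"X-API-KEY": api_key}
    raise ValueError(f"unknown auth scheme: {scheme}")


# CRUN_API_KEY is a hypothetical variable name; use whatever your deployment defines.
api_key = os.environ.get("CRUN_API_KEY", "YOUR_API_KEY")
headers = build_auth_headers(api_key)
```

The resulting dictionary can be passed as the headers argument of any HTTP client, such as the requests call shown in Step 1.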

Step 1: List Available Models

The exact model catalog may change over time. Always call /models to retrieve the latest public model IDs before integrating or switching models.

curl -X GET "https://api.crun.ai/api/v1/models" \
  -H "Authorization: Bearer YOUR_API_KEY"

import requests

response = requests.get(
    "https://api.crun.ai/api/v1/models",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)

print(response.json())

const response = await fetch("https://api.crun.ai/api/v1/models", {
  headers: {
    Authorization: "Bearer YOUR_API_KEY"
  }
});

console.log(await response.json());

Example Response

{
  "object": "list",
  "data": [
    {
      "id": "gpt-5.4",
      "object": "model",
      "created": 1772294400,
      "owned_by": "OpenAI"
    },
    {
      "id": "claude-sonnet-4-5",
      "object": "model",
      "created": 1772294400,
      "owned_by": "Anthropic"
    },
    {
      "id": "gemini-3.1-pro-preview",
      "object": "model",
      "created": 1772294400,
      "owned_by": "Google"
    }
  ]
}
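
Because the catalog can change, it may help to check a model ID against the /models response before using it. The sketch below works on an already-parsed response body shaped like the example above; the helper names list_model_ids and pick_model are hypothetical.

```python
def list_model_ids(models_response: dict) -> list:
    """Extract model IDs from a parsed /models response body."""
    return [item["id"] for item in models_response.get("data", [])]


def pick_model(models_response: dict, preferred: list) -> str:
    """Return the first preferred model ID that is currently listed."""
    available = set(list_model_ids(models_response))
    for model_id in preferred:
        if model_id in available:
            return model_id
    raise LookupError("none of the preferred models are available")


# Trimmed copy of the example response, with one model left out on purpose.
sample = {
    "object": "list",
    "data": [
        {"id": "gpt-5.4", "object": "model", "owned_by": "OpenAI"},
        {"id": "claude-sonnet-4-5", "object": "model", "owned_by": "Anthropic"},
    ],
}

# The first preference is missing from the sample, so the second one is chosen.
print(pick_model(sample, ["gemini-3.1-pro-preview", "claude-sonnet-4-5"]))
```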

Step 2: Make Your First Chat Completion

Use /chat/completions when you already have an OpenAI Chat Completions client or want the classic messages and choices response shape. It follows the OpenAI Chat Completions request and response structure, so existing SDK integrations usually only need a base URL change.

curl -X POST "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "system",
        "content": "You are a concise assistant."
      },
      {
        "role": "user",
        "content": "Write a two-line product description for a wireless keyboard."
      }
    ],
    "temperature": 0.7
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.crun.ai/api/v1",
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Write a two-line product description for a wireless keyboard."},
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.crun.ai/api/v1"
});

const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: "Write a two-line product description for a wireless keyboard." }
  ],
  temperature: 0.7
});

console.log(response.choices[0].message.content);

Example Response

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1772294400,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Slim, quiet, and built for all-day comfort.\nA reliable wireless keyboard for focused work at home or on the go."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 21,
    "total_tokens": 45
  }
}
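
If you call the endpoint without an SDK, you will be reading this JSON directly. The sketch below pulls out the fields most callers need from a parsed body. It assumes the response follows the OpenAI Chat Completions shape shown above, where a finish_reason of "length" means the output hit the token limit; summarize_completion is a hypothetical helper name.

```python
def summarize_completion(body: dict) -> dict:
    """Pick the text, truncation flag, and token total out of a chat completion."""
    choice = body["choices"][0]
    usage = body.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        # "length" is the OpenAI convention for output cut off by the token limit.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


# Shortened version of the example response above.
example = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Slim, quiet, and built for all-day comfort."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 21, "total_tokens": 45},
}
print(summarize_completion(example))
```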

Step 3: Use the Responses API

Use /responses for new integrations that benefit from OpenAI’s newer Responses shape: top-level instructions, flexible input, previous_response_id for stateful follow-ups, text.format for structured outputs, and tools when supported by the selected upstream model.

curl -X POST "https://api.crun.ai/api/v1/responses" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "instructions": "You are a concise assistant.",
    "input": "Write a two-line product description for a wireless keyboard.",
    "temperature": 0.7
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.crun.ai/api/v1",
)

response = client.responses.create(
    model="gpt-5.4",
    instructions="You are a concise assistant.",
    input="Write a two-line product description for a wireless keyboard.",
    temperature=0.7,
)

print(response.output_text)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.crun.ai/api/v1"
});

const response = await client.responses.create({
  model: "gpt-5.4",
  instructions: "You are a concise assistant.",
  input: "Write a two-line product description for a wireless keyboard.",
  temperature: 0.7
});

console.log(response.output_text);

Example Response

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1772294400,
  "status": "completed",
  "model": "gpt-5.4",
  "output": [
    {
      "id": "msg_abc123",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Slim, quiet, and built for all-day comfort.\nA reliable wireless keyboard for focused work at home or on the go."
        }
      ]
    }
  ],
  "output_text": "Slim, quiet, and built for all-day comfort.\nA reliable wireless keyboard for focused work at home or on the go.",
  "usage": {
    "input_tokens": 20,
    "output_tokens": 21,
    "total_tokens": 41
  }
}
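
The output array can hold several items, so clients that parse the raw JSON may want a fallback for when the top-level output_text field is absent. The sketch below is one way to do that, assuming the item and content types shown in the example response; collect_output_text is a hypothetical helper name.

```python
def collect_output_text(body: dict) -> str:
    """Return the response text, preferring output_text and falling back to output items."""
    if isinstance(body.get("output_text"), str):
        return body["output_text"]
    parts = []
    for item in body.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Example body without the top-level convenience field.
raw = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Slim, quiet, and built for all-day comfort."}],
        }
    ]
}
print(collect_output_text(raw))
```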

Managing Conversation History

When you use /chat/completions, CRUN does not store or manage conversation state for you. To have the model remember previous turns, your application must keep the relevant message history and resend it with each new request.

curl -X POST "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "system",
        "content": "You are a concise travel assistant."
      },
      {
        "role": "user",
        "content": "I will spend 2 days in Tokyo. Build a simple itinerary."
      },
      {
        "role": "assistant",
        "content": "Day 1: Asakusa and Ueno. Day 2: Shibuya and Meiji Shrine."
      },
      {
        "role": "user",
        "content": "Update it for rainy weather and add indoor options."
      }
    ],
    "temperature": 0.7
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.crun.ai/api/v1",
)

conversation_history = [
    {"role": "system", "content": "You are a concise travel assistant."},
    {"role": "user", "content": "I will spend 2 days in Tokyo. Build a simple itinerary."},
    {"role": "assistant", "content": "Day 1: Asakusa and Ueno. Day 2: Shibuya and Meiji Shrine."},
]

conversation_history.append(
    {"role": "user", "content": "Update it for rainy weather and add indoor options."}
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=conversation_history,
    temperature=0.7,
)

print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.crun.ai/api/v1"
});

const conversationHistory = [
  { role: "system", content: "You are a concise travel assistant." },
  { role: "user", content: "I will spend 2 days in Tokyo. Build a simple itinerary." },
  { role: "assistant", content: "Day 1: Asakusa and Ueno. Day 2: Shibuya and Meiji Shrine." },
  { role: "user", content: "Update it for rainy weather and add indoor options." }
];

const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: conversationHistory,
  temperature: 0.7
});

console.log(response.choices[0].message.content);
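
Since the full history is resent on every request, long conversations grow the prompt with each turn. The sketch below shows one possible way to keep the system prompt while retaining only the most recent messages. The Conversation class and its max_messages parameter are hypothetical, and trimming by message count is a simplification; a production client would more likely trim by token count.

```python
class Conversation:
    """Client-side message history with a simple size cap."""

    def __init__(self, system_prompt: str, max_messages: int = 10):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self.system = {"role": "system", "content": system_prompt}
        self.history = []
        self.max_messages = max_messages

    def add(self, role: str, content: str) -> None:
        """Append a user or assistant message and drop the oldest ones if needed."""
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role}")
        self.history.append({"role": role, "content": content})
        self.history = self.history[-self.max_messages:]

    def messages(self) -> list:
        """Return the list to send as the messages field of the next request."""
        return [self.system] + list(self.history)


chat = Conversation("You are a concise travel assistant.", max_messages=4)
chat.add("user", "I will spend 2 days in Tokyo. Build a simple itinerary.")
chat.add("assistant", "Day 1: Asakusa and Ueno. Day 2: Shibuya and Meiji Shrine.")
chat.add("user", "Update it for rainy weather and add indoor options.")
print(len(chat.messages()))
```

The list returned by messages() can be passed directly as the messages argument of client.chat.completions.create.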

Streaming

Set stream=true to receive Server-Sent Events from /chat/completions. If stream_options.include_usage is not provided, CRUN enables it automatically so the stream can include usage information near the end.

curl -N "https://api.crun.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Write a two-line product description for a wireless keyboard."}
    ]
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.crun.ai/api/v1",
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Write a two-line product description for a wireless keyboard."}
    ],
    stream=True,
    temperature=0.7,
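)

# Some chunks (for example the final usage chunk) carry no text, so skip empty deltas.
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)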

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.crun.ai/api/v1",
});

async function main() {
  const stream = await client.chat.completions.create({
    model: "gpt-5.4",
    messages: [
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: "Write a two-line product description for a wireless keyboard." }
    ],
    temperature: 0.7,
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices?.[0]?.delta?.content;
    if (content) {
      process.stdout.write(content);
    }
  }
}

main().catch(console.error);

Example Response

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"S"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"le"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"ek"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":","}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" responsive"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" wireless"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" keyboard"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" designed"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" for"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" comfortable"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" typing"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" and"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" clutter"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"-free"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" productivity"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"  \n"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"Enjoy"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" reliable"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" connectivity"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":","}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" long"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" battery"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" life"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":","}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" and"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" a"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" compact"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" design"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" perfect"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" for"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" home"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" or"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" office"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":" use"}}]}

data: {"id":"","object":"chat.completion.chunk","created":0,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}

data: {"id":"resp_0c628e55cfc393b50069cddfec989c81939cd408ac45285daa","object":"chat.completion.chunk","created":1775099884,"model":"gpt-5.4","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}],"usage":{"prompt_tokens":17,"completion_tokens":40,"total_tokens":57,"prompt_tokens_details":{"cached_tokens":0,"audio_tokens":0,"text_tokens":0,"image_tokens":0},"completion_tokens_details":{"reasoning_tokens":0,"audio_tokens":0,"accepted_prediction_tokens":0,"rejected_prediction_tokens":0,"text_tokens":0,"cached_tokens":0}}}

data: [DONE]
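
For clients that read the raw event stream instead of using an SDK, the lines above can be handled roughly as follows. This is a sketch that assumes each event is a single data: line as in the example; read_sse_stream is a hypothetical helper name, and the sample lines are shortened copies of the chunks shown above.

```python
import json


def read_sse_stream(lines):
    """Accumulate text and usage from chat.completion.chunk SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


sample_lines = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":"S"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":"le"}}]}',
    "",
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":"ek"},"finish_reason":"stop"}],"usage":{"prompt_tokens":17,"completion_tokens":40,"total_tokens":57}}',
    "",
    "data: [DONE]",
]

text, usage = read_sse_stream(sample_lines)
print(text, usage["total_tokens"])
```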

Notes

max_tokens and max_completion_tokens are both supported. The actual value is capped by the selected model’s output limit.
max_output_tokens is supported on /responses and is capped by the selected model’s output limit.
Additional OpenAI-compatible fields are accepted and passed through when supported by the upstream model.
Billing is based on input and output token usage, with a model-specific minimum charge. The minimum value is usually 0.01.
For new agentic or stateful work, prefer /responses. For existing chat clients, /chat/completions remains supported. The legacy /completions endpoint is only provided for compatibility and does not support stream=true.
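
Two of the notes above can be checked on the client before a request is sent. The sketch below mirrors the documented token cap and the streaming restriction on /completions. Both helper names are hypothetical, and the limit of 8192 is a made-up value for illustration, not a real model limit.

```python
def effective_max_tokens(requested: int, model_output_limit: int) -> int:
    """Mirror the documented cap: the smaller of the request and the model limit."""
    if requested < 1:
        raise ValueError("requested must be at least 1")
    return min(requested, model_output_limit)


def check_stream_allowed(endpoint: str, stream: bool) -> None:
    """Reject stream=true on the legacy /completions endpoint before sending."""
    if stream and endpoint.rstrip("/") == "/completions":
        raise ValueError("/completions does not support stream=true")


# 8192 is a hypothetical output limit used only for this example.
print(effective_max_tokens(20000, 8192))  # prints 8192
check_stream_allowed("/chat/completions", stream=True)  # passes silently
```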

Related Resources

Responses API

Build with flexible input, stateful follow-ups, structured outputs, and tool-ready response items.

Messages API

Use Anthropic-compatible messages, system prompts, tools, thinking, and streaming events.

Chat Completions API

Full API reference for request parameters, streaming, and OpenAI-style error responses.

Account Credits

Check your remaining credits before running large-volume language model workloads.

Pricing

Jump to the pricing page to compare billing across different models.
