Chat completions
The Chat Completions endpoint is the primary way to generate text responses. Send a list of messages, get a model-generated response back.
Endpoint
POST https://api.routic.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer sk-xxxxxxxx
Replace the domain with your assigned Base URL.
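As a minimal sketch of how the request line and headers above fit together, the following builds (but does not send) the POST request with Python's standard library. The helper name build_request and the BASE_URL constant are illustrative assumptions, not part of the Routic API; substitute your assigned Base URL and a real key.

```python
import json
import urllib.request

# Assumption: base URL copied from the example above; replace with your assigned Base URL.
BASE_URL = "https://api.routic.ai/v1"


def build_request(api_key, payload, base_url=BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_request(
    "sk-xxxxxxxx",  # placeholder key
    {"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello"}]},
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access.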
Request body
Required parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | The model name or enabled routing alias (e.g., deepseek-r1, auto/reasoning). See the Model catalog. |
| messages | array | A list of messages comprising the conversation. Each message must have a role (system, user, or assistant) and content (string). |
Optional parameters
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| stream | boolean | false | - | If true, the response is streamed as server-sent events. |
| temperature | number | 1.0 | 0–2 | Sampling temperature. Lower values make output more predictable; 0 for most predictable, 2 for most random. |
| max_tokens | integer | Model default | - | Maximum number of tokens to generate. |
| top_p | number | 1.0 | 0–1 | Nucleus sampling probability threshold. |
| frequency_penalty | number | 0 | -2–2 | Penalize tokens based on their frequency in the text so far. |
| presence_penalty | number | 0 | -2–2 | Penalize tokens based on whether they appear in the text so far. |
| stop | array / string / null | null | - | Up to 4 sequences where the API will stop generating further tokens. |
| tools | array | null | - | A list of tool definitions for function calling. See Tool calls. |
| tool_choice | string / object | "none" | - | Controls which (if any) tool is called. "none" = no tool, "auto" = model decides, or specify a tool. |
| response_format | object | { "type": "text" } | - | Set { "type": "json_object" } to enable JSON output mode. See JSON output. |
| thinking | object | null | - | Set { "type": "enabled" } to enable Thinking Mode on reasoning models. See Thinking Mode. |
| logprobs | boolean | false | - | Whether to return log probabilities of the output tokens. |
| top_logprobs | integer | null | 0–20 | Number of most likely tokens to return probabilities for at each token position. |
| stream_options | object | null | - | Options for streaming response. Only set when stream: true. |
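The ranges in the table above can be checked on the client before a request is sent. The sketch below is one way to do that; the function name check_optional_params is hypothetical, the bounds are assumed to be inclusive, and nothing here describes how the server itself validates or rejects out-of-range values.

```python
# Ranges copied from the table above as (min, max); assumed inclusive.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "top_logprobs": (0, 20),
}
MAX_STOP_SEQUENCES = 4


def check_optional_params(params):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}..{high}")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > MAX_STOP_SEQUENCES:
        problems.append(f"stop has {len(stop)} sequences; at most {MAX_STOP_SEQUENCES} are allowed")
    # The table says stream_options should only be set when stream is true.
    if params.get("stream_options") is not None and not params.get("stream"):
        problems.append("stream_options is set but stream is not true")
    return problems


print(check_optional_params({"temperature": 0.7, "max_tokens": 500}))
print(check_optional_params({"temperature": 2.5, "stop": ["a", "b", "c", "d", "e"]}))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue at once.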
Messages format
Each message in the messages array must have:
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | One of system, user, or assistant. |
| content | string | Yes | The text content of the message. |
System message example (optional, guides behavior):
{ "role": "system", "content": "You are a helpful coding assistant." }
User message example:
{ "role": "user", "content": "Explain how JWT authentication works." }
Assistant message example (from previous responses):
{ "role": "assistant", "content": "JWT authentication works by..." }
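To show how the three message types combine into one multi-turn conversation, here is a small sketch that assembles a messages list and checks it against the field table above. The helper validate_messages is hypothetical, and the final follow-up question is made up for illustration.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if a message lacks a valid role or string content."""
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: role must be one of {sorted(ALLOWED_ROLES)}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
    return messages


# Prior assistant replies are sent back so the model sees the whole conversation.
conversation = validate_messages([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Explain how JWT authentication works."},
    {"role": "assistant", "content": "JWT authentication works by..."},
    {"role": "user", "content": "How do refresh tokens fit in?"},
])
print(len(conversation))
```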
Response body (non-streaming)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713593400,
  "model": "deepseek-r1",
  "system_fingerprint": "fp_xxxxx",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a detailed explanation...",
        "reasoning_content": "..." // Present when thinking mode is enabled on reasoning models
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 120,
    "total_tokens": 145,
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 25
  }
}
Response fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for this completion. |
| object | string | Always "chat.completion". |
| created | integer | Unix timestamp of when the completion was created. |
| model | string | The model that generated the response. |
| system_fingerprint | string | Represents the backend configuration fingerprint. |
| choices | array | List of completion choices. Usually contains one item. |
| choices[].index | integer | Index of the choice in the list. |
| choices[].message | object | The generated message with role and content. |
| choices[].message.reasoning_content | string | Chain-of-thought content (only when thinking mode is enabled). |
| choices[].finish_reason | string | Reason the model stopped: "stop", "length", "tool_calls", or "content_filter". |
| usage | object | Token usage statistics. |
| usage.prompt_tokens | integer | Number of tokens in the prompt. |
| usage.completion_tokens | integer | Number of tokens in the generated response. |
| usage.total_tokens | integer | Total tokens (prompt + completion). |
| usage.prompt_cache_hit_tokens | integer | Tokens served from cache (billed at lower rate). |
| usage.prompt_cache_miss_tokens | integer | Tokens not served from cache (billed at normal rate). |
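The fields above can be read from a parsed response as in the following sketch. The sample dictionary mirrors the example response (without reasoning_content), and the function name summarize_completion and the derived cache_hit_ratio value are illustrative assumptions, not fields returned by the API.

```python
# Sample mirroring the example response above, already parsed from JSON.
sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Here's a detailed explanation..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 25,
        "completion_tokens": 120,
        "total_tokens": 145,
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 25,
    },
}


def summarize_completion(response):
    """Pull the commonly needed fields out of a non-streaming response."""
    choice = response["choices"][0]
    message = choice["message"]
    usage = response["usage"]
    hits = usage.get("prompt_cache_hit_tokens", 0)
    misses = usage.get("prompt_cache_miss_tokens", 0)
    cached_total = hits + misses
    return {
        "text": message["content"],
        # reasoning_content is only present when thinking mode is enabled.
        "reasoning": message.get("reasoning_content"),
        "finish_reason": choice["finish_reason"],
        "tokens_add_up": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
        "cache_hit_ratio": hits / cached_total if cached_total else 0.0,
    }


summary = summarize_completion(sample_response)
print(summary["text"], summary["finish_reason"])
```

Reading reasoning_content with .get keeps the same code path working whether or not thinking mode is enabled.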
Streaming response
When stream: true, the server sends events as the model generates tokens. Each event is a text/event-stream message with the following format:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"content":"'s"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"content":" a"},"finish_reason":null}]}
data: [DONE]
Streaming delta format
Each chunk contains:
| Field | Description |
|---|---|
| delta.role | The role of the message (only in the first chunk). |
| delta.content | The incremental text generated. Concatenate all chunks to get the full response. |
| delta.reasoning_content | Incremental thinking content (when thinking mode is enabled). |
| finish_reason | null while generating, then "stop" / "length" / "tool_calls" on the final chunk. |
The stream ends with data: [DONE].
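The concatenation rule above can be sketched as a small parser over the raw event lines. The sample chunks are abridged from the example stream (only the choices field is kept), and the last chunk with an empty delta and finish_reason "stop" is an assumed shape for the final chunk. The function name collect_stream is hypothetical.

```python
import json


def collect_stream(lines):
    """Concatenate delta.content from SSE data lines until the [DONE] sentinel."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            parts.append(delta.get("content") or "")
            # finish_reason stays null until the final chunk.
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample_lines = """\
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"finish_reason":null}]}
data: {"choices":[{"index":0,"delta":{"content":"'s"},"finish_reason":null}]}
data: {"choices":[{"index":0,"delta":{"content":" a"},"finish_reason":null}]}
data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
""".splitlines()

text, reason = collect_stream(sample_lines)
print(repr(text), reason)
```

The same loop could accumulate delta.reasoning_content into a second list when thinking mode is enabled.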
Code examples
cURL (basic)
curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "deepseek-r1",
    "messages": [
      { "role": "user", "content": "Explain the difference between REST and GraphQL." }
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
cURL (streaming)
curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "deepseek-v3",
    "messages": [
      { "role": "user", "content": "Write a haiku about coding." }
    ],
    "stream": true
  }'
cURL (smart routing name)
curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "auto/reasoning",
    "messages": [
      { "role": "user", "content": "Analyze this business requirement and suggest next steps." }
    ]
  }'
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routic.ai/v1",
    api_key="sk-xxxxxxxx",
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[
        {"role": "user", "content": "Explain the difference between REST and GraphQL."}
    ],
    temperature=0.7,
    max_tokens=500,
)

print(response.choices[0].message.content)
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.routic.ai/v1",
  apiKey: "sk-xxxxxxxx",
});

const response = await client.chat.completions.create({
  model: "deepseek-r1",
  messages: [
    {
      role: "user",
      content: "Explain the difference between REST and GraphQL.",
    },
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(response.choices[0].message.content);
Two calling styles
Style 1: Canonical model name (recommended)
Use the industry-standard model name for precise control:
{ "model": "deepseek-r1", "messages": [...] }
Style 2: Smart routing name
Use a Routic-managed routing identifier. Routic automatically picks the best model for that capability:
{ "model": "auto/reasoning", "messages": [...] }
Both work on the same endpoint. See the Model catalog for details.
Migration from other platforms
If you're already using OpenAI, OpenRouter, or another compatible gateway, migration requires only 3 steps:
- Change base_url to your Routic endpoint.
- Change the API key to your Routic API key.
- Update the model parameter to a Routic-supported canonical name.
No changes to request structure, message format, or response parsing are required.
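The three steps can be pictured as a change to three settings while everything else stays as it was. The sketch below uses a plain dictionary to stand in for client configuration. The function name migrate_settings, the "before" values, and the settings layout are all hypothetical placeholders.

```python
def migrate_settings(settings, routic_base_url, routic_api_key, routic_model):
    """Return a copy of the settings with only the three migration fields changed."""
    migrated = dict(settings)
    migrated["base_url"] = routic_base_url  # step 1: point at your Routic endpoint
    migrated["api_key"] = routic_api_key    # step 2: use your Routic API key
    migrated["model"] = routic_model        # step 3: use a Routic-supported canonical name
    return migrated


# Hypothetical existing configuration for another compatible gateway.
before = {
    "base_url": "https://gateway.example.com/v1",
    "api_key": "sk-previous-key",
    "model": "previous-model-name",
    "temperature": 0.7,
    "max_tokens": 500,
}

after = migrate_settings(before, "https://api.routic.ai/v1", "sk-xxxxxxxx", "deepseek-r1")
changed = sorted(key for key in after if after[key] != before[key])
print(changed)
```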