Chat completions

The Chat Completions endpoint is the primary way to generate text responses. Send a list of messages and receive a model-generated response.

Endpoint

POST https://api.routic.ai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer sk-xxxxxxxx

Replace the domain with your assigned Base URL.
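The request line and headers above can be assembled with the Python standard library alone. The following is a minimal sketch, not official client code: the helper name `build_chat_request` is hypothetical, and it assumes the assigned Base URL already ends in `/v1`, as in the SDK examples in this document. It only builds the request; sending it is left to the caller.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    # Assumption: base_url already includes the /v1 prefix.
    url = base_url.rstrip("/") + "/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "https://api.routic.ai/v1",
    "sk-xxxxxxxx",
    {"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello"}]},
)
# To send it: urllib.request.urlopen(request)
```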

Request body

Required parameters

Parameter	Type	Description
`model`	string	The model name or an enabled routing alias (e.g., `deepseek-r1`, `auto/reasoning`). See the Model catalog.
`messages`	array	A list of messages comprising the conversation. Each message must have a `role` (`system`, `user`, or `assistant`) and `content` (string).

Optional parameters

Parameter	Type	Default	Range	Description
`stream`	boolean	`false`	-	If `true`, the response is streamed as server-sent events.
`temperature`	number	`1.0`	0–2	Sampling temperature. Lower values make output more predictable: `0` is the most predictable, `2` the most random.
`max_tokens`	integer	Model default	-	Maximum number of tokens to generate.
`top_p`	number	`1.0`	0–1	Nucleus sampling probability threshold.
`frequency_penalty`	number	`0`	-2 to 2	Penalizes tokens based on their frequency in the text so far.
`presence_penalty`	number	`0`	-2 to 2	Penalizes tokens based on whether they appear in the text so far.
`stop`	array / string / null	`null`	-	Up to 4 sequences where the API will stop generating further tokens.
`tools`	array	`null`	-	A list of tool definitions for function calling. See Tool calls.
`tool_choice`	string / object	`"none"`	-	Controls which (if any) tool is called. `"none"` = no tool, `"auto"` = model decides, or specify a tool.
`response_format`	object	`{ "type": "text" }`	-	Set `{ "type": "json_object" }` to enable JSON output mode. See JSON output.
`thinking`	object	`null`	-	Set `{ "type": "enabled" }` to enable Thinking Mode on reasoning models. See Thinking Mode.
`logprobs`	boolean	`false`	-	Whether to return log probabilities of the output tokens.
`top_logprobs`	integer	`null`	0–20	Number of most likely tokens to return probabilities for at each token position.
`stream_options`	object	`null`	-	Options for the streaming response. Set only when `stream` is `true`.
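The ranges in the table above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `PARAM_RANGES` and `check_optional_params` are hypothetical names, the bounds are copied from the Range column and treated as inclusive, and the server remains the authority on what it accepts.

```python
# Bounds copied from the optional-parameters table (treated as inclusive).
PARAM_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "top_logprobs": (0, 20),
}


def check_optional_params(params: dict) -> list:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for name, (low, high) in PARAM_RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}..{high}")
    # The table allows up to 4 stop sequences.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    # stream_options should only be set together with stream: true.
    if params.get("stream_options") is not None and not params.get("stream"):
        problems.append("stream_options is only meaningful when stream is true")
    return problems
```

Calling `check_optional_params({"temperature": 0.7, "max_tokens": 500})` returns an empty list, since both values fall within the documented bounds.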

Messages format

Each message in the `messages` array must include the following fields:

Field	Type	Required	Description
`role`	string	Yes	One of `system`, `user`, or `assistant`.
`content`	string	Yes	The text content of the message.

System message example (optional, guides behavior):

{ "role": "system", "content": "You are a helpful coding assistant." }

User message example:

{ "role": "user", "content": "Explain how JWT authentication works." }

Assistant message example (from previous responses):

{ "role": "assistant", "content": "JWT authentication works by..." }
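The three message types above are typically combined into one ordered list for a multi-turn conversation. The following is a minimal sketch; `build_messages` is a hypothetical helper, and representing earlier turns as (user text, assistant text) pairs is an assumption made for illustration only.

```python
from typing import List, Optional, Tuple


def build_messages(
    user_input: str,
    history: Optional[List[Tuple[str, str]]] = None,
    system_prompt: Optional[str] = None,
) -> List[dict]:
    """Assemble a messages array: optional system message, prior turns, new user message."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    # Each history entry is a (user text, assistant text) pair from an earlier turn.
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    "How do refresh tokens fit in?",
    history=[("Explain how JWT authentication works.", "JWT authentication works by...")],
    system_prompt="You are a helpful coding assistant.",
)
```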

Response body (non-streaming)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713593400,
  "model": "deepseek-r1",
  "system_fingerprint": "fp_xxxxx",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a detailed explanation...",
        "reasoning_content": "..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 120,
    "total_tokens": 145,
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 25
  }
}

Response fields

Field	Type	Description
`id`	string	Unique identifier for this completion.
`object`	string	Always `"chat.completion"`.
`created`	integer	Unix timestamp of when the completion was created.
`model`	string	The model that generated the response.
`system_fingerprint`	string	Represents the backend configuration fingerprint.
`choices`	array	List of completion choices. Usually contains one item.
`choices[].index`	integer	Index of the choice in the list.
`choices[].message`	object	The generated message with `role` and `content`.
`choices[].message.reasoning_content`	string	Chain-of-thought content (only when thinking mode is enabled).
`choices[].finish_reason`	string	Reason the model stopped: `"stop"`, `"length"`, `"tool_calls"`, or `"content_filter"`.
`usage`	object	Token usage statistics.
`usage.prompt_tokens`	integer	Number of tokens in the prompt.
`usage.completion_tokens`	integer	Number of tokens in the generated response.
`usage.total_tokens`	integer	Total tokens (prompt + completion).
`usage.prompt_cache_hit_tokens`	integer	Prompt tokens served from cache (billed at a lower rate).
`usage.prompt_cache_miss_tokens`	integer	Prompt tokens not served from cache (billed at the normal rate).
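The fields above can be read from the parsed JSON body as ordinary dictionary keys. The following is a minimal sketch: `summarize_completion` is a hypothetical helper, the sample dictionary is a trimmed copy of the example response, and treating `finish_reason == "length"` as a sign of truncated output is an assumption based on the usual meaning of that value.

```python
def summarize_completion(response: dict) -> dict:
    """Pull the commonly used fields out of a non-streaming response body."""
    choice = response["choices"][0]
    usage = response["usage"]
    prompt_tokens = usage["prompt_tokens"]
    cache_hits = usage.get("prompt_cache_hit_tokens", 0)
    return {
        "text": choice["message"]["content"],
        # reasoning_content is only present when thinking mode is enabled.
        "reasoning": choice["message"].get("reasoning_content"),
        # Assumption: "length" means generation stopped at the token limit.
        "truncated": choice["finish_reason"] == "length",
        "cache_hit_ratio": cache_hits / prompt_tokens if prompt_tokens else 0.0,
        "totals_consistent": usage["total_tokens"]
        == prompt_tokens + usage["completion_tokens"],
    }


sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Here's a detailed explanation..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 25,
        "completion_tokens": 120,
        "total_tokens": 145,
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 25,
    },
}
summary = summarize_completion(sample)
```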

Streaming response

When `stream` is `true`, the server sends server-sent events (`text/event-stream`) as the model generates tokens. Each event is a `data:` line carrying one JSON chunk, and events are separated by a blank line:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"content":"'s"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1713593400,"model":"deepseek-r1","choices":[{"index":0,"delta":{"content":" a"},"finish_reason":null}]}

data: [DONE]

Streaming delta format

Each chunk contains:

Field	Description
`delta.role`	The role of the message (only in the first chunk).
`delta.content`	The incremental text generated. Concatenate the `delta.content` values from all chunks to get the full response.
`delta.reasoning_content`	Incremental thinking content (when thinking mode is enabled).
`finish_reason`	`null` while generating, then `"stop"` / `"length"` / `"tool_calls"` on the final chunk.

The stream ends with `data: [DONE]`.
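The event format above can be consumed line by line without an SDK. The following is a minimal sketch: `collect_stream_text` is a hypothetical helper, and the sample lines are shortened versions of the chunks shown earlier (only the `choices` field is kept). It skips blank separator lines, stops at `[DONE]`, and joins the `delta.content` fragments.

```python
import json


def collect_stream_text(lines) -> str:
    """Concatenate delta.content from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"\'s"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" a"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(sample_lines))  # prints: Here's a
```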

Code examples

cURL (basic)

curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "deepseek-r1",
    "messages": [
      { "role": "user", "content": "Explain the difference between REST and GraphQL." }
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

cURL (streaming)

curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "deepseek-v3",
    "messages": [
      { "role": "user", "content": "Write a haiku about coding." }
    ],
    "stream": true
  }'

cURL (smart routing name)

curl -X POST "https://api.routic.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "auto/reasoning",
    "messages": [
      { "role": "user", "content": "Analyze this business requirement and suggest next steps." }
    ]
  }'

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.routic.ai/v1",
    api_key="sk-xxxxxxxx",
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[
        {"role": "user", "content": "Explain the difference between REST and GraphQL."}
    ],
    temperature=0.7,
    max_tokens=500,
)

print(response.choices[0].message.content)
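Streaming can be handled through the same SDK client. The following is a minimal sketch: `stream_text` is a hypothetical helper, and it assumes the OpenAI Python SDK behavior in which `create(..., stream=True)` returns an iterable of chunks whose `choices[0].delta` carries the fields described under Streaming delta format. The client is passed in as an argument, so the function itself makes no network call until it is iterated.

```python
def stream_text(client, model: str, messages: list):
    """Yield text fragments from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # defensive: skip chunks that carry no choices
        delta = chunk.choices[0].delta
        content = getattr(delta, "content", None)
        if content:
            yield content


# Usage (requires the openai package and network access):
# from openai import OpenAI
# client = OpenAI(base_url="https://api.routic.ai/v1", api_key="sk-xxxxxxxx")
# for piece in stream_text(client, "deepseek-v3",
#                          [{"role": "user", "content": "Write a haiku about coding."}]):
#     print(piece, end="", flush=True)
```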

Node.js (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.routic.ai/v1",
  apiKey: "sk-xxxxxxxx",
});

const response = await client.chat.completions.create({
  model: "deepseek-r1",
  messages: [
    {
      role: "user",
      content: "Explain the difference between REST and GraphQL.",
    },
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(response.choices[0].message.content);

Two calling styles

Style 1: Canonical model name (recommended)

Use the industry-standard model name for precise control:

{ "model": "deepseek-r1", "messages": [...] }

Style 2: Smart routing name

Use a Routic-managed routing identifier, and Routic automatically selects a model for that capability:

{ "model": "auto/reasoning", "messages": [...] }

Both work on the same endpoint. See the Model catalog for details.

Migration from other platforms

If you're already using OpenAI, OpenRouter, or another compatible gateway, migration requires only three steps:

1. Change `base_url` to your Routic endpoint.
2. Change the API key to your Routic API Key.
3. Update the `model` parameter to a Routic-supported canonical name.

No changes to request structure, message format, or response parsing are required.
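The migration steps can be kept out of application code by reading the endpoint and key from the environment. The following is a minimal sketch under stated assumptions: the environment variable names `ROUTIC_BASE_URL` and `ROUTIC_API_KEY` and both helper names are hypothetical, chosen only for illustration.

```python
import os


def routic_client_kwargs(env=None) -> dict:
    """Return base_url and api_key settings for an OpenAI-compatible client."""
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names; use whatever your deployment defines.
        "base_url": env.get("ROUTIC_BASE_URL", "https://api.routic.ai/v1"),
        "api_key": env["ROUTIC_API_KEY"],
    }


def migrate_request(old_request: dict, routic_model: str) -> dict:
    """Copy an existing request body, changing only the model name."""
    migrated = dict(old_request)
    migrated["model"] = routic_model
    return migrated
```

For example, `migrate_request(existing_body, "deepseek-r1")` returns a copy of `existing_body` in which only `model` differs, which mirrors the point that the request structure and message format stay the same.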