Observability & request IDs

Every request to the Routic API is assigned a unique request ID. Understanding how to use it helps you debug issues faster and communicate effectively with support.

What is a request ID?

A request ID is a unique string that identifies a single API request as it flows through the system. It connects your client-side error with the server-side logs and billing records.

Format: req_ followed by 32 hexadecimal characters (e.g., req_a1b2c3d4e5f6789012345678abcdef01).

How to get a request ID

From the response header

Every API response includes the X-Request-Id header:

HTTP/1.1 200 OK
X-Request-Id: req_a1b2c3d4e5f6789012345678abcdef01
Content-Type: application/json

From the error payload

Error responses include request_id in the JSON body:

{
  "error": {
    "message": "RPM limit exceeded. Try again in 5 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "request_id": "req_a1b2c3d4e5f6789012345678abcdef01"
  }
}

Provide your own request ID

You can pass a X-Request-Id header in your request. Routic will use your value and echo it back in the response. This is useful for correlating your internal tracing with Routic's logs.

import openai

client = openai.OpenAI(
    base_url="https://api.routic.ai/v1",
    api_key="sk-xxxxxxxx",
    default_headers={"X-Request-Id": "my-trace-12345"},
)

If you don't provide one, Routic generates one automatically.

How request IDs flow through the system

Your app → Routic API → Gateway → Model provider
   │            │           │           │
   └─ X-Request-Id ─┴── logs ──┴── spend log ──┘
                                    │
                              usage_records
                          (gateway_request_id)

Client sends the request (with or without X-Request-Id)
Routic API assigns or preserves the ID, logs it with the access log
Gateway processes the request and records the ID in spend logs
Usage sync writes the ID as gateway_request_id into usage records
Billing links usage records to your account via this ID

This means the same request ID can be used to trace:

API access logs (method, path, status, duration)
Gateway spend logs (model, tokens, cost)
Your billing records (usage, charges)

Debugging with request IDs

When to use a request ID

Scenario	What to do
Unexpected error	Copy the `request_id` from the error response and share with support.
Missing usage	Provide the `request_id` and approximate time — support can trace whether it was recorded.
Billing discrepancy	Share the `request_id` so support can compare gateway cost vs. your charge.
Latency investigation	Share the `request_id` — support can check the server-side duration.

Best practices

Always log request_id from every response, not just errors. This lets you trace successful requests too.
Log the timestamp alongside the request ID. Time context speeds up log searches.
Pass your own X-Request-Id if you have an internal tracing system (e.g., OpenTelemetry trace ID).

import logging

logger = logging.getLogger("routic")

def call_api(messages):
    response = client.chat.completions.create(
        model="deepseek-r1",
        messages=messages,
    )
    rid = response._request_id  # or from raw response headers
    logger.info(f"request_id={rid} model={response.model} tokens={response.usage.total_tokens}")
    return response

Monitoring

Health check

The API exposes a health endpoint:

GET https://api.routic.ai/healthz

Returns 200 OK when the service is running. This endpoint does not check database or cache connectivity — it only confirms the API process is alive.

What you can monitor

Metric	How to observe
API availability	Poll `GET /healthz` at regular intervals.
Request latency	Check `X-Request-Id` + timestamp in your logs.
Token consumption	Inspect `usage` in each response, or check dashboard.
Rate limit proximity	Watch for 429 responses — you're near the limit.
Error rate	Track non-200 status codes across requests.

Alerting suggestions

429 rate: If > 5% of requests return 429, consider requesting higher limits.
5xx rate: If > 1% of requests return 500/503, contact support.
Latency spike: If p99 latency exceeds 30 seconds, check model provider status.