Observability & request IDs

Every request to the Routic API is assigned a unique request ID. Understanding how to use it helps you debug issues faster and communicate effectively with support.

What is a request ID?

A request ID is a unique string that identifies a single API request as it flows through the system. It connects your client-side error with the server-side logs and billing records.

Format: req_ followed by 32 hexadecimal characters (e.g., req_a1b2c3d4e5f6789012345678abcdef01).

How to get a request ID

From the response header

Every API response includes the X-Request-Id header:

HTTP/1.1 200 OK
X-Request-Id: req_a1b2c3d4e5f6789012345678abcdef01
Content-Type: application/json

From the error payload

Error responses include request_id in the JSON body:

{
  "error": {
    "message": "RPM limit exceeded. Try again in 5 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "request_id": "req_a1b2c3d4e5f6789012345678abcdef01"
  }
}

Provide your own request ID

You can pass a X-Request-Id header in your request. Routic will use your value and echo it back in the response. This is useful for correlating your internal tracing with Routic's logs.

import openai

client = openai.OpenAI(
    base_url="https://api.routic.ai/v1",
    api_key="sk-xxxxxxxx",
    default_headers={"X-Request-Id": "my-trace-12345"},
)

If you don't provide one, Routic generates one automatically.


How request IDs flow through the system

Your app → Routic API → Gateway → Model provider
   │            │           │           │
   └─ X-Request-Id ─┴── logs ──┴── spend log ──┘
                                    │
                              usage_records
                          (gateway_request_id)
  1. Client sends the request (with or without X-Request-Id)
  2. Routic API assigns or preserves the ID, logs it with the access log
  3. Gateway processes the request and records the ID in spend logs
  4. Usage sync writes the ID as gateway_request_id into usage records
  5. Billing links usage records to your account via this ID

This means the same request ID can be used to trace:

  • API access logs (method, path, status, duration)
  • Gateway spend logs (model, tokens, cost)
  • Your billing records (usage, charges)

Debugging with request IDs

When to use a request ID

ScenarioWhat to do
Unexpected errorCopy the request_id from the error response and share with support.
Missing usageProvide the request_id and approximate time — support can trace whether it was recorded.
Billing discrepancyShare the request_id so support can compare gateway cost vs. your charge.
Latency investigationShare the request_id — support can check the server-side duration.

Best practices

  • Always log request_id from every response, not just errors. This lets you trace successful requests too.
  • Log the timestamp alongside the request ID. Time context speeds up log searches.
  • Pass your own X-Request-Id if you have an internal tracing system (e.g., OpenTelemetry trace ID).
import logging

logger = logging.getLogger("routic")

def call_api(messages):
    response = client.chat.completions.create(
        model="deepseek-r1",
        messages=messages,
    )
    rid = response._request_id  # or from raw response headers
    logger.info(f"request_id={rid} model={response.model} tokens={response.usage.total_tokens}")
    return response

Monitoring

Health check

The API exposes a health endpoint:

GET https://api.routic.ai/healthz

Returns 200 OK when the service is running. This endpoint does not check database or cache connectivity — it only confirms the API process is alive.

What you can monitor

MetricHow to observe
API availabilityPoll GET /healthz at regular intervals.
Request latencyCheck X-Request-Id + timestamp in your logs.
Token consumptionInspect usage in each response, or check dashboard.
Rate limit proximityWatch for 429 responses — you're near the limit.
Error rateTrack non-200 status codes across requests.

Alerting suggestions

  • 429 rate: If > 5% of requests return 429, consider requesting higher limits.
  • 5xx rate: If > 1% of requests return 500/503, contact support.
  • Latency spike: If p99 latency exceeds 30 seconds, check model provider status.

See also