Observability & request IDs
Every request to the Routic API is assigned a unique request ID. Understanding how to use it helps you debug issues faster and communicate effectively with support.
What is a request ID?
A request ID is a unique string that identifies a single API request as it flows through the system. It connects your client-side error with the server-side logs and billing records.
Format: req_ followed by 32 hexadecimal characters (e.g., req_a1b2c3d4e5f6789012345678abcdef01).
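To tell a Routic-generated ID apart from one you supplied yourself, the format above can be checked with a regular expression. This is a minimal sketch: the helper name is hypothetical, and accepting uppercase hex digits is an assumption, since the example only shows lowercase.

```python
import re

# "req_" followed by exactly 32 hexadecimal characters, per the format above.
# Assumption: uppercase hex is accepted as well as lowercase.
_REQUEST_ID_RE = re.compile(r"req_[0-9a-fA-F]{32}")


def is_generated_request_id(value: str) -> bool:
    """Return True if value matches the generated request ID format."""
    return _REQUEST_ID_RE.fullmatch(value) is not None
```

A custom value such as my-trace-12345 does not match, which makes the check handy for spotting which IDs came from your own tracing system.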
How to get a request ID
From the response header
Every API response includes the X-Request-Id header:
HTTP/1.1 200 OK
X-Request-Id: req_a1b2c3d4e5f6789012345678abcdef01
Content-Type: application/json
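HTTP header names are case-insensitive, and some clients lowercase them, so a lookup for the exact spelling X-Request-Id can miss. A small sketch of a tolerant lookup follows; the function name is hypothetical and the headers are assumed to be available as a plain mapping.

```python
from typing import Mapping, Optional


def request_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Request-Id value, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-request-id":
            return value
    # No request ID header was present.
    return None
```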
From the error payload
Error responses include request_id in the JSON body:
{
  "error": {
    "message": "RPM limit exceeded. Try again in 5 seconds.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "request_id": "req_a1b2c3d4e5f6789012345678abcdef01"
  }
}
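If you handle raw response bodies, the ID can be pulled out of the error payload defensively. The sketch below assumes the body has the shape shown above; the function name is hypothetical, and it returns None rather than raising when the body is not JSON or the field is absent.

```python
import json
from typing import Optional


def request_id_from_error_body(body: str) -> Optional[str]:
    """Extract error.request_id from a JSON error body, or None if unavailable."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # Not JSON, e.g. an HTML error page from a proxy.
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    request_id = error.get("request_id")
    return request_id if isinstance(request_id, str) else None
```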
Provide your own request ID
You can pass an X-Request-Id header in your request. Routic will use your value and echo it back in the response. This is useful for correlating your internal tracing with Routic's logs.
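import openai

# default_headers applies to every request made by this client, so a fixed
# value like this one is only suitable for a single call or a short-lived client.
client = openai.OpenAI(
    base_url="https://api.routic.ai/v1",
    api_key="sk-xxxxxxxx",
    default_headers={"X-Request-Id": "my-trace-12345"},
)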
If you don't provide one, Routic generates one automatically.
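Because a request ID is meant to identify a single request, a value set once on the client is reused for every call. One way around this is to generate a fresh ID per call. The sketch below uses only the standard library; the "myapp" prefix and the ID layout are hypothetical, and it assumes Routic accepts arbitrary strings as in the my-trace-12345 example.

```python
import uuid


def new_request_id(prefix: str = "myapp") -> str:
    """Build a unique ID for one request. Prefix and layout are illustrative only."""
    return f"{prefix}-{uuid.uuid4().hex}"


def request_id_headers() -> dict:
    """Return headers carrying a freshly generated X-Request-Id."""
    return {"X-Request-Id": new_request_id()}
```

With the OpenAI Python SDK, these headers can be supplied per call through the extra_headers argument, for example client.chat.completions.create(..., extra_headers=request_id_headers()).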
How request IDs flow through the system
Your app   →    Routic API → Gateway  →  Model provider
    │                │          │               │
    └─ X-Request-Id ─┴── logs ──┴── spend log ──┘
                                │
                          usage_records
                      (gateway_request_id)
- Client sends the request (with or without X-Request-Id)
- Routic API assigns or preserves the ID, logs it with the access log
- Gateway processes the request and records the ID in spend logs
- Usage sync writes the ID as gateway_request_id into usage records
- Billing links usage records to your account via this ID
This means the same request ID can be used to trace:
- API access logs (method, path, status, duration)
- Gateway spend logs (model, tokens, cost)
- Your billing records (usage, charges)
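The request ID acts as a join key across the three record types listed above. The sketch below illustrates the idea with in-memory dictionaries. It is not Routic's implementation: the record shapes and the request_id field name on access and spend logs are assumptions, and only gateway_request_id comes from this page.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def trace_request(
    request_id: str,
    access_logs: List[Record],
    spend_logs: List[Record],
    usage_records: List[Record],
) -> Dict[str, List[Record]]:
    """Collect every record that carries the given request ID."""
    return {
        # Assumed field name "request_id" for access and spend logs.
        "access": [r for r in access_logs if r.get("request_id") == request_id],
        "spend": [r for r in spend_logs if r.get("request_id") == request_id],
        # Usage records store the same ID under gateway_request_id.
        "usage": [r for r in usage_records if r.get("gateway_request_id") == request_id],
    }
```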
Debugging with request IDs
When to use a request ID
| Scenario | What to do |
|---|---|
| Unexpected error | Copy the request_id from the error response and share it with support. |
| Missing usage | Provide the request_id and approximate time — support can trace whether it was recorded. |
| Billing discrepancy | Share the request_id so support can compare gateway cost vs. your charge. |
| Latency investigation | Share the request_id — support can check the server-side duration. |
Best practices
- Always log request_id from every response, not just errors. This lets you trace successful requests too.
- Log the timestamp alongside the request ID. Time context speeds up log searches.
- Pass your own X-Request-Id if you have an internal tracing system (e.g., OpenTelemetry trace ID).
import logging

logger = logging.getLogger("routic")

def call_api(messages):
    # `client` is the openai.OpenAI instance configured earlier on this page.
    response = client.chat.completions.create(
        model="deepseek-r1",
        messages=messages,
    )
    rid = response._request_id  # or from raw response headers
    # Pass values as arguments so formatting only happens if the record is emitted.
    logger.info(
        "request_id=%s model=%s tokens=%s",
        rid, response.model, response.usage.total_tokens,
    )
    return response
Monitoring
Health check
The API exposes a health endpoint:
GET https://api.routic.ai/healthz
Returns 200 OK when the service is running. This endpoint does not check database or cache connectivity — it only confirms the API process is alive.
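A basic availability probe against this endpoint can be written with the standard library alone. The sketch below is one possible shape, not an official client: the function name, the 5-second timeout, and the injectable opener parameter (there to make the function testable without network access) are all assumptions.

```python
import urllib.error
import urllib.request

HEALTH_URL = "https://api.routic.ai/healthz"


def is_healthy(url: str = HEALTH_URL, timeout: float = 5.0,
               opener=urllib.request.urlopen) -> bool:
    """Return True if the health endpoint answers 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, DNS failures, refused connections, timeouts.
        return False
```

Since the endpoint only confirms that the API process is alive, a True result here does not rule out problems with downstream dependencies.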
What you can monitor
| Metric | How to observe |
|---|---|
| API availability | Poll GET /healthz at regular intervals. |
| Request latency | Log the X-Request-Id and timestamp for each request, and measure the elapsed time client-side. |
| Token consumption | Inspect usage in each response, or check the dashboard. |
| Rate limit proximity | Watch for 429 responses. Each one means a limit was hit, so a rising count shows you are running at your limit. |
| Error rate | Track non-2xx status codes across requests. |
Alerting suggestions
- 429 rate: If > 5% of requests return 429, consider requesting higher limits.
- 5xx rate: If > 1% of requests return 500/503, contact support.
- Latency spike: If p99 latency exceeds 30 seconds, check model provider status.
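The thresholds above can be evaluated over a window of recorded requests. The following is a sketch under stated assumptions: the Sample type, the alert names, and the nearest-rank p99 method are illustrative choices, and the 5xx check counts every status from 500 to 599 rather than only 500 and 503.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    status: int        # HTTP status code of the response
    latency_s: float   # client-side duration in seconds


def p99(values: List[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, (99 * len(ordered) + 99) // 100)  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def evaluate_alerts(samples: List[Sample]) -> List[str]:
    """Return the names of the suggested alerts that fire for this window."""
    if not samples:
        return []
    total = len(samples)
    rate_429 = sum(s.status == 429 for s in samples) / total
    rate_5xx = sum(500 <= s.status <= 599 for s in samples) / total
    alerts = []
    if rate_429 > 0.05:
        alerts.append("429_rate")
    if rate_5xx > 0.01:
        alerts.append("5xx_rate")
    if p99([s.latency_s for s in samples]) > 30.0:
        alerts.append("latency_p99")
    return alerts
```

Small windows make the rates noisy, so evaluating over a few minutes of traffic rather than a handful of requests is usually more useful.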