Troubleshooting
Authentication issues
401 — Authentification requise (Authentication required)
Cause: The Authorization header is missing or malformed.
Fix: Ensure your request includes the header in the correct format: the word Bearer, a single space, then your API key.
Common mistakes:
- Missing the Bearer prefix (with space)
- Extra whitespace or newline in the key
- Sending the key as a query parameter instead of a header
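As a minimal sketch of the header format described above, the helper below builds the Authorization header and guards against the whitespace mistakes listed. The function name build_auth_headers and the sample key are hypothetical placeholders, not part of the API.

```python
from typing import Dict


def build_auth_headers(api_key: str) -> Dict[str, str]:
    """Return request headers carrying the key as a Bearer token."""
    # Strip stray leading/trailing whitespace or newlines (a common copy-paste problem).
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    # Whitespace inside the key would make the header malformed.
    if any(ch.isspace() for ch in key):
        raise ValueError("API key contains whitespace")
    # "Bearer" is followed by exactly one space, then the key.
    return {"Authorization": f"Bearer {key}"}


# Placeholder key for illustration only; use your real key here.
headers = build_auth_headers("sk_live_example123\n")
print(headers)
```

Pass the resulting dictionary as the request headers of your HTTP client rather than appending the key to the URL.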
401 — API key invalide ou revoquee (API key invalid or revoked)
Cause: The API key does not match any active key in our system.
Fix:
- Verify you are using the correct key (starts with sk_live_)
- Check that the key has not been revoked; contact your account manager if in doubt
- If you lost your key, request a new one (keys are shown only once at generation)
Request issues
422 — Field required
Cause: A required field is missing from the request body.
Fix: Ensure your JSON body includes both required fields.
400 — Invalid effort value
Cause: The effort parameter has an unsupported value.
Fix: Use one of: "low", "medium", "high", "max", or omit the field entirely (defaults to auto).
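One way to avoid this error is to validate effort on the client before sending. The sketch below assumes the request body is a plain dictionary. The helper name with_effort is hypothetical, and only the four values listed above are taken from this page.

```python
from typing import Optional

# Values accepted for the effort parameter, as listed on this page.
ALLOWED_EFFORT = {"low", "medium", "high", "max"}


def with_effort(body: dict, effort: Optional[str] = None) -> dict:
    """Return a copy of body with a validated effort field.

    Passing None omits the field so the server applies its default.
    """
    payload = dict(body)
    if effort is None:
        payload.pop("effort", None)
        return payload
    if effort not in ALLOWED_EFFORT:
        raise ValueError(
            f"Unsupported effort {effort!r}; use one of {sorted(ALLOWED_EFFORT)}"
        )
    payload["effort"] = effort
    return payload
```

Note that the check is case-sensitive, so "Low" is rejected locally instead of triggering a 400 from the API.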
Empty or truncated response
Possible causes:
- Your SSE parser is not handling multi-line chunks correctly
- The connection was closed prematurely (timeout)
Fix:
- Buffer incoming chunks and split on \n\n (double newline) to separate SSE events
- Increase your HTTP client timeout: complex queries with extended_thinking can take 30-60 seconds
- Listen for the done event to confirm the stream completed
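The buffering approach described above can be sketched as follows. This is an illustration under assumptions, not the official client: it assumes the stream is decoded text that follows the standard SSE field syntax, and that completion is signalled by an event whose event field is named done. All function names here are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_sse_events(buffer: str) -> Tuple[List[str], str]:
    """Split buffered text into complete event blocks plus the unfinished remainder."""
    *complete, remainder = buffer.split("\n\n")
    return [block for block in complete if block], remainder


def parse_sse_event(block: str) -> Dict[str, str]:
    """Parse one event block into its event name and joined data lines."""
    name = "message"  # default event name in the SSE format
    data_lines: List[str] = []
    for line in block.split("\n"):
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            name = value
        elif field == "data":
            data_lines.append(value)
    return {"event": name, "data": "\n".join(data_lines)}


def consume(chunks: Iterable[str]) -> List[Dict[str, str]]:
    """Feed raw chunks through a buffer so events split across chunks survive."""
    buffer = ""
    events: List[Dict[str, str]] = []
    for chunk in chunks:
        # Normalise CRLF on the whole buffer so a pair split across chunks is handled.
        buffer = (buffer + chunk).replace("\r\n", "\n")
        blocks, buffer = split_sse_events(buffer)
        events.extend(parse_sse_event(block) for block in blocks)
    return events


# Chunks deliberately cut mid-event, as a network read might deliver them.
events = consume(["event: token\ndata: Hel", "lo\n\nevent: do", "ne\ndata: {}\n\n"])
stream_completed = any(e["event"] == "done" for e in events)
```

If stream_completed is still false when the connection closes, treat the response as truncated and retry with a longer timeout.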
Rate limiting
429 — Daily quota exceeded
Cause: Your organization has consumed its daily token allowance.
Fix:
- Wait until midnight UTC for the quota to reset
- The error message includes the exact reset time
- Contact your account manager to increase your daily limit
To reduce token usage:
- Use effort: "low" for simple questions
- Reuse session_id across turns to activate prompt caching
- Trim conversation history for long sessions
- Only enable extended_thinking when needed
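Since the quota resets at midnight UTC, a client can compute how long to pause before retrying. The sketch below is a fallback calculation only. The reset time reported in the error message should take precedence when available, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_utc_midnight(now: Optional[datetime] = None) -> float:
    """Return the number of seconds from now until the next midnight UTC.

    now should be timezone-aware; it defaults to the current UTC time.
    """
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (current + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - current).total_seconds()
```

A scheduler could sleep for this duration after a 429 response and then resume sending requests.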
Performance
Slow first response (cold start)
Cause: The API container was idle and is booting up.
How to detect: Call GET /health. If cold_start is true, the container just started.
Fix:
- The first request after a cold start may take a few extra seconds; this is normal
- Subsequent requests will be fast
- Subsequent requests will be fast
- If you need consistent low latency, schedule periodic health checks to keep the container warm
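The keep-warm idea above could look like the sketch below. It assumes GET /health returns a JSON object with a boolean cold_start field, which this page implies but does not show. The base URL, the function names, and the check interval are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def is_cold_start(health_body: str) -> bool:
    """Return True if the health response reports a cold start (assumed JSON shape)."""
    payload = json.loads(health_body)
    return payload.get("cold_start") is True


def fetch_health_over_http(base_url: str) -> str:
    """Fetch the raw /health response body; base_url is a placeholder for your API host."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as response:
        return response.read().decode("utf-8")


def keep_warm(
    fetch_health: Callable[[], str],
    interval_seconds: float,
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the health endpoint max_checks times and count observed cold starts."""
    cold_starts = 0
    for check in range(max_checks):
        if is_cold_start(fetch_health()):
            cold_starts += 1
        if check < max_checks - 1:
            sleep(interval_seconds)  # no pause after the final check
    return cold_starts
```

Passing the fetch function as a parameter keeps the loop testable without network access. In production it would be something like lambda: fetch_health_over_http(base_url), with the interval chosen to be shorter than the container's idle timeout.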
Extended thinking takes too long
Cause: extended_thinking: true adds a reasoning phase before the answer, which takes longer.
Fix:
- Only use extended_thinking for complex or cross-border questions
- For simple factual queries, leave it disabled (default false)
- Combine with effort: "low" or "medium" to limit analysis depth

