Troubleshooting

Authentication issues

Cause: The Authorization header is missing or malformed.
Fix: Ensure your request includes the header in the correct format:
Authorization: Bearer sk_live_your_key_here
Common mistakes:
  • Missing the Bearer prefix (including the space after it)
  • Extra whitespace or newline in the key
  • Sending the key as a query parameter instead of a header
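A client can guard against the mistakes listed above before sending a request. The following is a minimal Python sketch, not an official client: the helper name build_auth_header is hypothetical, and only the header format and the sk_live_ prefix are taken from this page.

```python
def build_auth_header(api_key: str) -> dict:
    """Return the Authorization header for an API key (hypothetical helper)."""
    # Strip stray whitespace and newlines, e.g. from a key read out of a file.
    key = api_key.strip()
    if key.lower().startswith("bearer "):
        # Tolerate a key that was stored with the prefix already attached.
        key = key[len("bearer "):].strip()
    if not key.startswith("sk_live_"):
        raise ValueError("API key should start with sk_live_")
    # Exactly one space between the Bearer prefix and the key.
    return {"Authorization": f"Bearer {key}"}
```

Pass the returned dictionary as request headers; do not append the key to the URL as a query parameter.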
Cause: The API key does not match any active key in our system.
Fix:
  • Verify you are using the correct key (starts with sk_live_)
  • Check that the key has not been revoked (contact your account manager to confirm)
  • If you lost your key, request a new one (keys are shown only once at generation)

Request issues

Cause: A required field is missing from the request body.
Fix: Ensure your JSON body includes both required fields:
{
  "messages": [{"role": "user", "content": "Your question"}],
  "session_id": "your-session-id"
}
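To catch this error before the request leaves your application, the body can be checked locally. This is a small Python sketch under the assumption that messages and session_id are the only required fields, as stated above; the function name missing_fields is hypothetical.

```python
REQUIRED_FIELDS = ("messages", "session_id")


def missing_fields(body: dict) -> list:
    """List required fields that are absent or empty in the request body."""
    return [name for name in REQUIRED_FIELDS if not body.get(name)]
```

An empty result means both required fields are present and non-empty.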
Cause: The effort parameter has an unsupported value.
Fix: Use one of: "low", "medium", "high", "max", or omit the field entirely (defaults to auto).
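The rule for the effort parameter can be enforced client-side. The sketch below is illustrative only: the helper with_effort is a hypothetical name, and the allowed values are the four listed above.

```python
from typing import Optional

ALLOWED_EFFORT = {"low", "medium", "high", "max"}


def with_effort(body: dict, effort: Optional[str] = None) -> dict:
    """Return a copy of the request body with a validated effort value."""
    result = dict(body)
    if effort is None:
        # Omit the field entirely so the API applies its default (auto).
        result.pop("effort", None)
        return result
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"Unsupported effort value: {effort!r}")
    result["effort"] = effort
    return result
```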
Possible causes:
  • Your SSE parser is not handling multi-line chunks correctly
  • The connection was closed prematurely (timeout)
Fix:
  • Buffer incoming chunks and split on \n\n (double newline) to separate SSE events
  • Increase your HTTP client timeout: complex queries with extended_thinking can take 30 to 60 seconds
  • Listen for the done event to confirm the stream completed
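The buffering approach described in the fix can be sketched in Python as follows. This is a minimal illustration, not the official client: the class name SSEBuffer is hypothetical, and it assumes that the done event arrives as a standard SSE "event: done" field, which this page does not spell out.

```python
class SSEBuffer:
    """Accumulates raw text chunks and returns complete SSE events."""

    def __init__(self) -> None:
        self._buffer = ""
        self.done = False  # set once a "done" event has been seen

    def feed(self, chunk: str) -> list:
        # Normalise CRLF on the whole buffer so a "\r\n" split across
        # two chunks is still handled.
        self._buffer = (self._buffer + chunk).replace("\r\n", "\n")
        events = []
        # Only text up to a blank line ("\n\n") is a complete event;
        # anything after it stays buffered until more data arrives.
        while "\n\n" in self._buffer:
            raw, self._buffer = self._buffer.split("\n\n", 1)
            name = "message"  # default event type in SSE
            data_lines = []
            for line in raw.split("\n"):
                if line.startswith(":") or ":" not in line:
                    continue  # comment line or field without a value
                field, _, value = line.partition(":")
                if value.startswith(" "):
                    value = value[1:]
                if field == "data":
                    data_lines.append(value)
                elif field == "event":
                    name = value
            if name == "done":
                self.done = True
            events.append({"event": name, "data": "\n".join(data_lines)})
        return events
```

Call feed() with each chunk as it is read from the response. If the connection closes while done is still False, treat the stream as incomplete and retry with a longer timeout.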

Rate limiting

Cause: Your organization has consumed its daily token allowance.
Fix:
  • Wait until midnight UTC for the quota to reset
  • The error message includes the exact reset time
  • Contact your account manager to increase your daily limit
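If the reset time from the error message is not available to your code, the wait until midnight UTC can be computed locally. This is a fallback sketch only; the function name is hypothetical, and the reset time reported in the error message should take precedence.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_utc_midnight(now: Optional[datetime] = None) -> float:
    """Seconds from `now` until the next midnight UTC."""
    # Naive datetimes are interpreted as local time by astimezone().
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (current + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - current).total_seconds()
```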
Prevention:
  • Use effort: "low" for simple questions
  • Reuse session_id across turns to activate prompt caching
  • Trim conversation history for long sessions
  • Only enable extended_thinking when needed
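Trimming conversation history could look like the Python sketch below. It assumes that keeping only the most recent messages is acceptable for your use case; the function name and the default of 20 messages are illustrative choices, not values from this page.

```python
def trim_history(messages: list, max_messages: int = 20) -> list:
    """Keep only the most recent max_messages entries of the conversation."""
    if max_messages <= 0:
        raise ValueError("max_messages must be positive")
    # Slicing from the end keeps the latest turns in their original order.
    return messages[-max_messages:]
```

Apply it to the messages list before each request so that long sessions do not resend the full history.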

Performance

Cause: The API container was idle and is booting up.
How to detect: Call GET /health. If cold_start is true, the container just started.
Fix:
  • The first request after a cold start may take a few extra seconds; this is normal
  • Subsequent requests will be fast
  • If you need consistent low latency, schedule periodic health checks to keep the container warm
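A keep-warm check along these lines can be sketched in Python. Several details are assumptions and not taken from this page: BASE_URL is a placeholder, the /health response is assumed to be a JSON object with a boolean cold_start field, the endpoint is assumed to need no authentication, and the five-minute interval is arbitrary because the idle timeout is not documented here.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, replace with your API host


def is_cold_start(health_body: str) -> bool:
    """True if the /health JSON body reports cold_start as true."""
    return json.loads(health_body).get("cold_start") is True


def ping_health(base_url: str = BASE_URL, timeout: float = 10.0) -> bool:
    """Call GET /health once and report whether the container just started."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return is_cold_start(resp.read().decode("utf-8"))


def keep_warm(pings: int, interval_seconds: float = 300.0) -> None:
    """Ping /health a fixed number of times, pausing between pings."""
    for _ in range(pings):
        if ping_health():
            print("Container was cold; the next requests should be fast.")
        time.sleep(interval_seconds)
```

In production this would more typically run from a scheduler such as cron than from a long-lived loop.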
Cause: extended_thinking: true adds a reasoning phase before the answer, which takes longer.
Fix:
  • Only use extended_thinking for complex or cross-border questions
  • For simple factual queries, leave it disabled (default false)
  • Combine with effort: "low" or "medium" to limit analysis depth

Still stuck?

Contact your iNwealth account manager or reach out to our technical team for assistance.