Rate Limits & Quotas
Each organization has a daily token quota that resets at midnight UTC.
How it works
- Every request consumes input tokens (your message + conversation history) and output tokens (the agent’s response)
- Both are counted against your daily limit
- When the limit is reached, subsequent requests return a 429 error with the time until reset
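The accounting above can be mirrored on the client side. The following is a minimal sketch, not part of the iNwealth API: the names next_reset and DailyTokenBudget and the limit value are hypothetical, and the tally is only a local estimate of what the server counts.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime) -> datetime:
    """Return the next midnight UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


class DailyTokenBudget:
    """Local estimate of tokens consumed during the current UTC day."""

    def __init__(self, daily_limit: int) -> None:
        # Hypothetical value: the real limit is set per organization.
        self.daily_limit = daily_limit
        self.used = 0
        self.day = None  # UTC date the current tally belongs to

    def record(self, input_tokens: int, output_tokens: int, now: datetime) -> None:
        """Add one request's input and output tokens to today's tally."""
        today = now.astimezone(timezone.utc).date()
        if today != self.day:
            # A new UTC day has started, so the quota has reset.
            self.day = today
            self.used = 0
        self.used += input_tokens + output_tokens

    def remaining(self) -> int:
        """Tokens left before the limit, never below zero."""
        return max(self.daily_limit - self.used, 0)
```

After a 429, (next_reset(now) - now).total_seconds() gives a local estimate of how long to wait; the time until reset reported in the error itself should take precedence when available.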
Error response
When your quota is exceeded, the API returns an SSE error event.
Monitoring usage
Contact your iNwealth account manager to:
- Check your current consumption
- Adjust your daily token limit
- Get usage reports
Tips to optimize token usage
Use effort wisely
Use "low" or "medium" effort for simple questions. Reserve "high" and "max" for complex cross-border scenarios.
Keep history lean
Only send relevant conversation history in messages. Trim older turns when the conversation gets long.
Reuse session_id
Sending the same session_id across turns enables prompt caching, which can save up to 90% of tokens on multi-turn conversations.
Skip extended thinking
Only enable extended_thinking for questions that truly need deep reasoning.
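The tips above can be combined when assembling a request. This is a rough sketch under stated assumptions: the identifiers messages, session_id and extended_thinking come from this page, but the "effort" key name, the role/content shape of each message, and the helpers trim_history and build_request are hypothetical and may not match the real request schema.

```python
from typing import Any


def trim_history(messages: list[dict[str, str]], max_turns: int) -> list[dict[str, str]]:
    """Keep only the most recent `max_turns` messages."""
    if max_turns <= 0:
        return []
    return messages[-max_turns:]


def build_request(
    messages: list[dict[str, str]],
    session_id: str,
    complex_question: bool = False,
    max_turns: int = 6,
) -> dict[str, Any]:
    """Assemble a request body that follows the token-saving tips."""
    return {
        # Reuse the same session_id on every turn so prompt caching can apply.
        "session_id": session_id,
        # Send only recent history to keep input tokens low.
        "messages": trim_history(messages, max_turns),
        # Assumed key name; the values "low" and "high" come from this page.
        "effort": "high" if complex_question else "low",
        # Enable extended thinking only when deep reasoning is needed.
        "extended_thinking": complex_question,
    }
```

A fixed turn count is the simplest trimming rule; trimming by an estimated token count would track the quota more closely, at the cost of needing a tokenizer.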
