
Streaming SSE

The iNwealth API returns responses as Server-Sent Events (SSE). This allows you to display the answer progressively as it’s generated, just like ChatGPT or Claude.

Event types

Event            Description
delta            Text or thinking fragment — append to your display
metadata         Metadata (sources, markers)
done             Stream complete with final message
error            Error during streaming
clarification    Interactive clarifying questions — pause the stream, collect user answers (V170)
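The event types above can be folded into client state with a small dispatcher. This is a minimal sketch; the `handle_event` helper and the `state` keys are illustrative, not part of the API:

```python
import json

def handle_event(data: dict, state: dict) -> None:
    """Route one parsed SSE payload by its "type" field (sketch)."""
    if data["type"] == "delta":
        state["text"] += data["delta"].get("content", "")
    elif data["type"] == "metadata":
        state["metadata"] = data["data"]
    elif data["type"] == "done":
        state["final"] = data["final_message"]
    elif data["type"] == "error":
        state["error"] = data["data"]["error"]

state = {"text": ""}
handle_event(json.loads('{"type": "delta", "delta": {"content": "Hi"}}'), state)
# state["text"] is now "Hi"
```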

Raw SSE format

event: delta
data: {"type": "delta", "delta": {"content": "The Plan d'Epargne "}}

event: delta
data: {"type": "delta", "delta": {"content": "Retraite (PER) offers "}}

event: delta
data: {"type": "delta", "delta": {"content": "several tax advantages..."}}

event: metadata
data: {"type": "metadata", "data": {"markers": [...]}}

event: done
data: {"type": "done", "final_message": {"role": "assistant", "content": "The Plan d'Epargne Retraite (PER) offers several tax advantages..."}}
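A raw payload in this shape can be split into `(event, data)` pairs with a few lines of parsing. This sketch assumes one `data:` line per event, as in the examples above (the SSE spec also allows multi-line `data:`, which the iNwealth examples do not use):

```python
import json

def parse_sse(raw: str):
    """Yield (event_name, parsed_data) pairs from a raw SSE payload (sketch)."""
    event = None
    for line in raw.splitlines():
        if line.startswith("event: "):
            event = line[7:]
        elif line.startswith("data: "):
            yield event, json.loads(line[6:])

raw = 'event: delta\ndata: {"type": "delta", "delta": {"content": "Hi"}}\n\nevent: done\ndata: {"type": "done"}'
events = list(parse_sse(raw))
# events[0] is ("delta", {...}), events[1] is ("done", {...})
```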

Extended thinking deltas

When extended_thinking is enabled, delta events may include thinking data instead of content:
event: delta
data: {"type": "delta", "delta": {"thinking": "Analyzing PER tax deductions...", "phase": "RECHERCHE", "step": "Tax deduction limits"}}

Field                 Description
delta.content         Text fragment of the response
delta.thinking        Thinking fragment (when extended_thinking is on)
delta.phase           Current thinking phase: "RECHERCHE" or "RAISONNEMENT"
delta.step            Current thinking step (human-readable)
delta.content_reset   If true, clear accumulated content (intermediate text before the final answer)
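One way to fold these delta fields into display state, including the `content_reset` flag, is sketched below. The `apply_delta` helper and the `state` keys are illustrative; the field names come from the table above:

```python
def apply_delta(delta: dict, state: dict) -> None:
    """Fold one delta payload into accumulated display state (sketch)."""
    if delta.get("content_reset"):
        state["content"] = ""  # drop intermediate text before the final answer
    if "thinking" in delta:
        state["thinking"] += delta["thinking"]
        state["phase"] = delta.get("phase", state.get("phase"))
    if "content" in delta:
        state["content"] += delta["content"]

state = {"content": "", "thinking": ""}
apply_delta({"thinking": "Analyzing PER tax deductions...", "phase": "RECHERCHE"}, state)
apply_delta({"content": "intermediate draft"}, state)
apply_delta({"content_reset": True}, state)
apply_delta({"content": "Final answer"}, state)
# state["content"] is now "Final answer"; the intermediate draft was cleared
```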

JavaScript (fetch + ReadableStream)

const response = await fetch("https://api.inwealth.fr/api/agent", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk_live_your_key_here"
  },
  body: JSON.stringify({
    messages: [{ role: "user", content: "Tax advantages of PER?" }],
    session_id: "session-001"
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let fullText = "";
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop();  // keep any partial line for the next chunk

  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const data = JSON.parse(line.slice(6));

    if (data.type === "delta") {
      if (data.delta?.content_reset) fullText = "";  // clear intermediate text before the final answer
      if (data.delta?.content) {
        fullText += data.delta.content;
        updateDisplay(fullText);  // update your UI progressively
      }
    } else if (data.type === "error") {
      console.error(data.data.error);
      break;
    }
  }
}

Python (httpx)

import httpx
import json

with httpx.stream(
    "POST",
    "https://api.inwealth.fr/api/agent",
    headers={"Authorization": "Bearer sk_live_your_key_here"},
    json={
        "messages": [{"role": "user", "content": "Tax advantages of PER?"}],
        "session_id": "session-001",
    },
) as response:
    full_text = ""
    for line in response.iter_lines():
        if not line.startswith("data: "):
            continue
        data = json.loads(line[6:])
        if data.get("type") == "delta":
            delta = data.get("delta", {})
            if delta.get("content_reset"):
                full_text = ""  # clear intermediate text before the final answer
            content = delta.get("content", "")
            if content:
                full_text += content
                print(content, end="", flush=True)
        elif data.get("type") == "error":
            print(f"\nError: {data['data']['error']}")
            break

Multi-turn conversation

The API is stateless. Send the full conversation history in messages:
{
  "messages": [
    {"role": "user", "content": "What is the PER?"},
    {"role": "assistant", "content": "The PER is a retirement savings plan..."},
    {"role": "user", "content": "What about for non-residents?"}
  ],
  "session_id": "session-001",
  "residence": "ch"
}
Reuse the same session_id across turns to benefit from Anthropic’s prompt caching, reducing latency and token costs.
For long conversations (12+ turns), the API automatically optimizes the conversation history to stay within model limits. The most recent messages are always preserved in full. This is transparent — no action needed on your side.
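Since the API is stateless, the client appends each exchange to its own copy of the history and resends it. A minimal sketch of building the next request body (the `next_request` helper is illustrative, not part of the API):

```python
def next_request(history, assistant_reply, user_message, session_id="session-001"):
    """Append the last exchange and build the next stateless request body (sketch)."""
    messages = history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": user_message},
    ]
    return {"messages": messages, "session_id": session_id}

body = next_request(
    [{"role": "user", "content": "What is the PER?"}],
    "The PER is a retirement savings plan...",
    "What about for non-residents?",
)
# body["messages"] now holds all three turns, ready to POST
```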