Streaming & Real-Time Output

Your agent UI shows nothing for 45 seconds, then dumps the entire response at once. Users think it’s frozen. The fix is to run the CLI with --output-format stream-json (paired with --verbose, as explained below) and add a few lines of NDJSON parsing.

The stream-json protocol turns the Claude CLI into a real-time event emitter. Instead of waiting for the entire response, you get NDJSON events as they happen: init metadata, assistant text (token by token when partial messages are enabled), tool calls, tool results, rate limits, and the final summary. NDJSON (Newline Delimited JSON) means one complete JSON object per line, with objects separated by \n. Combined with --input-format stream-json, it enables bidirectional communication for building real-time UIs, agent chains, and monitoring dashboards.

You need this chapter if you’re building custom UIs, implementing real-time progress bars, creating dashboards, or debugging stream behavior. For basic automation, JSON mode is simpler.

Enabling Streaming

Two flags are required to activate the stream protocol: --output-format stream-json sets the output mode, and --verbose ensures the init event with full session metadata is emitted. Without --verbose, the init event may be missing — always pair the two together.

Streaming a simple prompt

$ claude -p "Explain recursion" --output-format stream-json --verbose
{"type":"system","subtype":"init","model":"claude-opus-4-6","session_id":"380bd0cd-...","tools":["Bash","Read","Edit","..."]}
{"type":"assistant","message":{"content":[{"type":"text","text":"Recursion is a technique where..."}]}}
{"type":"rate_limit_event","rate_limit_info":{"status":"allowed","resetsAt":1773810000}}
{"type":"result","subtype":"success","duration_ms":3216,"total_cost_usd":0.01590}

Each line in the output is a complete, independent JSON object. A simple prompt without tool use produces exactly four events: system (init), assistant (response), rate_limit_event (API status), and result (final envelope).
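As a minimal sketch of what "one JSON object per line" means in practice, the snippet below parses a captured stream into an array of events. The helper name parseNdjson is hypothetical, and the sample lines are abbreviated copies of the four events shown above.

```javascript
// Parse a complete NDJSON capture into an array of event objects.
// parseNdjson is a hypothetical helper name, not part of the CLI.
function parseNdjson(text) {
    return text
        .split('\n')
        .filter((line) => line.trim() !== '') // ignore blank lines
        .map((line) => JSON.parse(line));     // each line is its own JSON document
}

// Abbreviated copies of the four events from the sample run above.
const sample = [
    '{"type":"system","subtype":"init","model":"claude-opus-4-6"}',
    '{"type":"assistant","message":{"content":[{"type":"text","text":"Recursion is a technique where..."}]}}',
    '{"type":"rate_limit_event","rate_limit_info":{"status":"allowed","resetsAt":1773810000}}',
    '{"type":"result","subtype":"success","duration_ms":3216,"total_cost_usd":0.0159}',
].join('\n');

const events = parseNdjson(sample);
console.log(events.map((e) => e.type).join(' -> '));
// prints: system -> assistant -> rate_limit_event -> result
```

This works for a finished capture held in memory. For a live process, read line by line instead, as the parsing section later in this chapter does.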

Event Types

Every event has a type field that tells you what it represents. A simple session produces four core events: system (init) → assistant → rate_limit_event → result. The others (user events carrying tool results, system/api_retry, and stream_event) appear conditionally based on tool usage, retries, and partial message streaming.

Here is the complete catalog:

Stream Event Types

Event Type	Subtype / Trigger	When It Fires	Key Fields
`system`	`init`	First event in every stream	`tools`, `mcp_servers`, `model`, `session_id`, `permissionMode`
`system`	`api_retry`	On retryable API error (rate limit, server error)	`attempt`, `max_retries`, `retry_delay_ms`, `error`
`assistant`	—	After the model generates a response	`message.content[]`, `message.usage`
`user`	—	After a tool executes and returns results	Tool result content
`rate_limit_event`	—	After each API call (always fires, even when allowed)	`rate_limit_info.status`, `resetsAt`
`result`	`success` or `error_*`	Last event in the stream	Same fields as the `--output-format json` envelope
`stream_event`	`message_start`, `content_block_delta`, `message_stop`, …	Only with `--include-partial-messages`	`event.delta.text` (incremental token chunks)

When Claude uses tools, events interleave: an assistant event contains the tool call (message.content[].type === "tool_use"), followed by a user event with the tool result. This cycle repeats for each tool invocation before the final assistant response.
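The interleaving can be followed in code by matching each tool call to its result. The sketch below assumes the content blocks follow the Anthropic Messages API shape (tool_use blocks with id, name, and input; tool_result blocks with tool_use_id and content), which this chapter does not spell out, so check it against a captured stream. The helper name pairToolCalls and the sample events are hypothetical.

```javascript
// Pair each tool call with its result.
// ASSUMPTION: tool_use blocks carry `id`, `name`, `input`, and tool_result
// blocks carry `tool_use_id`, `content` (Anthropic Messages API shape).
function pairToolCalls(events) {
    const calls = new Map(); // tool_use id -> { name, input, result }
    for (const event of events) {
        const blocks = event.message?.content;
        if (!Array.isArray(blocks)) continue; // skip events without content blocks
        for (const block of blocks) {
            if (event.type === 'assistant' && block.type === 'tool_use') {
                calls.set(block.id, { name: block.name, input: block.input, result: undefined });
            } else if (event.type === 'user' && block.type === 'tool_result') {
                const call = calls.get(block.tool_use_id);
                if (call) call.result = block.content;
            }
        }
    }
    return [...calls.values()];
}

// Hypothetical events illustrating one tool cycle followed by the final answer.
const toolEvents = [
    { type: 'assistant', message: { content: [{ type: 'tool_use', id: 'toolu_01', name: 'Read', input: { file_path: 'auth.py' } }] } },
    { type: 'user', message: { content: [{ type: 'tool_result', tool_use_id: 'toolu_01', content: 'def login(): ...' }] } },
    { type: 'assistant', message: { content: [{ type: 'text', text: 'The file defines login().' }] } },
];

const paired = pairToolCalls(toolEvents);
console.log(paired.map((call) => `${call.name}: ${call.result !== undefined ? 'done' : 'pending'}`).join(', '));
// prints: Read: done
```

A call whose result is still undefined is in flight, which is a convenient signal for a "running tool..." indicator in a UI.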

▸ Try This

See the stream events yourself:

claude -p "Say hello" --output-format stream-json --verbose 2>/dev/null | head -5

Notice the first event is init with the session ID, followed by an assistant event carrying the response text. Try piping through jq -r 'select(.type == "assistant") | .message.content[] | select(.type == "text") | .text' to extract just the text.

Init Event

The first event in every stream is a system/init payload containing session metadata. This is the richest event in the protocol and the key to initializing any UI or monitoring system.

System Init Event (artifacts/16/stream_basic.jsonl)

{
    "type": "system",
    "subtype": "init",
    "cwd": "/Users/tunaozmen/claude-cli-mastery",
    "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b",
    "tools": ["Task", "Bash", "Glob", "Grep", "Read", "Edit", "Write", "..."],
    "mcp_servers": [
        { "name": "chrome-devtools", "status": "connected" },
        { "name": "claude.ai Notion", "status": "connected" },
        { "name": "claude.ai Gmail", "status": "needs-auth" }
    ],
    "model": "claude-opus-4-6",
    "permissionMode": "default",
    "claude_code_version": "2.1.74",
    "agents": ["general-purpose", "Explore", "Plan", "claude-code-guide"],
    "skills": ["debug", "simplify", "batch", "loop", "claude-api"],
    "plugins": [
        { "name": "rust-analyzer-lsp" }
    ],
    "fast_mode_state": "off"
}

session_id: UUID for this conversation; pass it to --resume to continue later
tools: all available tools; verify that --allowedTools restrictions took effect
mcp_servers: MCP server health; check status for "connected" vs "needs-auth"
model: which model is running; useful for version pinning
claude_code_version: CLI version; check compatibility in CI pipelines

Use the init event to power health checks (verify MCP server connections), tool audits (confirm allowlists), version pinning (check claude_code_version), and UI initialization (populate tool lists and model names).
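One way to turn those checks into code is a small audit function that takes the init event and returns a list of problems. This is a sketch only: auditInit, its option names, and the allowlist below are hypothetical, while the field names (mcp_servers, tools, claude_code_version) come from the init payload shown above.

```javascript
// Audit an init event and return human-readable problems (empty array = healthy).
// auditInit and its options are hypothetical names for illustration.
function auditInit(init, { allowedTools = null, expectedVersion = null } = {}) {
    const problems = [];

    // Health check: every MCP server should report "connected".
    for (const server of init.mcp_servers ?? []) {
        if (server.status !== 'connected') {
            problems.push(`MCP server "${server.name}" is ${server.status}`);
        }
    }

    // Tool audit: flag any tool that is not on the expected allowlist.
    if (allowedTools) {
        const extra = (init.tools ?? []).filter((tool) => !allowedTools.includes(tool));
        if (extra.length > 0) problems.push(`unexpected tools: ${extra.join(', ')}`);
    }

    // Version pinning: compare against the CLI version the pipeline expects.
    if (expectedVersion && init.claude_code_version !== expectedVersion) {
        problems.push(`CLI version ${init.claude_code_version} does not match ${expectedVersion}`);
    }
    return problems;
}

// Trimmed copy of the init event above.
const init = {
    type: 'system',
    subtype: 'init',
    tools: ['Task', 'Bash', 'Glob', 'Grep', 'Read', 'Edit', 'Write'],
    mcp_servers: [
        { name: 'chrome-devtools', status: 'connected' },
        { name: 'claude.ai Gmail', status: 'needs-auth' },
    ],
    claude_code_version: '2.1.74',
};

const problems = auditInit(init, { allowedTools: ['Read', 'Grep', 'Glob'], expectedVersion: '2.1.74' });
problems.forEach((problem) => console.log(problem));
// prints:
// MCP server "claude.ai Gmail" is needs-auth
// unexpected tools: Task, Bash, Edit, Write
```

In CI, a non-empty result is a reasonable cue to stop the run before any prompt is processed.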

Partial Messages

By default, you get the complete response in a single assistant event. Adding --include-partial-messages inserts stream_event entries for each token chunk, enabling typewriter-style UIs that render text as it is generated.

Typewriter effect with partial messages

$ claude -p "Say hello" --output-format stream-json --verbose --include-partial-messages | \
    jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'

# token-by-token output:
Hello! How can I help you today?

The sequence of stream_event entries follows a strict order:

message_start — model ID, empty content
content_block_start — new text block begins (index 0)
content_block_delta — incremental tokens: "\n\nHello", then "! How", then " can I help you today?"
content_block_stop — block finished
message_delta — stop_reason: "end_turn", final usage stats
message_stop — stream complete

The full assistant event (with the complete assembled message) appears after all the deltas. For a typewriter UI, consume content_block_delta events in real time — do not wait for the assistant event.
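The ordering above maps naturally onto a small accumulator that appends each text delta to the content block it belongs to. The following is a sketch under the event shapes described in this section; createTextAssembler is a hypothetical name, and the replayed events are hand-written from the token sequence listed above.

```javascript
// Accumulate text deltas per content block and expose the assembled text.
// createTextAssembler is a hypothetical helper, not part of the CLI.
function createTextAssembler() {
    const blocks = new Map(); // content block index -> text received so far
    let done = false;
    return {
        push(event) {
            if (event.type !== 'stream_event') return; // ignore other event types
            const inner = event.event;
            if (inner.type === 'content_block_start') {
                blocks.set(inner.index, '');
            } else if (inner.type === 'content_block_delta' && inner.delta?.type === 'text_delta') {
                blocks.set(inner.index, (blocks.get(inner.index) ?? '') + inner.delta.text);
            } else if (inner.type === 'message_stop') {
                done = true;
            }
        },
        // Join blocks in index order.
        text() {
            return [...blocks.entries()]
                .sort(([a], [b]) => a - b)
                .map(([, text]) => text)
                .join('');
        },
        isDone() {
            return done;
        },
    };
}

// Hand-written replay of the sequence listed above.
const wrap = (event) => ({ type: 'stream_event', event });
const replay = [
    wrap({ type: 'message_start' }),
    wrap({ type: 'content_block_start', index: 0 }),
    wrap({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: '\n\nHello' } }),
    wrap({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: '! How' } }),
    wrap({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: ' can I help you today?' } }),
    wrap({ type: 'content_block_stop', index: 0 }),
    wrap({ type: 'message_delta', delta: { stop_reason: 'end_turn' } }),
    wrap({ type: 'message_stop' }),
];

const assembler = createTextAssembler();
for (const event of replay) assembler.push(event);
console.log(JSON.stringify(assembler.text()));
// prints: "\n\nHello! How can I help you today?"
```

Keying by index keeps the text of separate content blocks apart, which matters once a response contains more than one block.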

A Single Token Delta Event

{
    "type": "stream_event",
    "event": {
        "type": "content_block_delta",
        "index": 0,
        "delta": {
            "type": "text_delta",
            "text": "! How"
        }
    },
    "session_id": "c5d6f705-4fab-46a5-97af-49ad226a5593"
}

event.type: content_block_delta carries each token chunk
event.delta.text: concatenate each delta.text to build the full response

Note

--include-partial-messages is high volume. A 500-word response can generate hundreds of stream_event lines. Only enable it for typewriter UIs, not data pipelines where you only need the final result.

Bidirectional Communication

The stream protocol is not output-only. Adding --input-format stream-json lets you send messages to Claude via stdin while receiving events on stdout — enabling real-time UIs, agent orchestration systems, and custom IDE integrations.

# Full bidirectional mode
claude -p --input-format stream-json --output-format stream-json --verbose

There are strict pairing rules for these flags:

Input/Output Flag Requirements

Flag	Requires	Purpose
`--input-format stream-json`	`--output-format stream-json`	Accept stream-json on stdin for follow-up messages
`--replay-user-messages`	Both `--input-format` and `--output-format` set to `stream-json`	Echo user messages back on stdout for acknowledgment
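This chapter does not show what a message on stdin looks like. The sketch below serializes one user turn as a single NDJSON line. The envelope shape (type "user" wrapping a message with role and content blocks) is an assumption that mirrors the user events on the output side, so confirm it against the CLI documentation for your version. userMessageLine is a hypothetical helper name.

```javascript
// Serialize one user turn as a single NDJSON line for the CLI's stdin.
// ASSUMPTION: the input envelope mirrors the output-side user event shape;
// this chapter does not confirm it. userMessageLine is a hypothetical helper.
function userMessageLine(text) {
    const envelope = {
        type: 'user',
        message: {
            role: 'user',
            content: [{ type: 'text', text }],
        },
    };
    // JSON.stringify escapes embedded newlines, so the result stays on one line.
    return JSON.stringify(envelope) + '\n';
}

const line = userMessageLine('Summarize auth.py\nFocus on the login flow.');
console.log(line.trimEnd().split('\n').length);
// prints: 1
```

With a spawned process whose stdin is a pipe, each follow-up message would be written with proc.stdin.write(userMessageLine(...)), where proc is whatever handle your code holds for the running CLI.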

You can also chain Claude instances by piping stream-json output from one into another:

# First Claude analyzes, second Claude summarizes
claude -p "Analyze auth.py" --output-format stream-json --verbose | \
  claude -p "Summarize the analysis" --input-format stream-json --output-format stream-json --verbose

The second instance receives all events from the first — including tool results and the final response — as context for its own task.

Real Stream Payload

Here is the complete four-event stream from a real CLI call. This is exactly what --output-format stream-json --verbose produces for a simple prompt with no tool use.

Complete 4-Event Stream (artifacts/16/stream_basic.jsonl)

{
    "event_1_init": {
        "type": "system",
        "subtype": "init",
        "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b",
        "model": "claude-opus-4-6",
        "tools": ["Task", "Bash", "Glob", "Grep", "Read", "Edit", "Write"],
        "mcp_servers": [
            { "name": "chrome-devtools", "status": "connected" },
            { "name": "claude.ai Notion", "status": "connected" }
        ],
        "permissionMode": "default",
        "claude_code_version": "2.1.74"
    },
    "event_2_assistant": {
        "type": "assistant",
        "message": {
            "model": "claude-opus-4-6",
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "test stream"
                }
            ],
            "usage": {
                "input_tokens": 3,
                "cache_creation_input_tokens": 1381,
                "cache_read_input_tokens": 14253,
                "output_tokens": 1
            }
        }
    },
    "event_3_rate_limit": {
        "type": "rate_limit_event",
        "rate_limit_info": {
            "status": "allowed",
            "resetsAt": 1773810000,
            "rateLimitType": "five_hour",
            "overageStatus": "rejected"
        }
    },
    "event_4_result": {
        "type": "result",
        "subtype": "success",
        "duration_ms": 3216,
        "num_turns": 1,
        "result": "test stream",
        "total_cost_usd": 0.0159,
        "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b",
        "usage": {
            "input_tokens": 3,
            "cache_creation_input_tokens": 0,
            "cache_read_input_tokens": 14253,
            "output_tokens": 8
        }
    }
}

1. Init event (event_1_init): always first, contains session metadata
2. Assistant event (event_2_assistant): the complete model response
3. Rate limit event (event_3_rate_limit): always fires, check the status field
4. Result event (event_4_result): always last, same as --output-format json

In a real stream, each of these four objects would be on its own line with no commas or brackets between them. The nested structure shown above groups them for readability — the actual NDJSON output is one JSON object per line.

Parsing Stream Events

Here is a Node.js generator that spawns a Claude process and yields parsed events as they arrive:

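import { spawn } from 'node:child_process';
import { createInterface } from 'node:readline';

// Spawn the CLI and yield one parsed event per NDJSON line.
async function* streamClaude(prompt, { partial = false } = {}) {
    const args = ['-p', prompt, '--output-format', 'stream-json', '--verbose'];
    if (partial) args.push('--include-partial-messages');

    // Nothing is sent on stdin here, so close it; pass stderr through so CLI errors stay visible.
    const proc = spawn('claude', args, { stdio: ['ignore', 'pipe', 'inherit'] });
    const rl = createInterface({ input: proc.stdout, crlfDelay: Infinity });

    for await (const line of rl) {
        if (!line.trim()) continue; // skip empty lines
        try {
            yield JSON.parse(line);
        } catch (err) {
            console.warn(`Skipping malformed JSON: ${line}`, err);
        }
    }
}

// Dispatch on event type
for await (const event of streamClaude('Explain recursion')) {
    switch (event.type) {
        case 'system':
            if (event.subtype === 'init') {
                console.log(`Model: ${event.model}, Tools: ${event.tools.length}`);
            }
            break;
        case 'assistant': {
            // content[] can mix text and tool_use blocks; print only the text
            const text = event.message.content
                .filter((block) => block.type === 'text')
                .map((block) => block.text)
                .join('');
            if (text) console.log(`Response: ${text}`);
            break;
        }
        case 'result':
            console.log(`Cost: $${event.total_cost_usd?.toFixed(4) ?? 'n/a'}`);
            break;
        default:
            break; // ignore event types this example does not handle
    }
}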

Production parsers should handle: incomplete lines (buffer until \n), malformed JSON (log and skip), unexpected event types (warn but continue). The example above shows basic error handling — add buffering logic if processing stdin in chunks.
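When data arrives in arbitrary chunks rather than whole lines, the buffering step might look like the sketch below: keep the unfinished tail, emit every complete line, and report malformed lines without stopping. createLineBuffer and its callback names are hypothetical.

```javascript
// Buffer arbitrary string chunks and emit one parsed event per complete line.
// createLineBuffer is a hypothetical helper. Chunks must be strings; on a Node
// stream, call setEncoding('utf8') first so multi-byte characters are not split.
function createLineBuffer(onEvent, onError = () => {}) {
    let pending = ''; // text received after the last newline

    function handleLine(line) {
        let event;
        try {
            event = JSON.parse(line);
        } catch (err) {
            onError(line, err); // malformed JSON: report and keep going
            return;
        }
        onEvent(event);
    }

    return {
        write(chunk) {
            pending += chunk;
            let newline;
            while ((newline = pending.indexOf('\n')) !== -1) {
                const line = pending.slice(0, newline).trim();
                pending = pending.slice(newline + 1);
                if (line !== '') handleLine(line);
            }
        },
        // Flush a final line that arrived without a trailing newline.
        end() {
            const line = pending.trim();
            pending = '';
            if (line !== '') handleLine(line);
        },
    };
}

const seen = [];
const bad = [];
const buffer = createLineBuffer((event) => seen.push(event.type), (line) => bad.push(line));
buffer.write('{"type":"sys');
buffer.write('tem"}\n{"type":"assistant"}\nnot json\n{"type":"res');
buffer.write('ult"}');
buffer.end();
console.log(seen.join(','), bad.length);
// prints: system,assistant,result 1
```

Parsing is kept separate from the onEvent call so that an exception thrown by your own handler is not mistaken for malformed JSON.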

For a typewriter effect, enable partial messages and filter for text deltas:

process.stdout.write('Claude says: ');
for await (const event of streamClaude('Write a haiku', { partial: true })) {
    if (event.type === 'stream_event') {
        const delta = event.event?.delta;
        if (delta?.type === 'text_delta') {
            process.stdout.write(delta.text);
        }
    }
}
console.log();

Retry Events

When the API returns a retryable error (rate limit or server error), a system/api_retry event appears before the retry attempt. Use this to show retry progress in your UI or implement custom backoff logic.

{
    "type": "system",
    "subtype": "api_retry",
    "attempt": 1,
    "max_retries": 5,
    "retry_delay_ms": 2000,
    "error_status": 529,
    "error": "server_error"
}
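To surface retries in a UI, a formatter over this payload is usually enough. The sketch below uses only the fields shown in the example event; describeRetry is a hypothetical helper name.

```javascript
// Turn a system/api_retry event into a one-line status message.
// describeRetry is a hypothetical helper; it returns null for other events.
function describeRetry(event) {
    if (event.type !== 'system' || event.subtype !== 'api_retry') return null;
    const seconds = (event.retry_delay_ms / 1000).toFixed(1);
    return `Retry ${event.attempt}/${event.max_retries} in ${seconds}s (${event.error}, status ${event.error_status})`;
}

// The retry event shown above.
const retryEvent = {
    type: 'system',
    subtype: 'api_retry',
    attempt: 1,
    max_retries: 5,
    retry_delay_ms: 2000,
    error_status: 529,
    error: 'server_error',
};

console.log(describeRetry(retryEvent));
// prints: Retry 1/5 in 2.0s (server_error, status 529)
```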

Gotcha

--verbose is required with --output-format stream-json. Without it, the init event with session metadata may not be emitted. Always use both flags together: --output-format stream-json --verbose.

Gotcha

Each line is independent JSON — this is NDJSON, not a JSON array. There are no commas between objects and no wrapping brackets. Parse line by line with JSON.parse(line). Calling JSON.parse(entireOutput) on the full stream will fail.

Tip

The result event is always last and contains the same data as --output-format json. You get both real-time streaming events and the final summary envelope in one stream, so there is no need to run a second call to get the result metadata.
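A small summarizer over the result event can feed a status line or a log entry. This sketch relies only on fields visible in the sample payload earlier in the chapter (subtype, duration_ms, total_cost_usd, usage.output_tokens); summarizeResult is a hypothetical name, and any field that is missing is reported as "n/a" rather than guessed.

```javascript
// Summarize the final result event for a status line or log entry.
// summarizeResult is a hypothetical helper; it returns null for other events.
function summarizeResult(event) {
    if (event.type !== 'result') return null;
    const ok = event.subtype === 'success';
    const seconds = typeof event.duration_ms === 'number'
        ? `${(event.duration_ms / 1000).toFixed(1)}s`
        : 'n/a';
    const cost = typeof event.total_cost_usd === 'number'
        ? `$${event.total_cost_usd.toFixed(4)}`
        : 'n/a';
    const outputTokens = event.usage?.output_tokens ?? 'n/a';
    const status = ok ? 'OK' : event.subtype; // error subtypes are shown verbatim
    return { ok, line: `${status} in ${seconds}, cost ${cost}, ${outputTokens} output tokens` };
}

// Trimmed copy of the result event from the sample stream.
const resultEvent = {
    type: 'result',
    subtype: 'success',
    duration_ms: 3216,
    total_cost_usd: 0.0159,
    usage: { output_tokens: 8 },
};

console.log(summarizeResult(resultEvent).line);
// prints: OK in 3.2s, cost $0.0159, 8 output tokens
```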

See This in Action

See how production code parses NDJSON streams and bridges them to SSE for a real-time UI in Build an MR Reviewer, Part 2: Streaming to SSE.

→ Now Do This

Run claude -p "Explain sessions in one paragraph" --output-format stream-json --verbose 2>/dev/null | head -5 to see your first NDJSON events. Notice the init event contains the session ID before any text arrives, so you can start tracking the session before Claude even begins responding.