Stream Protocol

NDJSON streaming events, partial messages, and stream processing

The stream-json protocol turns the Claude CLI into a real-time event emitter. Instead of waiting for the entire response, you get NDJSON events as they happen — init metadata, token-by-token text, tool calls, tool results, rate limits, and the final summary. Combined with --input-format stream-json, it enables bidirectional communication for building real-time UIs, agent chains, and monitoring dashboards.

Enabling Streaming

Two flags are required to activate the stream protocol: --output-format stream-json sets the output mode, and --verbose ensures the init event with full session metadata is emitted. Without --verbose, the init event may be missing — always pair the two together.

Streaming a simple prompt
$

Each line in the output is a complete, independent JSON object. A simple prompt without tool use produces exactly four events: system (init), assistant (response), rate_limit_event (API status), and result (final envelope).

Event Types

Every event has a type field that tells you what it represents. Here is the complete catalog:

Stream Event Types

| Event Type | Subtype / Trigger | When It Fires | Key Fields |
| --- | --- | --- | --- |
| system | init | First event in every stream | tools, mcp_servers, model, session_id, permissionMode |
| system | api_retry | On retryable API error (rate limit, server error) | attempt, max_retries, retry_delay_ms, error |
| assistant | (none) | After the model generates a response | message.content[], message.usage |
| user | (none) | After a tool executes and returns results | Tool result content |
| rate_limit_event | (none) | After each API call (always fires, even when allowed) | rate_limit_info.status, resetsAt |
| result | success or error_* | Last event in the stream | Same fields as the --output-format json envelope |
| stream_event | message_start, content_block_delta, message_stop, … | Only with --include-partial-messages | event.delta.text (incremental token chunks) |

When Claude uses tools, events interleave: an assistant event contains the tool call (content[].type === "tool_use"), followed by a user event with the tool result. This cycle repeats for each tool invocation before the final assistant response.
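
The interleaving described above can be folded back into call/result pairs. The following is a minimal sketch, not the CLI's own logic. It assumes that each tool_use block carries an `id`, `name`, and `input`, and that the following user event holds tool_result blocks in message.content[] that point back through `tool_use_id` (the shape used by the Anthropic Messages API). The function name `pairToolCalls` is hypothetical.

```javascript
// Pair each tool call with its result, in stream order.
// Assumption: tool_use blocks have `id`, and tool_result blocks reference
// that id through `tool_use_id`. Verify against your own stream output.
function pairToolCalls(events) {
  const pending = new Map(); // tool_use id -> pair object
  const pairs = [];

  for (const event of events) {
    const content = event.message?.content;
    const blocks = Array.isArray(content) ? content : [];

    if (event.type === 'assistant') {
      for (const block of blocks) {
        if (block.type !== 'tool_use') continue;
        const pair = { name: block.name, input: block.input, result: undefined };
        pending.set(block.id, pair);
        pairs.push(pair);
      }
    } else if (event.type === 'user') {
      for (const block of blocks) {
        if (block.type !== 'tool_result') continue;
        const pair = pending.get(block.tool_use_id);
        if (pair) pair.result = block.content;
      }
    }
  }
  return pairs;
}
```

A pair whose `result` is still undefined at the end of the stream marks a tool call that never returned, which can be useful when debugging interrupted runs.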

Init Event

The first event in every stream is a system/init payload containing session metadata. This is the richest event in the protocol and the key to initializing any UI or monitoring system.

System Init Event (artifacts/16/stream_basic.jsonl)

{
  "type": "system",
  "subtype": "init",
  "cwd": "/Users/tunaozmen/claude-cli-mastery",
  "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b",
  "tools": ["Task", "Bash", "Glob", "Grep", "Read", "Edit", "Write", "..."],
  "mcp_servers": [
    { "name": "chrome-devtools", "status": "connected" },
    { "name": "claude.ai Notion", "status": "connected" },
    { "name": "claude.ai Gmail", "status": "needs-auth" }
  ],
  "model": "claude-opus-4-6",
  "permissionMode": "default",
  "claude_code_version": "2.1.74",
  "agents": ["general-purpose", "Explore", "Plan", "claude-code-guide"],
  "skills": ["debug", "simplify", "batch", "loop", "claude-api"],
  "plugins": [
    { "name": "rust-analyzer-lsp" }
  ],
  "fast_mode_state": "off"
}

session_id: UUID for this conversation; pass it to --resume to continue later
tools: All available tools; verify that --allowedTools restrictions took effect
mcp_servers: MCP server health; check status for "connected" vs "needs-auth"
model: Which model is running; useful for version pinning
claude_code_version: CLI version; check compatibility in CI pipelines

Use the init event to power health checks (verify MCP server connections), tool audits (confirm allowlists), version pinning (check claude_code_version), and UI initialization (populate tool lists and model names).
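
Those checks can be combined into one small audit function. This is a sketch under stated assumptions: the function name `auditInit` and the `expected` option object (`allowedTools`, `version`) are hypothetical names chosen for illustration, and only fields that appear in the init event shown above are read.

```javascript
// Compare an init event against what a deployment expects.
// `expected.allowedTools` and `expected.version` are hypothetical options.
function auditInit(init, expected = {}) {
  if (init.type !== 'system' || init.subtype !== 'init') {
    return ['not a system/init event'];
  }
  const problems = [];

  // Health check: every MCP server should report "connected".
  for (const server of init.mcp_servers ?? []) {
    if (server.status !== 'connected') {
      problems.push(`MCP server "${server.name}" is ${server.status}`);
    }
  }

  // Tool audit: nothing outside the allowlist should be exposed.
  if (expected.allowedTools) {
    const allowed = new Set(expected.allowedTools);
    for (const tool of init.tools ?? []) {
      if (!allowed.has(tool)) problems.push(`unexpected tool: ${tool}`);
    }
  }

  // Version pinning: exact match on the CLI version string.
  if (expected.version && init.claude_code_version !== expected.version) {
    problems.push(`CLI version ${init.claude_code_version} does not match ${expected.version}`);
  }
  return problems;
}
```

An empty array means the session started as expected. In a CI job, a non-empty array is a reasonable signal to abort before any tool runs.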

Partial Messages

By default, you get the complete response in a single assistant event. Adding --include-partial-messages inserts stream_event entries for each token chunk, enabling typewriter-style UIs that render text as it is generated.

Typewriter effect with partial messages
$

The sequence of stream_event entries follows a strict order:

  1. message_start: model ID, empty content
  2. content_block_start: new text block begins (index 0)
  3. content_block_delta: incremental tokens such as "\n\nHello", then "! How", then " can I help you today?"
  4. content_block_stop: block finished
  5. message_delta: stop_reason "end_turn", final usage stats
  6. message_stop: stream complete

The full assistant event (with the complete assembled message) appears after all the deltas. For a typewriter UI, consume content_block_delta events in real time — do not wait for the assistant event.

A Single Token Delta Event

{
  "type": "stream_event",
  "event": {
    "type": "content_block_delta",
    "index": 0,
    "delta": {
      "type": "text_delta",
      "text": "! How"
    }
  },
  "session_id": "c5d6f705-4fab-46a5-97af-49ad226a5593"
}

event.type: content_block_delta carries each token chunk
event.delta.text: concatenate each delta.text to build the full response
Note

--include-partial-messages is high volume. A 500-word response can generate hundreds of stream_event lines. Enable it only for typewriter UIs, not for data pipelines that need only the final result.
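
The delta sequence described above can be assembled into full text with a small stateful helper. This is a minimal sketch: `createTextAssembler` is a hypothetical name, and it handles only text_delta chunks, keyed by the block `index` shown in the example event.

```javascript
// Fold stream_event entries into the text of each content block.
function createTextAssembler() {
  const blocks = new Map(); // content block index -> text so far
  let done = false;

  return {
    // Feed one parsed event; returns the newly added text ('' if none).
    push(event) {
      if (event.type !== 'stream_event') return '';
      const inner = event.event ?? {};

      if (inner.type === 'content_block_start') {
        blocks.set(inner.index, '');
      } else if (inner.type === 'content_block_delta' && inner.delta?.type === 'text_delta') {
        blocks.set(inner.index, (blocks.get(inner.index) ?? '') + inner.delta.text);
        return inner.delta.text;
      } else if (inner.type === 'message_stop') {
        done = true;
      }
      return '';
    },
    // Text of all blocks, joined in index order.
    get text() {
      return [...blocks.keys()].sort((a, b) => a - b).map((i) => blocks.get(i)).join('');
    },
    // True once message_stop has been seen.
    get done() {
      return done;
    },
  };
}
```

A UI can write the return value of `push` straight to the screen and read `text` once `done` is true, without waiting for the final assistant event.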

Bidirectional Communication

The stream protocol is not output-only. Adding --input-format stream-json lets you send messages to Claude via stdin while receiving events on stdout — enabling real-time UIs, agent orchestration systems, and custom IDE integrations.

# Full bidirectional mode
claude -p --input-format stream-json --output-format stream-json --verbose

There are strict pairing rules for these flags:

Input/Output Flag Requirements

| Flag | Requires | Purpose |
| --- | --- | --- |
| --input-format stream-json | --output-format stream-json | Accept stream-json on stdin for follow-up messages |
| --replay-user-messages | Both --input-format and --output-format set to stream-json | Echo user messages back on stdout for acknowledgment |
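
A wrapper script can enforce these pairing rules before it spawns the CLI. The sketch below is illustrative only: `checkStreamFlags` is a hypothetical helper, not part of the CLI, and it assumes flags are passed as separate array entries (`--flag`, `value`) and not in `--flag=value` form.

```javascript
// Validate the flag pairing rules on an argument array before spawning.
// Assumption: each flag and its value are separate entries in `args`.
function checkStreamFlags(args) {
  // Value that follows a flag, or undefined if the flag is absent.
  const valueOf = (flag) => {
    const i = args.indexOf(flag);
    return i === -1 ? undefined : args[i + 1];
  };
  const inputStream = valueOf('--input-format') === 'stream-json';
  const outputStream = valueOf('--output-format') === 'stream-json';
  const errors = [];

  if (inputStream && !outputStream) {
    errors.push('--input-format stream-json requires --output-format stream-json');
  }
  if (args.includes('--replay-user-messages') && !(inputStream && outputStream)) {
    errors.push('--replay-user-messages requires stream-json for both input and output');
  }
  if (outputStream && !args.includes('--verbose')) {
    errors.push('--output-format stream-json should be paired with --verbose');
  }
  return errors;
}
```

Failing fast here gives a clearer message than a process that starts and then exits or stays silent because a flag was missing.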

You can also chain Claude instances by piping stream-json output from one into another:

# First Claude analyzes, second Claude summarizes
claude -p "Analyze auth.py" --output-format stream-json --verbose | \
claude -p "Summarize the analysis" --input-format stream-json --output-format stream-json --verbose

The second instance receives all events from the first — including tool results and the final response — as context for its own task.

Real Stream Payload

Here is the complete four-event stream from a real CLI call. This is exactly what --output-format stream-json --verbose produces for a simple prompt with no tool use.

Complete 4-Event Stream (artifacts/16/stream_basic.jsonl)

{
  "event_1_init": {
    "type": "system",
    "subtype": "init",
    "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b",
    "model": "claude-opus-4-6",
    "tools": ["Task", "Bash", "Glob", "Grep", "Read", "Edit", "Write"],
    "mcp_servers": [
      { "name": "chrome-devtools", "status": "connected" },
      { "name": "claude.ai Notion", "status": "connected" }
    ],
    "permissionMode": "default",
    "claude_code_version": "2.1.74"
  },
  "event_2_assistant": {
    "type": "assistant",
    "message": {
      "model": "claude-opus-4-6",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "test stream"
        }
      ],
      "usage": {
        "input_tokens": 3,
        "cache_creation_input_tokens": 1381,
        "cache_read_input_tokens": 14253,
        "output_tokens": 1
      }
    }
  },
  "event_3_rate_limit": {
    "type": "rate_limit_event",
    "rate_limit_info": {
      "status": "allowed",
      "resetsAt": 1773810000,
      "rateLimitType": "five_hour",
      "overageStatus": "rejected"
    }
  },
  "event_4_result": {
    "type": "result",
    "subtype": "success",
    "duration_ms": 3216,
    "num_turns": 1,
    "result": "test stream",
    "total_cost_usd": 0.0159,
    "session_id": "380bd0cd-2017-414d-b3c3-2101041c4d3b"
  }
}

event_1_init: Init event; always first, contains session metadata
event_2_assistant: Assistant event; the complete model response
event_3_rate_limit: Rate limit event; always fires, check the status field
event_4_result: Result event; always last, same as --output-format json

In a real stream, each of these four objects would be on its own line with no commas or brackets between them. The nested structure shown above groups them for readability — the actual NDJSON output is one JSON object per line.
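
Once the four events are parsed, a single pass is enough to reduce them to the values a dashboard usually needs. The following is a rough sketch that reads only fields visible in the payload above. The function name `summarizeStream` and the shape of the returned object are hypothetical.

```javascript
// Reduce a list of parsed stream events to a compact summary object.
function summarizeStream(events) {
  const summary = { model: null, text: '', rateLimit: null, costUsd: null, ok: false };

  for (const event of events) {
    switch (event.type) {
      case 'system':
        if (event.subtype === 'init') summary.model = event.model;
        break;
      case 'assistant':
        // Keep only text blocks; content[] may also hold tool_use blocks.
        for (const block of event.message?.content ?? []) {
          if (block.type === 'text') summary.text += block.text;
        }
        break;
      case 'rate_limit_event':
        summary.rateLimit = event.rate_limit_info?.status ?? null;
        break;
      case 'result':
        summary.ok = event.subtype === 'success';
        summary.costUsd = event.total_cost_usd ?? null;
        break;
    }
  }
  return summary;
}
```

Because the result event arrives last, `ok` and `costUsd` stay at their defaults if the stream is cut off early, which makes truncated runs easy to detect.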

Parsing Stream Events

Here is a Node.js generator that spawns a Claude process and yields parsed events as they arrive:

import { spawn } from 'node:child_process';
import { createInterface } from 'node:readline';

// Spawn the CLI and yield one parsed event per NDJSON line.
async function* streamClaude(prompt, { partial = false } = {}) {
  const args = ['-p', prompt, '--output-format', 'stream-json', '--verbose'];
  if (partial) args.push('--include-partial-messages');

  // No stdin input is sent in this mode, so leave it closed;
  // stderr passes through for diagnostics.
  const proc = spawn('claude', args, { stdio: ['ignore', 'pipe', 'inherit'] });
  const rl = createInterface({ input: proc.stdout, crlfDelay: Infinity });

  for await (const line of rl) {
    if (line.trim()) yield JSON.parse(line);
  }
}

// Dispatch on event type
for await (const event of streamClaude('Explain recursion')) {
  switch (event.type) {
    case 'system':
      if (event.subtype === 'init') {
        console.log(`Model: ${event.model}, Tools: ${event.tools.length}`);
      }
      break;
    case 'assistant': {
      // content[] can mix text and tool_use blocks, so keep only the text.
      const text = event.message.content
        .filter((block) => block.type === 'text')
        .map((block) => block.text)
        .join('');
      console.log(`Response: ${text}`);
      break;
    }
    case 'result':
      console.log(`Cost: $${event.total_cost_usd.toFixed(4)}`);
      break;
  }
}

For a typewriter effect, enable partial messages and filter for text deltas:

process.stdout.write('Claude says: ');
for await (const event of streamClaude('Write a haiku', { partial: true })) {
  if (event.type === 'stream_event') {
    const delta = event.event?.delta;
    if (delta?.type === 'text_delta') {
      process.stdout.write(delta.text);
    }
  }
}
console.log();

Retry Events

When the API returns a retryable error (rate limit or server error), a system/api_retry event appears before the retry attempt. Use this to show retry progress in your UI or implement custom backoff logic.

{
  "type": "system",
  "subtype": "api_retry",
  "attempt": 1,
  "max_retries": 5,
  "retry_delay_ms": 2000,
  "error_status": 529,
  "error": "server_error"
}
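
To surface retries in a UI, the event above can be turned into a one-line status message. This is a small sketch using only the fields in the example payload. The function name `describeRetry` is hypothetical.

```javascript
// Format a system/api_retry event as a human-readable status line.
// Returns null for any other event so it can be used as a filter.
function describeRetry(event) {
  if (event.type !== 'system' || event.subtype !== 'api_retry') return null;

  const seconds = (event.retry_delay_ms / 1000).toFixed(1);
  const status = event.error_status ? ` (HTTP ${event.error_status})` : '';
  return `Retry ${event.attempt}/${event.max_retries} in ${seconds}s: ${event.error}${status}`;
}
```

For the example event, this returns "Retry 1/5 in 2.0s: server_error (HTTP 529)".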
Gotcha

--verbose is required with --output-format stream-json. Without it, the init event with session metadata may not be emitted. Always use both flags together: --output-format stream-json --verbose.

Gotcha

Each line is independent JSON — this is NDJSON, not a JSON array. There are no commas between objects and no wrapping brackets. Parse line by line with JSON.parse(line). Calling JSON.parse(entireOutput) on the full stream will fail.
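
When reading raw stdout chunks (instead of a line reader), a chunk can end in the middle of a JSON object, so the parser has to buffer the trailing partial line. The sketch below shows one way to do that. `createNdjsonParser` is a hypothetical helper, and it assumes chunks arrive as already-decoded strings (for example after calling `setEncoding('utf8')` on the stream).

```javascript
// Incremental NDJSON parser: buffers partial lines across chunk boundaries.
// Assumption: `chunk` is a decoded string, not a raw Buffer.
function createNdjsonParser(onEvent) {
  let buffer = '';

  return {
    // Feed one chunk; emits every complete line it contains.
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // the last piece may be an incomplete line
      for (const line of lines) {
        if (line.trim()) onEvent(JSON.parse(line));
      }
    },
    // Call when the stream closes to flush a final line with no newline.
    end() {
      if (buffer.trim()) onEvent(JSON.parse(buffer));
      buffer = '';
    },
  };
}
```

Each complete line is parsed on its own, so one malformed line throws at that line and does not invalidate events that were already emitted.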

Tip

The result event is always last and contains the same data as --output-format json. You get both real-time streaming events and the final summary envelope in one stream, so there is no need to run a second call to get the result metadata.