You saw the JSON envelope in Your First CLI Call — the wrapper that turns a plain text response into structured, parseable data. This chapter dissects every field in that envelope so you know exactly what to extract, what to log, and what to watch out for when building automation on top of --output-format json.
Top-Level Fields
Every --output-format json response returns the same set of top-level keys. Some are essential for every script; others only matter in specific scenarios like budget enforcement or permission auditing.
Complete Field Reference
| Field | Type | What It Tells You |
|---|---|---|
| type | string | Always "result" for a completed response. In stream-json mode, other event types appear first, but the final line always has this type. |
| subtype | string | "success" on normal completion. "error_max_budget_usd" when the call hit a spending cap set via --max-budget-usd. |
| is_error | boolean | false on success. Check this first in every response handler before reading result. |
| duration_ms | number | Wall-clock time for the entire call, including tool execution. Use this for end-to-end latency tracking. |
| duration_api_ms | number | Time spent in the API call only, excluding local tool execution. The delta between duration_ms and duration_api_ms tells you how long tools took. |
| num_turns | number | Number of conversation turns in this call. A single prompt-response is 1. Multi-turn tool use loops increase this count. |
| result | string | The text response, identical to what --output-format text returns. Always a string, even when the response involved multiple tool calls. |
| stop_reason | string | "end_turn" means Claude finished naturally. "max_tokens" means the output was truncated at the token limit. |
| session_id | string | UUID for this conversation. Pass it to --resume to continue the conversation in a later call. |
| total_cost_usd | number | Cost in USD for this call. In a resumed session, this reflects the cumulative cost across all turns in the session. |
| usage | object | Flat token breakdown: input_tokens, output_tokens, cache fields, server_tool_use, and service_tier. |
| modelUsage | object | Per-model token breakdown, keyed by model ID. Essential when using --fallback-model to see which model actually answered. |
| permission_denials | array | Tools that the permission system blocked from executing. An empty array means nothing was denied. Critical for debugging plan mode workflows. |
| fast_mode_state | string | "off" or "on". Indicates whether Claude Code's fast mode (using a smaller model for simple tasks) was active. |
| uuid | string | Unique identifier for this specific response. Different from session_id: a session can contain many responses, each with its own uuid. |
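Putting the essentials together, here is a minimal Node.js sketch of a response handler. The envelope contents are invented for illustration; only the field names come from the reference above.

```javascript
// Illustrative envelope; field values are invented for this sketch
const envelope = JSON.parse(`{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 2400,
  "duration_api_ms": 1800,
  "num_turns": 1,
  "result": "Hello from Claude",
  "stop_reason": "end_turn",
  "session_id": "123e4567-e89b-12d3-a456-426614174000",
  "total_cost_usd": 0.008
}`);

// Always gate on is_error before touching result
if (envelope.is_error) {
  throw new Error(`Call failed: ${envelope.result}`);
}

// Local tool time = total wall clock minus API time
const toolTimeMs = envelope.duration_ms - envelope.duration_api_ms;
console.log(envelope.result); // "Hello from Claude"
console.log(toolTimeMs);      // 600
```

The same pattern works whether the JSON comes from a subprocess pipe or a saved log file, since the final line of output is always one self-contained envelope.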
The result Field
The result field is always a string containing Claude’s final text response. This holds true regardless of what happened during the call — even if Claude invoked multiple tools, read files, wrote code, and ran bash commands across several turns, the result is the final text output collapsed into a single string.
When stop_reason is "end_turn", the response is complete. When it is "max_tokens", Claude hit the output token limit and the response was truncated. You can detect truncation by checking this field and decide whether to resume the session for a continuation:
```bash
# Check for truncation and resume if needed
RESPONSE=$(claude -p "Write a long essay" --output-format json)
STOP=$(echo "$RESPONSE" | jq -r '.stop_reason')
SESSION=$(echo "$RESPONSE" | jq -r '.session_id')

if [ "$STOP" = "max_tokens" ]; then
  claude -p "Continue" --resume "$SESSION" --output-format json
fi
```

The result field is always a string, even when Claude used multiple tools during the response. You will not find structured tool output here, only the final text that Claude composed after all tool calls completed. To see individual tool invocations and their results, use stream-json mode instead.
Usage and Cost Fields
The envelope contains two levels of token accounting: usage for the flat aggregate and modelUsage for per-model detail.
The usage Object
"usage": { "input_tokens": 3, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 15635, "output_tokens": 6, "server_tool_use": { "web_search_requests": 0, "web_fetch_requests": 0 }, "service_tier": "standard", "cache_creation": { "ephemeral_1h_input_tokens": 0, "ephemeral_5m_input_tokens": 0 }}The input_tokens count reflects only the non-cached tokens sent to the API. The cache_read_input_tokens count shows how many tokens were served from the prompt cache — these are significantly cheaper. On a first call you will see cache_creation_input_tokens spike as the system prompt gets cached; on subsequent calls, cache_read_input_tokens takes over.
The server_tool_use sub-object tracks built-in server-side tools like web search and web fetch. The service_tier field confirms which API tier handled the request — typically "standard".
The modelUsage Object
"modelUsage": { "claude-opus-4-6": { "inputTokens": 3, "outputTokens": 6, "cacheReadInputTokens": 15635, "cacheCreationInputTokens": 0, "costUSD": 0.0079825, "contextWindow": 200000, "maxOutputTokens": 32000 }}This object is keyed by model ID, making it essential when you use --fallback-model. If Claude falls back to a different model, you will see two entries here — one per model — each with its own token counts and cost. The contextWindow and maxOutputTokens fields tell you the model’s limits, useful for validating that your prompts fit within the window.
Note the naming convention difference: usage uses snake_case (input_tokens) while modelUsage uses camelCase (inputTokens). Keep this in mind when writing parsers.
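The difference is easy to trip over in a parser. A small sketch, using invented token counts, shows the same quantity living under two key styles:

```javascript
// Invented sample data mirroring the two naming conventions
const usage = { input_tokens: 3, output_tokens: 6, cache_read_input_tokens: 15635 };
const modelUsage = {
  "claude-opus-4-6": {
    inputTokens: 3, outputTokens: 6, cacheReadInputTokens: 15635
  }
};

// snake_case in the flat object, camelCase in the per-model one
const flatInput = usage.input_tokens;
const perModelInput = Object.values(modelUsage)[0].inputTokens;
console.log(flatInput === perModelInput); // true
```

If you normalize envelope data into your own metrics schema, pick one convention at the boundary so the mismatch never leaks further into your code.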
The total_cost_usd field is cumulative across the entire session when using --resume. If you call Claude, then resume the session for a follow-up, the second response's total_cost_usd includes the cost of the first call too. To get the cost of just the latest turn, subtract the previous response's total_cost_usd from the current one, or use the per-model costUSD in modelUsage.
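The subtraction is a one-liner. A sketch with invented costs:

```javascript
// total_cost_usd is cumulative across a resumed session,
// so the latest turn's cost is the delta between envelopes
const firstCall   = { total_cost_usd: 0.0080 }; // invented values
const resumedCall = { total_cost_usd: 0.0125 }; // cumulative: includes the first call

const latestTurnCost = resumedCall.total_cost_usd - firstCall.total_cost_usd;
console.log(latestTurnCost.toFixed(4)); // "0.0045"
```

Note the floating-point rounding in the output formatting: if you feed these numbers into billing or alerting logic, compare with a small tolerance rather than strict equality.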
Error Handling
Errors surface through two fields working together: is_error and subtype. A successful response always has is_error: false and subtype: "success". When something goes wrong, the pattern changes.
The most common error subtype is "error_max_budget_usd", which fires when a call exceeds the budget cap you set with --max-budget-usd. In this case, the result field contains the error message instead of a normal response:
```bash
# Set a tight budget to see the error envelope
$ claude -p "Write a novel" --max-budget-usd 0.001 --output-format json
{
  "type": "result",
  "subtype": "error_max_budget_usd",
  "is_error": true,
  "result": "Budget exceeded: the request cost more than the allowed $0.001",
  "total_cost_usd": 0.001,
  "session_id": "..."
}
```

The defensive pattern is straightforward: always check is_error before accessing result:
```javascript
const data = JSON.parse(output);
if (data.is_error) {
  console.error(`Error (${data.subtype}): ${data.result}`);
  process.exit(1);
}
// Safe to use data.result here
```

The permission_denials array complements error handling for tool-use scenarios. When Claude attempts to use a tool that the permission system blocks, the tool name appears in this array. The call itself may still succeed (with is_error: false), but Claude's response will reflect that it could not perform the denied action. In plan mode workflows, checking this array is critical: it tells you what Claude wanted to do but was not allowed to.
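A hedged sketch of that check, using an invented envelope. The exact shape of each denial entry is an assumption here; inspect a real response before relying on specific keys.

```javascript
// Invented envelope: the call succeeded, but a tool was blocked
const data = {
  is_error: false,
  result: "I was unable to run the script because Bash was not permitted.",
  permission_denials: [{ tool_name: "Bash" }] // entry shape is an assumption
};

// A successful call can still carry denials worth surfacing
if (Array.isArray(data.permission_denials) && data.permission_denials.length > 0) {
  console.warn(`Permission denials: ${JSON.stringify(data.permission_denials)}`);
}
```

Logging the raw array, rather than picking out fields, keeps the check robust even if the entry format changes between versions.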
Full Annotated Envelope
Extract any field on the command line with jq. For cost tracking across a batch of calls: `claude -p "…" --output-format json | jq '.total_cost_usd'`. For session chaining: `jq -r '.session_id'` and pass it to --resume on the next call.
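When batch tracking outgrows one-liners, the same aggregation is easy in Node. A minimal sketch that sums total_cost_usd across a set of saved envelopes (the sample data is invented):

```javascript
// Each element stands in for one parsed --output-format json envelope
const envelopes = [
  { session_id: "a", total_cost_usd: 0.008 },
  { session_id: "b", total_cost_usd: 0.012 },
  { session_id: "c", total_cost_usd: 0.005 }
];

// Sum the per-call costs across the batch
const batchCost = envelopes.reduce((sum, e) => sum + e.total_cost_usd, 0);
console.log(batchCost.toFixed(3)); // "0.025"
```

Remember the caveat from the cost section: for resumed sessions, total_cost_usd is cumulative, so sum only one envelope per session or you will double-count.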