Structured Output

Get typed JSON responses with schemas

The --json-schema flag forces Claude to return data matching an exact JSON schema. This is not prompt engineering — it uses constrained decoding at the token level, which means the output is guaranteed to conform to your schema. It is the single most underestimated feature of the CLI for data extraction, classification, and API integration.

Basic Schema

Pass a JSON Schema string to --json-schema alongside --output-format json. Claude will return a response envelope with a structured_output field containing the schema-conforming data.

Entity Extraction with --json-schema
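As a hedged sketch, a basic invocation could look like this. The prompt text is illustrative, and the schema mirrors the fields in the basic_schema.json response; the guard around claude is only so the snippet degrades gracefully where the CLI is absent:

```shell
# Sketch of a basic --json-schema invocation (prompt text is illustrative).
SCHEMA='{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age":  { "type": "integer" },
    "city": { "type": "string" }
  },
  "required": ["name", "age", "city"]
}'

# Requires an installed, authenticated claude CLI:
if command -v claude >/dev/null 2>&1; then
  claude -p "Extract the person's details: John Smith, 32, lives in New York" \
    --output-format json \
    --json-schema "$SCHEMA"
fi
```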

Schema in, structured data out. No regex parsing, no “please respond in JSON” prompting. The structured_output field contains the extracted data with correct types — age is the integer 32, not the string "32".

basic_schema (artifacts/08/basic_schema.json)
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 9144,
  "duration_api_ms": 9101,
  "num_turns": 2,
  "result": "",A
  "stop_reason": "end_turn",
  "session_id": "27c34d79-135c-44c6-b502-0edcd3c3246c",
  "total_cost_usd": 0.07274925,
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 9369,
    "cache_read_input_tokens": 22296,
    "output_tokens": 121
  },
  "structured_output": {B
    "name": "John Smith",
    "age": 32,C
    "city": "New York"
  }
}
A result is empty -- Claude put everything in structured_output
B The extracted data, guaranteed to match your schema
C Integer type preserved -- not a string

Complex Schemas

Schemas can contain nested objects, arrays, enums, and description hints. The constrained decoding guarantee holds regardless of schema complexity.

Arrays of Objects

Nested Array Schema
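A hedged reconstruction of a command that could produce the array_schema.json output shown later in this chapter. The prompt wording is illustrative; only the flags come from this guide:

```shell
# Sketch: an array-of-objects schema. Field names are chosen to match
# the array_schema.json response elsewhere in this chapter.
SCHEMA='{
  "type": "object",
  "properties": {
    "languages": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name":     { "type": "string" },
          "paradigm": { "type": "string" },
          "year":     { "type": "integer" }
        },
        "required": ["name", "paradigm", "year"]
      }
    }
  },
  "required": ["languages"]
}'

# Requires an installed, authenticated claude CLI:
if command -v claude >/dev/null 2>&1; then
  claude -p "List the top 3 programming languages by popularity" \
    --output-format json \
    --json-schema "$SCHEMA"
fi
```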

Enum Constraints for Classification

Use enum to restrict a field to a fixed set of values. Combined with constrained decoding, this turns Claude into a deterministic classifier:

{
  "type": "object",
  "properties": {
    "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
    "confidence": { "type": "number" }
  },
  "required": ["sentiment", "confidence"]
}

Claude can only produce one of the three enum values for sentiment. There is no possibility of a hallucinated fourth option.

Description Hints

Add description fields to guide extraction without changing the schema structure. This is especially useful when field names alone are ambiguous:

{
  "type": "object",
  "properties": {
    "action_items": {
      "type": "array",
      "description": "List of concrete next steps mentioned in the meeting notes",
      "items": {
        "type": "object",
        "properties": {
          "task": { "type": "string", "description": "The specific action to take" },
          "owner": { "type": "string", "description": "Person responsible" },
          "deadline": { "type": "string", "description": "ISO 8601 date if mentioned, otherwise empty string" }
        },
        "required": ["task", "owner", "deadline"]
      }
    }
  },
  "required": ["action_items"]
}

Schema Design Tips

  • Use required liberally. Any field not listed in required may be omitted. Put everything in required unless you genuinely want optional fields.
  • Use enum for classification. Sentiment, severity, category — anything with a finite set of values belongs in an enum.
  • Use description to guide extraction. The model reads these hints during generation. A well-written description is often more effective than a longer prompt.
  • Keep schemas as flat as possible. Deep nesting costs more tokens to compile and generate. If your schema has more than three levels of nesting, consider restructuring.

result vs structured_output

When using --json-schema, the JSON envelope gains a new top-level field. Understanding which field to read is critical.

Which Field to Parse

Field              Contains                                  When to Use
result             Natural language response (may be empty)  Display to users as a human-readable summary
structured_output  Schema-conforming JSON object             Always parse this for data — it is the guaranteed output

These two fields are independent. Claude may provide a text explanation in result AND structured data in structured_output, or result may be empty. The behavior is inconsistent between calls — sometimes both are populated, sometimes only structured_output. Your parser should always read structured_output for the data and treat result as an optional bonus.
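That rule can be exercised offline with jq (assumed installed); the envelope below is a trimmed stand-in, not a real CLI response:

```shell
# A trimmed stand-in envelope (real responses carry more fields).
ENVELOPE='{
  "type": "result",
  "result": "",
  "structured_output": { "name": "John Smith", "age": 32, "city": "New York" }
}'

# Always read structured_output for data; treat result as an optional summary.
echo "$ENVELOPE" | jq -r '.structured_output.age'   # -> 32
echo "$ENVELOPE" | jq -r 'if .result == "" then "(no summary)" else .result end'   # -> (no summary)
```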

In the basic entity extraction example above, result was an empty string. But in the array example, Claude provided both a human-readable summary in result and the structured array in structured_output:

array_schema (artifacts/08/array_schema.json)
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 10344,
  "num_turns": 2,
  "result": "Here are the top 3 programming languages by popularity:\n\n1. **Python** (1991) - Multi-paradigm language dominant in AI/ML, data science, web development, and scripting.\n2. **JavaScript** (1995) - Multi-paradigm language that powers the web, running in browsers and on servers via Node.js.\n3. **Java** (1995) - Object-oriented language widely used in enterprise applications, Android development, and large-scale systems.",A
  "session_id": "7ca8e29b-7cd1-4832-a02a-8a2888808f0e",
  "total_cost_usd": 0.076849,
  "structured_output": {B
    "languages": [
      {
        "name": "Python",
        "paradigm": "multi-paradigm",
        "year": 1991
      },
      {
        "name": "JavaScript",
        "paradigm": "multi-paradigm",
        "year": 1995
      },
      {
        "name": "Java",
        "paradigm": "object-oriented",
        "year": 1995
      }
    ]
  }
}
A result is populated this time -- Claude chose to provide both
B structured_output is still the canonical data source

Constrained Decoding

Unlike prompt-based approaches (“respond in JSON format”), --json-schema uses grammar-based constrained decoding. At each token generation step, the model can only produce tokens that keep the output valid against the schema. This is not post-processing or validation — it happens during generation itself.

The practical implications:

  • 100% compliance — the output always matches the schema, every time, without exception
  • No hallucinated fields — only declared properties appear in the output
  • Correct types — integers are integers, strings are strings, arrays are arrays
  • Required fields guaranteed — every field listed in required will be present
  • Enum enforcement — values are restricted to exactly the options you specify

This is what separates --json-schema from asking Claude to “please return JSON.” With prompt engineering, you get ~95% compliance and need to handle parse failures. With constrained decoding, you get 100% schema compliance and can skip the parse-failure handling entirely.

The tradeoff is that complex schemas cost more tokens. The schema itself gets compiled into a grammar, and a schema with 20 nested objects takes longer to compile and generates more output tokens than a flat 3-field schema. For most use cases the difference is negligible, but it is worth knowing if you are processing thousands of documents with a deeply nested schema.

Gotcha

Using --output-format text with --json-schema produces empty output. The structured_output field only exists inside the JSON envelope. If you use text format, there is no envelope to contain it, and result may be empty. Always pair --json-schema with --output-format json.

Gotcha

Escaped quotes in command-line schemas are finicky. Shells handle nested quoting differently, and a misplaced backslash will silently produce an invalid schema. For anything beyond simple flat objects, store the schema in a file and pass it with --json-schema "$(cat schema.json)", or build the command in a script where you can use JSON.stringify() to handle escaping.
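One way to apply that advice, as a sketch: the file name schema.json and the prompt are arbitrary, and jq is assumed available for the sanity check. A quoted heredoc needs no shell escaping:

```shell
# Store the schema in a file instead of escaping it inline.
cat > schema.json <<'EOF'
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age":  { "type": "integer" }
  },
  "required": ["name", "age"]
}
EOF

# Sanity-check the file before spending tokens on it:
jq empty schema.json && echo "schema.json is valid JSON"

# Then pass it through command substitution (requires the claude CLI):
if command -v claude >/dev/null 2>&1; then
  claude -p "Extract name and age from: Ada Lovelace, 36" \
    --output-format json \
    --json-schema "$(cat schema.json)"
fi
```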

Tip

The num_turns field shows 2 even for simple extractions. Claude uses an internal turn for schema setup. This is normal and does not indicate an error or wasted work.

Use Cases

Structured output unlocks a category of workflows that are impossible with plain text responses:

  1. Data extraction — pull structured fields from unstructured text (emails, PDFs, logs)
  2. Classification — sentiment, category, priority level with enum constraints
  3. Code review — structured review output with severity, file, line number, and suggestion
  4. Form filling — extract form fields from natural language input
  5. API response generation — guarantee your Claude-powered API returns valid, typed responses
  6. Batch processing — extract the same fields from hundreds of documents with consistent output

For a complete parsing example in Node.js, including error handling and schema construction, see the source guide.