CI/CD Integration

GitHub Actions, custom pipelines, retry logic, and CI flags

Claude Code runs headless in CI/CD pipelines for automated code review, test generation, documentation, and more. The official GitHub Action provides turnkey integration, while the CLI’s --print mode enables custom pipeline workflows with explicit cost caps and retry logic.

GitHub Action

The anthropics/claude-code-action@v1 action wraps the CLI with sensible CI defaults. It auto-detects whether it is running on a PR or issue event, injects the diff context, and posts Claude’s response as a comment.

GitHub Action — PR Review Workflow

The action handles checkout, context injection, and comment posting automatically. You supply a prompt, your API key, and an optional budget cap. For more control, you can install the CLI directly and run it yourself.

Custom CLI Pipeline in Actions

CI Flags

Every CI invocation needs a specific set of flags to run safely in a headless environment. Here is the breakdown.

Essential CI Flags

| Flag | Purpose | Why It Matters in CI |
| --- | --- | --- |
| -p / --print | Headless mode | Required: there is no interactive UI in CI |
| --output-format json | Machine-parseable output | Lets you extract cost, session ID, and result with jq |
| --max-budget-usd N | Cost cap per invocation | Prevents runaway spend; costs can vary 10x for the same prompt |
| --no-session-persistence | Don’t save session to disk | Keeps ephemeral CI runners clean, with no leftover session files |
| --permission-mode bypassPermissions | No interactive prompts | Without this, the CLI hangs waiting for TTY input and eventually times out |
| --strict-mcp-config | Fail if any MCP server can’t connect | Without it, a failed MCP connection is silently ignored, which is subtle and dangerous |
| --fallback-model MODEL | Use alternate model on overload | Only triggers on HTTP 429 rate limiting, not other failures |
| --from-pr URL | Resume session linked to a PR | Requires GITHUB_TOKEN in the environment |

The minimum viable CI invocation combines the first five of these flags:

Terminal window
claude -p "Review this code for bugs" \
  --output-format json \
  --max-budget-usd 0.50 \
  --no-session-persistence \
  --permission-mode bypassPermissions
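
For Node.js-based pipelines, the same flag set can be assembled programmatically. The sketch below is illustrative only: buildCiArgs and its option names (maxBudgetUsd, persistSession) are hypothetical, and it uses only flags that appear in the table above.

```javascript
// Hypothetical helper: builds the argv array for a headless CI run.
// Every flag here comes from the Essential CI Flags table; nothing else is assumed.
function buildCiArgs(prompt, { maxBudgetUsd = 0.5, persistSession = false } = {}) {
  if (!(maxBudgetUsd > 0)) {
    throw new RangeError('maxBudgetUsd must be a positive number');
  }
  const args = [
    '-p', prompt,
    '--output-format', 'json',
    '--max-budget-usd', String(maxBudgetUsd),
    '--permission-mode', 'bypassPermissions',
  ];
  // Ephemeral runners normally skip persistence; keep it only when a later
  // step needs to resume the session.
  if (!persistSession) args.push('--no-session-persistence');
  return args;
}
```

The returned array could then be passed to execFileSync('claude', args) from node:child_process. Building the arguments as an array rather than a single string avoids shell quoting problems when the prompt contains quotes or newlines.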

Custom Pipelines

Beyond GitHub Actions, you can embed Claude in any CI system — GitLab CI, Jenkins, CircleCI, or a plain bash script. The pattern is the same everywhere: capture JSON output, parse with jq, and act on the result.

Cost-Capped PR Review:

Terminal window
claude -p "Review the changes in this PR" \
  --from-pr "$PR_URL" \
  --output-format json \
  --max-budget-usd 0.50 \
  --no-session-persistence \
  --permission-mode bypassPermissions

Batch File Processing:

Terminal window
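# bash 4+: make ** match recursively, and skip the loop if nothing matches
shopt -s globstar nullglob
for file in src/**/*.ts; do
  claude -p "Add JSDoc comments to $file" \
    --output-format json \
    --max-budget-usd 0.10 \
    --no-session-persistence \
    --permission-mode bypassPermissions
done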

Each invocation is fully isolated — no session state leaks between files.

Plan-Review-Execute in CI:

A three-step workflow: generate a plan, post it for human review, then execute on approval.

Terminal window
# Step 1: Generate plan
PLAN_RESULT=$(claude -p "Fix all lint errors in src/" \
  --permission-mode plan \
  --output-format json \
  --max-budget-usd 0.50)
# Step 2: Post plan as PR comment
PLAN=$(echo "$PLAN_RESULT" | jq -r \
  '.permission_denials[] | select(.tool_name == "ExitPlanMode") | .tool_input.plan')
gh pr comment "$PR_NUMBER" --body "$PLAN"
# Step 3: On approval, execute
SESSION=$(echo "$PLAN_RESULT" | jq -r '.session_id')
claude -p "yes, proceed" \
  --resume "$SESSION" \
  --permission-mode bypassPermissions

In plan mode, Claude’s proposed changes appear as permission_denials with tool_name: "ExitPlanMode". The plan text lives in the tool_input.plan field.
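
In a Node.js pipeline, the same extraction could look roughly like the sketch below. The function name extractPlan is hypothetical; the field names (permission_denials, tool_name, tool_input.plan) are the ones described above, and the function returns null instead of throwing when no plan is present.

```javascript
// Returns the plan text from a plan-mode result payload, or null if none is found.
// Assumes the payload shape described in the text above.
function extractPlan(payload) {
  const denials = Array.isArray(payload?.permission_denials)
    ? payload.permission_denials
    : [];
  // The plan arrives as a denied ExitPlanMode tool call.
  const exit = denials.find((d) => d && d.tool_name === 'ExitPlanMode');
  const plan = exit?.tool_input?.plan;
  return typeof plan === 'string' && plan.trim() !== '' ? plan : null;
}
```

Returning null lets the calling step decide whether a missing plan should fail the job or simply skip the PR comment.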

Retry Patterns

The --fallback-model flag only handles 429 overload errors. For other transient failures — network errors, timeouts, API 500s — you need explicit retry logic.

Terminal window
retry_claude() {
  local max_retries=3 prompt="$1" result attempt; shift
  for attempt in $(seq 1 "$max_retries"); do
    # Succeed only if the CLI exits 0 and the JSON payload is not flagged as an error
    if result=$(claude -p "$prompt" "$@" 2>/dev/null) &&
       echo "$result" | jq -e '.is_error != true' >/dev/null 2>&1; then
      echo "$result"
      return 0
    fi
    # Back off before the next attempt; no point sleeping after the last one
    if [ "$attempt" -lt "$max_retries" ]; then
      echo "Attempt $attempt/$max_retries failed, retrying in $((attempt * 5))s..." >&2
      sleep "$((attempt * 5))"
    fi
  done
  echo "All $max_retries attempts failed" >&2
  return 1
}

The function validates two things on each attempt: that the CLI process exited successfully, and that the JSON response does not have is_error: true. Backoff increases linearly: 5 seconds after the first failure and 10 seconds after the second, with no wait after the final attempt.

For Node.js pipelines, the same pattern translates directly:

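const { execFileSync } = require('node:child_process');

// Despite the "Sync" in its name, this function is async: each CLI call blocks,
// but the backoff between attempts is awaited.
async function retryClaudeSync(args, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const output = execFileSync('claude', args, {
        encoding: 'utf-8', timeout: 120_000,
        env: { ...process.env, CLAUDECODE: '' }
      });
      const data = JSON.parse(output);
      if (!data.is_error) return data;
    } catch (err) { /* non-zero exit, timeout, or invalid JSON: retry */ }
    // Linear backoff (5s, 10s), skipped after the final attempt
    if (attempt < maxRetries) {
      await new Promise(r => setTimeout(r, attempt * 5000));
    }
  }
  throw new Error(`All ${maxRetries} attempts failed`);
}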

Fallback Models

The --fallback-model flag provides a backup when the primary model is rate-limited:

Terminal window
claude -p "Review this PR" \
  --fallback-model claude-haiku-4-5-20251001 \
  --max-budget-usd 0.50

There is a critical limitation here. Fallback only triggers on HTTP 429 (rate limiting). It does not trigger on other failures:

Fallback Trigger Matrix

| Failure Type | Triggers Fallback? | Your Mitigation |
| --- | --- | --- |
| Rate limiting (429) | Yes | --fallback-model handles this automatically |
| API error (500) | No | Use retry logic with backoff |
| Timeout | No | Use retry logic with backoff |
| Auth failure (401/403) | No | Fix credentials; retries will not help |

This means --fallback-model is not a general resilience mechanism. Combine it with the retry function above for full coverage.
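
One way to encode the matrix in a Node.js pipeline is a small classifier that decides what to do with a failure. This is only a sketch: classifyFailure and its input shape ({ httpStatus, timedOut }) are hypothetical, and how your pipeline learns the HTTP status of a failed call is left open. Treating other 4xx codes as non-retryable is an assumption, not something the matrix states.

```javascript
// Maps a failure to one of three actions, following the trigger matrix:
//   'fallback' - rate limiting, handled by --fallback-model
//   'retry'    - transient failures (network error, timeout, 5xx)
//   'abort'    - failures that retries cannot fix (for example bad credentials)
function classifyFailure({ httpStatus = null, timedOut = false } = {}) {
  if (httpStatus === 401 || httpStatus === 403) return 'abort';
  if (httpStatus === 429) return 'fallback';
  // No status at all is treated as a network-level error.
  if (timedOut || httpStatus === null || httpStatus >= 500) return 'retry';
  return 'abort'; // assumption: other 4xx responses will not succeed on retry
}
```

A retry loop can consult this before sleeping, so that an auth failure fails the job immediately instead of burning three attempts.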

CI Response Payload

Here is a real response from a CI run with --no-session-persistence, --max-budget-usd 0.50, and --output-format json.

CI Run: Budget Cap + No Persistence (artifacts/14/ci_flags_test.json)
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 1513,
  "duration_api_ms": 1497,
  "num_turns": 1,
  "result": "4",
  "stop_reason": "end_turn",
  "session_id": "94d02fdc-9f01-4b5c-8bc9-8f08e684d8a3",
  "total_cost_usd": 0.016079,
  "usage": {
    "input_tokens": 3,
    "cache_creation_input_tokens": 1410,
    "cache_read_input_tokens": 14253,
    "output_tokens": 5,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    },
    "service_tier": "standard"
  },
  "modelUsage": {
    "claude-opus-4-6": {
      "inputTokens": 3,
      "outputTokens": 5,
      "cacheReadInputTokens": 14253,
      "cacheCreationInputTokens": 1410,
      "costUSD": 0.016079,
      "contextWindow": 200000,
      "maxOutputTokens": 32000
    }
  },
  "permission_denials": [],
  "fast_mode_state": "off"
}

Notes on the annotated fields:

  • subtype / is_error: always check these first in CI; is_error: false means the call succeeded
  • result: the actual output text; extract with jq -r '.result'
  • stop_reason: end_turn means Claude finished naturally; watch for max_tokens if the output was truncated
  • session_id: present even with --no-session-persistence, but not resumable after exit
  • total_cost_usd: aggregate this across pipeline steps for total CI run cost

Key fields to parse in your CI scripts:

  • is_error / subtype: check for "error_max_budget_usd" to detect budget exceeded
  • total_cost_usd: aggregate across pipeline steps for total CI run cost
  • result: the actual output text
  • session_id: present even with --no-session-persistence (valid during process lifetime only)
  • stop_reason: "end_turn" means normal completion; other values indicate interruption
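
For Node.js pipelines, these checks can be gathered into one function that turns the raw payload into a small summary object. This is a sketch under the assumption that the payload has the shape shown in the sample response; summarizeResult and the names of the returned properties are hypothetical.

```javascript
// Condenses a --output-format json payload into the fields a CI step acts on.
function summarizeResult(payload) {
  const budgetExceeded = payload.subtype === 'error_max_budget_usd';
  return {
    ok: payload.is_error !== true && !budgetExceeded,
    budgetExceeded,
    // Anything other than "end_turn" indicates the run was interrupted.
    completed: payload.stop_reason === 'end_turn',
    text: typeof payload.result === 'string' ? payload.result : '',
    costUsd: Number(payload.total_cost_usd) || 0,
    sessionId: payload.session_id ?? null,
  };
}
```

With the sample response above, ok and completed are both true, text is "4", and costUsd is 0.016079.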

Parsing the Result in Bash

#!/usr/bin/env bash
set -euo pipefail
RESULT=$(claude -p "Review this diff for bugs" \
  --output-format json \
  --max-budget-usd 0.50 \
  --no-session-persistence \
  --permission-mode bypassPermissions)
# Check for budget exceeded
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype // ""')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then
  echo "::error::Claude exceeded budget cap"
  exit 1
fi
# Extract result and cost
REVIEW=$(echo "$RESULT" | jq -r '.result')
COST=$(echo "$RESULT" | jq -r '.total_cost_usd')
echo "Review cost: $COST"
echo "$REVIEW"

The ::error:: prefix is GitHub Actions syntax for surfacing errors in the workflow summary. Replace it with your CI system’s equivalent.
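
When a pipeline makes several Claude calls, the per-call total_cost_usd values can be summed to report the cost of the whole run. The following Node.js sketch assumes you have collected the parsed payloads in an array; totalCostUsd is a hypothetical name, and rounding to six decimal places is a choice made here to match the precision of the sample response.

```javascript
// Sums total_cost_usd across parsed result payloads.
// Payloads without a numeric cost (for example a step that produced no JSON) are skipped.
function totalCostUsd(payloads) {
  const sum = payloads.reduce((acc, p) => {
    const cost = Number(p?.total_cost_usd);
    return Number.isFinite(cost) ? acc + cost : acc;
  }, 0);
  // Round to micro-dollars to hide floating-point noise from repeated addition.
  return Math.round(sum * 1e6) / 1e6;
}
```

The total can be printed at the end of the job or compared against a run-level budget that is separate from the per-invocation --max-budget-usd cap.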

Gotcha

--max-budget-usd is checked between turns, not mid-generation. The first turn can exceed the budget by any amount. For Opus, system prompt cache creation alone costs ~$0.016, so setting the budget below that guarantees an error_max_budget_usd response with no useful output.

Gotcha

If ANTHROPIC_API_KEY is missing or invalid, the CLI exits with a non-zero code but may not produce JSON output. Your pipeline should check the exit code before attempting to parse JSON, or the jq step will fail silently and the CI job may report success when it actually did nothing.
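
In a Node.js pipeline, that ordering (exit code first, then JSON, then is_error) can be made explicit. The sketch below is a hypothetical helper, not part of the CLI: parseCliOutput takes the exit code and captured stdout of a finished claude process and never throws, so a missing or invalid key surfaces as a clear failure reason instead of a parse error.

```javascript
// Validates a finished CLI run in the order described above.
// Returns { ok: true, payload } or { ok: false, reason }.
function parseCliOutput(exitCode, stdout) {
  if (exitCode !== 0) {
    return { ok: false, reason: `claude exited with code ${exitCode}` };
  }
  let payload;
  try {
    payload = JSON.parse(stdout);
  } catch {
    return { ok: false, reason: 'stdout was not valid JSON' };
  }
  if (payload === null || typeof payload !== 'object') {
    return { ok: false, reason: 'stdout was not a JSON object' };
  }
  if (payload.is_error === true) {
    return { ok: false, reason: `CLI reported an error (subtype: ${payload.subtype})`, payload };
  }
  return { ok: true, payload };
}
```

A step that receives ok: false should fail the job explicitly, so that a run which did nothing is never reported as a success.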

Tip

SessionStart and SessionEnd hooks do not fire in --print mode. Only PreToolUse, PostToolUse, and Stop hooks fire in headless mode. If your audit logging depends on SessionStart, it will silently skip in CI.