Model Routing
Effort-band → model resolution for the dev-team plugin. An agent declares the
reasoning effort its task needs (effort: low|medium|high); the plugin
maps that band to a concrete model at dispatch. The same code works on a
personal Anthropic API key, a corporate proxy with a restricted model
allowlist, and Bedrock or Vertex deployments — with zero environment-specific
config in the repo.
For the design rationale see ADR 0008 — Use effort bands instead of model names in agent frontmatter, which amends ADR 0004 — Pre-dispatch model resolution enforced by a PreToolUse hook. For operator-facing ladder authoring, see model-routing-overrides.md.
Architecture at a glance
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#dbeafe', 'primaryTextColor': '#1e3a5f', 'primaryBorderColor': '#3b82f6', 'lineColor': '#64748b', 'secondaryColor': '#f1f5f9', 'tertiaryColor': '#e0f2fe', 'background': '#ffffff', 'mainBkg': '#dbeafe', 'nodeBorder': '#2563eb', 'clusterBkg': '#eff6ff', 'clusterBorder': '#bfdbfe', 'titleColor': '#1e3a5f', 'edgeLabelBackground': '#f8fafc'}}}%%
flowchart LR
subgraph caller[Caller layer]
AF[Agent frontmatter<br/>effort: band]
end
subgraph harness[Claude Code harness]
AT[Agent tool dispatch]
end
subgraph plugin[Plugin enforcement surface]
HK[hooks/agent-model-resolve.sh<br/>PreToolUse, matcher Agent]
RS[hooks/lib/model-resolve.sh<br/>resolver helper]
end
subgraph state[Routing state]
RJ[(knowledge/<br/>model-routing.json<br/>default map, shipped)]
LD[(.claude/<br/>model-ladder.json<br/>per-env, gitignored)]
SM[(.claude/<br/>session-model<br/>captured, gitignored)]
BL[(.claude/metrics/<br/>model-routing.log<br/>bump events, JSONL)]
end
subgraph diag[Diagnostics]
MRC["/model-routing-check"]
SB["hooks/session-model-banner.sh<br/>SessionStart"]
end
AF --> AT
AT -.intercepted by.-> HK
HK --> RS
RS --> RJ
RS --> LD
HK -. session fallback .-> SM
HK -- bump --> BL
HK -- updatedInput --> AT
MRC --> RS
MRC -. tail .-> BL
SB -- write --> SM
The hook is the only file the harness touches at dispatch time. Everything else is either input (routing.json, ladder), context (session-model), output (bump log), or read-only diagnostics.
Dispatch flow
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#dbeafe', 'primaryTextColor': '#1e3a5f', 'primaryBorderColor': '#3b82f6', 'lineColor': '#64748b', 'secondaryColor': '#f1f5f9', 'tertiaryColor': '#e0f2fe', 'background': '#ffffff', 'mainBkg': '#dbeafe', 'nodeBorder': '#2563eb', 'clusterBkg': '#eff6ff', 'clusterBorder': '#bfdbfe', 'titleColor': '#1e3a5f', 'edgeLabelBackground': '#f8fafc'}}}%%
sequenceDiagram
autonumber
participant LLM as Orchestrator LLM
participant H as PreToolUse hook
participant R as model-resolve.sh
participant FS as routing.json + ladder
participant Log as bump log
participant CC as Claude Code harness
LLM->>CC: Agent(subagent_type: x)
CC->>H: stdin: tool_input
H->>H: read effort band from agents/x.md
H->>R: model-resolve.sh <band>
R->>FS: read routing.json (+ ladder if present)
alt no ladder (default map)
R-->>H: stdout: default snapshot for the band
H-->>CC: updatedInput.model = snapshot (no bump logged)
else ladder maps the band to a non-default model
R-->>H: stdout: ladder model
H->>Log: append JSONL bump event
H-->>CC: updatedInput.model = ladder model
else explicit out-of-ladder snapshot
H->>Log: append session-fallback event
H-->>CC: updatedInput.model = session model
end
CC->>CC: dispatch with the resolved model
There is no deny branch. The hook always rewrites tool_input.model for
an effort-bearing agent (migrated agents carry no model: of their own) and
fails open (pass-through) on any error — a missing routing.json or an
unreadable agent file never blocks dispatch. Per-dispatch resolution is
silent; bumps are logged to disk for /model-routing-check.
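The fail-open contract can be pictured with a small Python sketch. The real hook is a shell script; the function name `rewrite_tool_input` and the two callbacks are assumptions made for illustration, not the plugin's API.

```python
from typing import Callable, Optional


def rewrite_tool_input(
    tool_input: dict,
    read_band: Callable[[str], Optional[str]],
    resolve: Callable[[str], str],
) -> dict:
    """Return tool_input with `model` set, or unchanged if anything fails."""
    try:
        band = read_band(tool_input["subagent_type"])
        if band is None:
            # Not an effort-bearing agent: nothing to rewrite.
            return tool_input
        return {**tool_input, "model": resolve(band)}
    except Exception:
        # Fail open: a missing routing file or an unreadable agent file
        # must never block dispatch.
        return tool_input


def broken_resolver(band: str) -> str:
    # Simulates the resolver failing because routing.json is absent.
    raise FileNotFoundError("knowledge/model-routing.json")


call = {"subagent_type": "security-review", "prompt": "audit the diff"}
ok = rewrite_tool_input(call, lambda name: "high", lambda band: "model-for-" + band)
failed = rewrite_tool_input(call, lambda name: "high", broken_resolver)
```

In this sketch `ok` carries a rewritten `model`, while `failed` is the original call passed through untouched, which mirrors the "no deny branch" rule.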
Contract
Each agent declares effort: low|medium|high in its YAML frontmatter. The
PreToolUse hook hooks/agent-model-resolve.sh, registered in settings.json
under matcher: "Agent", intercepts every sub-agent dispatch, strips any
<plugin>: prefix from subagent_type, reads the agent's effort band, and
resolves it to a concrete model before the harness sees the call.
Resolution inputs:
- knowledge/model-routing.json (shipped): the band → snapshot default map (low/medium/high) plus legacy haiku/sonnet/opus keys retained for the deprecation window, and the pinned ladder rounding convention. The default map equals the pre-migration tier mapping, so zero-config behavior is unchanged.
- .claude/model-ladder.json (per-environment, gitignored): an optional, capability-ascending JSON array of the models that environment has. When present and valid it overrides the default map.
- .claude/session-model (captured at session start, gitignored): the model the session began on. Used as the fallback when a requested explicit snapshot is unavailable, and as the reference for the SessionStart banner's upgrade flags. Never a ceiling.
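The first two steps of the contract, stripping the plugin prefix and reading the effort band, might look like the following Python sketch. It is not the hook's shell code: the function names are hypothetical, the frontmatter is parsed with a simple line scan rather than a YAML parser, and treating an unrecognized band as "no band" is an assumption.

```python
import re
from typing import Optional

ALLOWED_BANDS = ("low", "medium", "high")


def strip_plugin_prefix(subagent_type: str) -> str:
    """Drop a leading '<plugin>:' namespace, if there is one."""
    return subagent_type.split(":", 1)[-1]


def read_effort(agent_markdown: str) -> Optional[str]:
    """Return the effort band from the leading frontmatter, else None."""
    match = re.match(r"\A---\n(.*?)\n---", agent_markdown, re.DOTALL)
    if match is None:
        return None
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "effort":
            band = value.strip().strip("\"'")
            # Assumption: an unknown band is treated as absent.
            return band if band in ALLOWED_BANDS else None
    return None


agent_file = "---\nname: security-review\neffort: high\n---\nReview the diff.\n"
name = strip_plugin_prefix("dev-team:security-review")
band = read_effort(agent_file)
```

Here `name` is `security-review` and `band` is `high`; an agent file without frontmatter or without an `effort:` key yields `None`.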
Resolution precedence
- Valid ladder → index = round_half_up(weight·(N−1)) with weights low=0, medium=0.5, high=1, indexing into the ladder array.
- Shipped default map → routing.json[band] (used when there is no ladder, or the ladder is malformed/empty; a bad ladder never aborts dispatch).
- Session-model fallback → only for an explicit snapshot a present ladder does not contain (the requested model is unavailable in this environment).
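The precedence above can be sketched in Python as follows. This is an illustration under stated assumptions, not the resolver's actual shell code: the function names are hypothetical, "valid ladder" is taken to mean a non-empty list of non-empty strings, and the short model names stand in for real snapshot IDs.

```python
import math

WEIGHTS = {"low": 0.0, "medium": 0.5, "high": 1.0}


def ladder_index(band: str, n: int) -> int:
    """round_half_up(weight * (N - 1)), written as floor(x + 0.5)."""
    return math.floor(WEIGHTS[band] * (n - 1) + 0.5)


def is_valid_ladder(ladder: object) -> bool:
    # Assumption: valid means a non-empty list of non-empty strings.
    return (
        isinstance(ladder, list)
        and len(ladder) > 0
        and all(isinstance(m, str) and m for m in ladder)
    )


def resolve_band(band: str, default_map: dict, ladder: object = None) -> str:
    """Steps 1 and 2: ladder if valid, otherwise the shipped default map."""
    if is_valid_ladder(ladder):
        return ladder[ladder_index(band, len(ladder))]
    return default_map[band]


def resolve_snapshot(snapshot: str, ladder: object, session_model: str) -> str:
    """Step 3: a snapshot missing from a present ladder falls back."""
    if is_valid_ladder(ladder) and snapshot not in ladder:
        return session_model
    # Assumption: in every other case the snapshot is used as requested.
    return snapshot


defaults = {"low": "haiku", "medium": "sonnet", "high": "opus"}
with_ladder = resolve_band("medium", defaults, ["sonnet", "opus"])
bad_ladder = resolve_band("medium", defaults, [])
fallback = resolve_snapshot("ultra", ["sonnet", "opus"], "sonnet")
```

With a two-entry ladder, `medium` resolves to `opus`; with an empty ladder it degrades to the default `sonnet`; and the unavailable `ultra` snapshot falls back to the session model.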
Worked examples (round_half_up, so N=4 medium lands on index 2):
| Ladder | low | medium | high |
|---|---|---|---|
| [haiku, sonnet, opus] (N=3) | haiku | sonnet | opus |
| [sonnet, opus] (N=2) | sonnet | opus | opus |
| [sonnet] (N=1) | sonnet | sonnet | sonnet |
| [haiku, sonnet, opus, ultra] (N=4) | haiku | opus | ultra |
The N=3 case reproduces today's haiku/sonnet/opus mapping exactly.
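Pinning the rounding convention matters because a weight of 0.5 lands exactly halfway whenever N is even. As a hedged illustration in Python (the plugin itself is shell, and `round_half_up` here is a local helper, not the resolver's function): Python's built-in `round` rounds halves to the nearest even integer, so it would disagree with the table for some ladder sizes.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with exact halves going up."""
    return math.floor(x + 0.5)


def medium_index(n: int, rounder) -> int:
    """Ladder index for the medium band (weight 0.5) on an N-entry ladder."""
    return rounder(0.5 * (n - 1))


half_up = [medium_index(n, round_half_up) for n in range(1, 7)]
half_even = [medium_index(n, round) for n in range(1, 7)]
print(half_up)    # [0, 1, 1, 2, 2, 3]
print(half_even)  # [0, 0, 1, 2, 2, 2]
```

The two conventions diverge at N=2 (index 1 versus 0, that is, opus versus sonnet in the table above) and again at N=6.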
Exit-code taxonomy
The resolver helper hooks/lib/model-resolve.sh exits with one of the following codes:
| Code | Meaning |
|---|---|
| 0 | Resolved successfully |
| 2 | Unknown band/tier or missing argument (caller error) |
| 4 | knowledge/model-routing.json missing |
The legacy deny-relevant codes (3 exhausted/cycle, 5 malformed overrides) are no longer reachable: a band always resolves once routing.json is present, and a bad ladder degrades to the default map. The hook maps any non-zero resolver exit to pass-through (fail-open), never deny.
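A minimal Python sketch of how a caller could turn the resolver's exit code into a hook decision. The function name and the returned dictionary shape are hypothetical, and treating a zero exit with empty output as pass-through is an assumption rather than documented behavior.

```python
RESOLVER_EXIT_CODES = {
    0: "resolved successfully",
    2: "unknown band/tier or missing argument (caller error)",
    4: "knowledge/model-routing.json missing",
}


def hook_action(exit_code: int, stdout: str) -> dict:
    """Translate a resolver result into what the hook does next."""
    model = stdout.strip()
    if exit_code == 0 and model:
        return {"action": "rewrite", "model": model}
    if exit_code == 0:
        # Assumption: empty output is handled like a failure.
        reason = "empty resolver output"
    else:
        # Any non-zero exit, listed or not, passes the call through.
        reason = RESOLVER_EXIT_CODES.get(exit_code, "unexpected exit code")
    return {"action": "pass-through", "reason": reason}


resolved = hook_action(0, "claude-opus-4-8\n")
missing_routing = hook_action(4, "")
```

Note that the sketch has no "deny" outcome at all: every path ends in either a rewrite or a pass-through.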
Legacy tier acceptance (deprecation window)
An agent that still declares a legacy model: haiku|sonnet|opus (or passes one
as tool_input.model) resolves tier → band (haiku→low, sonnet→medium,
opus→high) for this release and is logged with reason=legacy-tier.
/agent-audit warns on the deprecated tier and names the band to use. This
release warns, never errors; the next major removes legacy acceptance.
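The tier → band translation is a fixed three-entry table. A small Python sketch, where `normalize_band` is an illustrative name and not the hook's own `_normalize_band`:

```python
from typing import Optional, Tuple

BANDS = ("low", "medium", "high")
LEGACY_TIER_TO_BAND = {"haiku": "low", "sonnet": "medium", "opus": "high"}


def normalize_band(value: str) -> Tuple[Optional[str], bool]:
    """Return (band, is_legacy); band is None for an unknown value."""
    if value in BANDS:
        return value, False
    if value in LEGACY_TIER_TO_BAND:
        # Legacy tiers still resolve in this release, flagged for logging.
        return LEGACY_TIER_TO_BAND[value], True
    return None, False


print(normalize_band("sonnet"))  # ('medium', True)
print(normalize_band("high"))    # ('high', False)
```

The boolean is what would drive the `reason=legacy-tier` log entry and the /agent-audit warning.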
The bump log
.claude/metrics/model-routing.log records one JSONL event for each dispatch
whose resolved model differs from the band's shipped default (a ladder override,
upgrade, or downgrade), for every legacy-tier dispatch, and for every
session-model fallback. A resolution equal to the default still rewrites the
model but logs nothing. Schema:
{"ts": "2026-06-21T12:00:00Z", "band": "high", "served": "claude-sonnet-4-6", "reason": "effort", "caller": "security-review", "session": "claude-opus-4-8"}
/model-routing-check tails this log (read-only). reason is one of
effort (ladder bump), legacy-tier, or session-fallback.
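The logging rule and the read-only tail can be sketched in Python. The field names follow the schema above; the function names, the `default-high-model` placeholder, and the order in which the three reasons are checked are assumptions made for illustration.

```python
import json
from typing import List, Optional


def bump_event(ts: str, band: str, served: str, default_model: str,
               caller: str, session: str, *,
               legacy: bool = False, fallback: bool = False) -> Optional[str]:
    """Return one JSONL line, or None when nothing should be logged."""
    if fallback:
        reason = "session-fallback"
    elif legacy:
        reason = "legacy-tier"
    elif served != default_model:
        reason = "effort"
    else:
        return None  # equal to the shipped default: rewrite, log nothing
    return json.dumps({"ts": ts, "band": band, "served": served,
                       "reason": reason, "caller": caller, "session": session})


def tail(log_text: str, count: int = 10) -> List[dict]:
    """Parse the last `count` non-empty JSONL lines."""
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    return [json.loads(ln) for ln in lines[-count:]] if count > 0 else []


bumped = bump_event("2026-06-21T12:00:00Z", "high", "claude-sonnet-4-6",
                    "default-high-model", "security-review", "claude-opus-4-8")
silent = bump_event("2026-06-21T12:00:01Z", "high", "default-high-model",
                    "default-high-model", "security-review", "claude-opus-4-8")
events = tail(bumped + "\n", count=10)
```

`bumped` is a JSON line with `reason` set to `effort`, `silent` is `None`, and `events` holds the single parsed record.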
Authoring a ladder (restricted endpoints)
When an environment (Bedrock, Vertex, a corporate proxy) offers only a subset
of models, hand-write .claude/model-ladder.json as a capability-ascending
JSON array of the model IDs that environment has. For a two-entry ladder of the
form [sonnet, opus], low → sonnet and medium/high → opus. Verify with
/model-routing-check — its "Effective band → model map" reflects the ladder,
and when no ladder exists it prints a ready-to-edit starter seeded from the
defaults. Delete the file to restore the shipped default map. There is no
"disable" flag — absence of the ladder is the disabled state. For the full
schema and more worked ladders, see
model-routing-overrides.md.
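The ladder lifecycle (absent, written, deleted) can be walked through with a Python sketch. It is not the /model-routing-check implementation: `effective_map` is a hypothetical helper, the validity test is an assumption, and the `*-id` strings are placeholders for whatever model IDs the environment really exposes.

```python
import json
import math
import tempfile
from pathlib import Path

WEIGHTS = {"low": 0.0, "medium": 0.5, "high": 1.0}


def effective_map(ladder_path: Path, default_map: dict) -> dict:
    """Band -> model map: the ladder if present and valid, else defaults."""
    try:
        ladder = json.loads(ladder_path.read_text())
    except (OSError, ValueError):
        return dict(default_map)  # missing or unparsable ladder
    valid = (isinstance(ladder, list) and len(ladder) > 0
             and all(isinstance(m, str) and m for m in ladder))
    if not valid:
        return dict(default_map)
    top = len(ladder) - 1
    return {band: ladder[math.floor(w * top + 0.5)]
            for band, w in WEIGHTS.items()}


# Placeholder IDs; substitute the IDs your endpoint actually serves.
defaults = {"low": "haiku-id", "medium": "sonnet-id", "high": "opus-id"}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model-ladder.json"
    before = effective_map(path, defaults)   # no ladder yet
    path.write_text(json.dumps(["sonnet-id", "opus-id"]))
    after = effective_map(path, defaults)    # ladder in effect
    path.unlink()                            # deleting restores defaults
    restored = effective_map(path, defaults)
```

In this run `after` maps low to `sonnet-id` and both medium and high to `opus-id`, while `before` and `restored` equal the defaults, matching the "absence is the disabled state" rule.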
Adding a new effort band
The three bands (low/medium/high) and their weights are the single source of
truth. If a fourth band is ever warranted (e.g. a common 4+ model ladder), add
it in lockstep:
- Extend ALLOWED_BANDS in tests/agents/agent_effort_frontmatter_tests.bats and the weight table in hooks/lib/model-resolve.sh (_band_weight) and hooks/agent-model-resolve.sh (_normalize_band).
- Add the band key to knowledge/model-routing.json.
- Update the band → model dump in _dump_map, this file, and the spec.
Environment variables
User-facing:
- ANTHROPIC_BASE_URL: standard Claude Code variable (Bedrock/Vertex/proxy detection is no longer needed for routing; the ladder is endpoint-agnostic).
- MODEL_BUMP_TAIL: how many bump events /model-routing-check prints (default 10).
Test-only injection seams — do not set these in normal use:
- MODEL_ROUTING_JSON: path to the shipped routing defaults.
- MODEL_LADDER_JSON: path to the per-environment ladder.
- SESSION_MODEL_FILE: path to the captured session model.
- MODEL_BUMP_LOG: path to the bump log.
- MODEL_AGENTS_DIR: path to the agents directory the hook reads.
These exist so bats tests can isolate filesystem state without touching the
real .claude/ directory. Setting them at runtime is not supported.
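For readers unfamiliar with injection seams, the pattern is "use the environment variable if set, otherwise the conventional path". A Python sketch of that lookup, under assumptions: the default paths are taken from the architecture diagram and are relative to an unspecified root, `agents` as the agents directory is inferred from the dispatch flow, and falling back to 10 on an invalid MODEL_BUMP_TAIL is a guess rather than documented behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

SEAM_DEFAULTS = {
    "MODEL_ROUTING_JSON": "knowledge/model-routing.json",
    "MODEL_LADDER_JSON": ".claude/model-ladder.json",
    "SESSION_MODEL_FILE": ".claude/session-model",
    "MODEL_BUMP_LOG": ".claude/metrics/model-routing.log",
    "MODEL_AGENTS_DIR": "agents",  # assumption, inferred from agents/x.md
}


def seam_path(name: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Path for a seam: the override if set and non-empty, else the default."""
    env = os.environ if env is None else env
    return Path(env.get(name) or SEAM_DEFAULTS[name])


def bump_tail(env: Optional[Mapping[str, str]] = None) -> int:
    """Number of bump events to print; 10 unless a positive integer is set."""
    env = os.environ if env is None else env
    raw = env.get("MODEL_BUMP_TAIL", "")
    return int(raw) if raw.isdigit() and int(raw) > 0 else 10


# A test would pass an isolated mapping instead of touching os.environ.
isolated = {"MODEL_LADDER_JSON": "/tmp/test-run/ladder.json"}
ladder_path = seam_path("MODEL_LADDER_JSON", isolated)
log_path = seam_path("MODEL_BUMP_LOG", isolated)
```

Passing the environment as an argument keeps the sketch testable without mutating process state, which is the same isolation goal the bats seams serve.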