Subagent model tiering — Glossary

Subagent model tiering is the discipline of matching the model to the work. A grep across a 200-file solution does not need Opus; a planning pass that decides which migration path to commit to does. The CLAUDE_CODE_SUBAGENT_MODEL environment variable and opusplan mode are the official knobs; this entry catalogues how Code Majesty configures them.

The configuration is one environment variable plus per-agent overrides:

# default all subagent traffic to Sonnet unless an agent specifies otherwise
export CLAUDE_CODE_SUBAGENT_MODEL=claude-sonnet-5

combined with model: frontmatter in .claude/agents/*.md for individual agents, and opusplan mode so Opus runs the planning pass while implementation continues on Sonnet. What we send where:

Tier	Work	Why
Haiku	Searches, file inventory, symbol lookups, log scans	Volume is high, reasoning need is near zero
Sonnet	Implementation, test writing, refactors inside a verified plan	The plan already carries the hard decisions
Opus	Planning passes, migration-path decisions, failure post-mortems	Errors here are the expensive ones

The observable effect: per-PR token cost becomes a function of plan size rather than session length, because the long mechanical middle of a session runs on the cheap tiers.

When to use

Any agent setup where token spend is non-trivial and the team wants predictable per-PR cost.
Sessions long enough that the marginal reasoning gain of Opus on every tool call is not worth the budget hit.

Distinguish from

A flat single-model setup is simpler to operate but spends Opus tokens on Haiku-grade work. Tiering needs the upfront config but typically pays back within the first long session.

When to use

Distinguish from

Related terms