"Which Claude model should I use" is one of the most common questions we get in 2026, and the answer is almost always cheaper than the asker expects. Anthropic publishes the answer in their own model-selection guidance, but most teams either have not read it or do not believe it: Sonnet is the default; Opus is for when Sonnet underperforms; Fable is for the narrow cases where Opus fails. We run Claude across our own stack at Cut The SaaS, nobody at Anthropic pays us anything, and below is the practical decision tree that keeps your Claude bill honest.
The short version: start on Sonnet 4.6. Escalate only on evidence. Most teams escalate reflexively and overpay.
◢Which Claude model should I use for general work?
Sonnet 4.6. This is not our opinion; it is Anthropic's own recommendation. Sonnet is the cost-efficient default for writing, coding, debugging, customer support, summarization, and multi-step workflows. The quality gap to Opus 4.8 on routine tasks is small and the price gap is large enough to change your monthly line item.
The mistake most teams make is treating Opus as the safe choice. The actual safe choice is the tier whose strengths match the job. If you default to Opus because "the better model can't hurt," you have rebuilt the SaaS-bloat problem inside your AI bill, where it grows faster than a subscription ever could. We covered the full Sonnet vs Opus split in Claude Opus vs Sonnet.
◢When should you actually use Claude Opus 4.8?
Three real cases, and they are narrow. Dense reasoning where the model needs to hold many constraints simultaneously: complex legal analysis, multi-hypothesis scientific reasoning, intricate financial modeling. Long-horizon agentic coding where the per-step quality gap compounds across hours of work, per Simon Willison's tests. Specialized hard domains like advanced biology, security analysis, and deep cryptography, per Anthropic's model overview.
If your work does not fit one of these three buckets, the case for paying Opus prices over Sonnet weakens fast. Run the side-by-side: send three real tasks from your week to both, compare outputs, and only move the ones where Opus clearly wins. The rest stays on Sonnet permanently.
◢When should you use Claude Haiku?
For simple, high-volume work where speed and cost matter more than peak reasoning. Classification, entity extraction, routine completion, summarization of short content, structured-output parsing on simple inputs. Haiku 4.5 is dramatically cheaper than Sonnet and meaningfully faster, per Anthropic's pricing.
The pattern that wastes money is running Sonnet on workloads where Haiku would do the same job. Bulk classification of thousands of records is a Haiku job, not a Sonnet job; running it on Sonnet by default is the same kind of overspend as running it on Opus.
◢Is Fable 5 worth it over Opus 4.8?
Almost never. Fable 5 costs twice as much per token, runs slower, falls back to Opus 4.8 on roughly 5% of sessions anyway, and the quality gap is real only on genuinely long agentic tasks, per Anthropic's Fable launch. For most teams, the right move is to keep Sonnet as the default, run Opus on the hard jobs, and ignore Fable until you can point to a specific workload Opus is failing on.
We covered the full Fable case in Fable 5 vs Opus 4.8. The piece is short for a reason: there is not much to say once you do the math.
◢How do you actually pick the right Claude model in practice?
Three steps. Default to Sonnet 4.6 on every new workload. Run the audit: pick three real tasks you do every week, send each to both Sonnet and Opus on identical prompts, and only move the ones where Opus visibly wins. Audit your bill by model tier monthly: Anthropic's usage dashboard breaks spend down by model, and that is where you spot reflexive escalation.
Layer the prompt caching docs on top for any workload with stable context, and the typical Claude bill drops by half or more without losing capability. The teams paying the lowest Claude bills in 2026 are not the ones using the cheapest models. They are the ones who matched the tier to the task on purpose, then stopped overpaying for capability they were not using. For the full bill-control playbook, see Claude API Pricing.