If you are paying for Claude in 2026, you are probably paying too much for the tier you are picking. The Opus headline draws the eye, the benchmark charts look great, and most teams default to it without ever asking whether Sonnet would have done the same job for a small fraction of the bill. We run Cut The SaaS, we use both tiers daily across writing, code review, and customer work, and nobody at Anthropic pays us anything. The honest take below maps which tasks earn the Opus premium and which tasks should stay on Sonnet.
The shortest version is the one Anthropic puts in its own model-selection guidance: Sonnet is the recommended default for general work, and Opus is for the cases where Sonnet underperforms. Read closely, that is less a soft recommendation than a budget warning.
◢Is Claude Opus actually better than Sonnet?
On the hardest benchmarks, yes. On the work most teams actually do, the gap closes fast. Opus 4.8 leads on dense reasoning, long-horizon agentic coding, and specialized domains like advanced biology and security analysis, per Anthropic's model overview. Sonnet 4.6 is the recommended default for the working middle: writing, coding, summarization, multi-step workflows, customer support, and analysis, per Anthropic's choosing-the-right-Claude tutorial.
The trick most teams miss is that "hardest benchmarks" and "most of your real prompts" are very different categories. Run an honest audit of last week's AI usage. If 80% of your prompts are summaries, code suggestions, customer replies, and drafts, you do not have an Opus problem. You have a Sonnet underuse problem.
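The audit does not need tooling. Here is a minimal sketch, assuming you can export one category label per prompt from last week's usage. The category names and the sample log are made up for illustration, so swap in whatever labels fit your own work.

```python
from collections import Counter

# Hypothetical labels for routine work, taken from the examples above.
ROUTINE = {"summary", "code_suggestion", "customer_reply", "draft"}


def routine_share(categories):
    """Return the fraction of logged prompts that fall into ROUTINE categories."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    routine = sum(n for cat, n in counts.items() if cat in ROUTINE)
    return routine / total


# Made-up example log: one category label per prompt sent last week.
last_week = (
    ["summary"] * 4
    + ["draft"] * 3
    + ["customer_reply"] * 1
    + ["legal_analysis"] * 2
)

share = routine_share(last_week)
print(f"{share:.0%} of last week's prompts were routine")
```

If the number that comes back is anywhere near 80%, the bulk of your usage is a candidate for the cheaper tier.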
◢How much cheaper is Sonnet than Opus?
A lot. The exact ratio shifts release to release, but Sonnet is consistently positioned as the cost-efficient workhorse and Opus as the premium tier, per Anthropic's pricing page. For a sense of scale: if Opus 4.8 sits around $5 input and $25 output per million tokens (Fable 5, at $10/$50, is exactly 2x Opus, per Anthropic's Fable launch), Sonnet is several times cheaper again on both input and output. Compounded across a real workload, the difference is usually larger than the entire bill of one of the SaaS tools you are already trying to cut.
The lesson is not that Opus is overpriced. It is that using Opus for tasks Sonnet handles is a quiet, recurring overcharge. The fix is a tiering policy, not a platform switch.
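To put a rough number on that overcharge, here is a back-of-the-envelope sketch. The Opus rates are the ones quoted above. Everything else is an assumption: the monthly token volume is invented, and the Sonnet discount is passed in as a ratio rather than a price, because the real figure should come from the current pricing page.

```python
# USD per million tokens, as quoted above for Opus 4.8.
OPUS_IN, OPUS_OUT = 5.00, 25.00


def cost_usd(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in USD, with rates quoted per million tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def blended_bill(opus_bill, share_moved, price_ratio):
    """Bill after moving `share_moved` of the spend to a tier that is
    `price_ratio` times cheaper (assumes the same ratio on input and output)."""
    kept_on_opus = opus_bill * (1 - share_moved)
    moved_to_cheaper = opus_bill * share_moved / price_ratio
    return kept_on_opus + moved_to_cheaper


# Hypothetical monthly workload: 200M input tokens, 40M output tokens.
opus_bill = cost_usd(200_000_000, 40_000_000, OPUS_IN, OPUS_OUT)
print(f"All-Opus bill: ${opus_bill:,.2f}")

# Assumed discount ratios, not real prices: try a few and see the spread.
for ratio in (2, 3, 5):
    new_bill = blended_bill(opus_bill, share_moved=0.8, price_ratio=ratio)
    print(f"Move 80% to a tier {ratio}x cheaper: ${new_bill:,.2f}")
```

The design choice worth noting is that the share you move matters as much as the ratio: whatever stays on Opus sets the floor of the bill.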
◢When does Opus actually earn its price over Sonnet?
Three real cases, and they are narrower than the marketing implies. First, dense reasoning where the model needs to hold many constraints in its head at once. Complex legal analysis, multi-hypothesis scientific reasoning, intricate financial modeling. If you give the same prompt to both and Sonnet gives a confident-but-shallow answer while Opus gives the right one, that is Opus territory.
Second, long agentic coding tasks. When a model works on its own for hours (planning, editing, running tests, iterating), the small per-step quality gap between Sonnet and Opus compounds into a meaningfully different end result, as Simon Willison demonstrated in his Fable 5 review. Opus 4.8 sits between Sonnet and Fable on this dimension.
Third, specialized hard domains: advanced biology, security analysis, deep cryptography. These are the cases where Anthropic explicitly steers users to Opus or higher.
If your work does not look like one of those three buckets, the case for paying Opus prices weakens fast.
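The compounding in the second bucket is easy to underestimate, so here is a toy model of it. It assumes every step succeeds or fails independently, which real agent runs do not, and the per-step success rates are illustrative numbers, not measurements of any Claude model.

```python
def run_success(per_step_success, steps):
    """Chance that every step in a run succeeds, assuming independent steps."""
    return per_step_success ** steps


# Illustrative rates only: a two-point gap per step, over a 50-step run.
for label, rate in (("stronger model", 0.99), ("weaker model", 0.97)):
    print(f"{label}: {run_success(rate, 50):.1%} chance of a clean 50-step run")
```

Under those assumptions, a gap that looks trivial on a single step turns into a large gap over a long run. On a one-shot prompt the same gap barely registers, which is why the premium is hard to justify outside the three buckets.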
◢When should you stay on Sonnet 4.6?
For the bulk of real founder work. Writing, even long-form, lands on Sonnet at a quality most readers cannot distinguish from Opus. Coding that is human-in-the-loop, file-at-a-time, drafting and reviewing, runs fine on Sonnet. Summarization of meetings, documents, and conversations is a Sonnet-tier job. Customer support and ops, where speed and consistency matter more than novel reasoning, are Sonnet tasks by design. Multi-step workflows with reasonable handholding are exactly what Sonnet was tuned for, per the choosing-the-right-Claude tutorial.
The mistake is treating Opus as the "safe" choice. The actual safe choice is the tier whose strengths match the job. If your reflex is to reach for Opus because "the better model can't hurt," you have rebuilt the SaaS-stack-bloat problem inside your AI bill, where it grows even faster than a subscription would.
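Written down, "match the tier to the job" can be as small as a lookup with a cheap default. This is a sketch, not a recommended implementation: the task labels and the tier names `"sonnet"` and `"opus"` are hypothetical placeholders, not API model identifiers.

```python
# Hypothetical labels for the three buckets where Opus earns its price.
OPUS_BUCKETS = {"dense_reasoning", "long_agentic_coding", "specialized_domain"}


def pick_tier(task_type):
    """Default to Sonnet; escalate to Opus only for a named bucket."""
    return "opus" if task_type in OPUS_BUCKETS else "sonnet"


for task in ("meeting_summary", "customer_reply", "long_agentic_coding"):
    print(f"{task} -> {pick_tier(task)}")
```

The point of the default branch is that an unlabeled or new kind of task lands on the cheaper tier, so sending something to Opus always has to be a deliberate decision.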
◢How do you actually pick the right tier for each job?
Run the audit before you change anything. Pick three concrete tasks you do every week, send each to both Sonnet and Opus on identical prompts, and look at the outputs side by side. For two of those three tasks, you will very likely find Sonnet's answer is indistinguishable from Opus's. That is the evidence you need to move them to Sonnet permanently.
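If you would rather script the side-by-side than copy and paste, the harness can be this small. It assumes you supply your own `complete(tier, prompt)` function that wraps whatever client you already use. The `fake_complete` stub, the tier labels, and the sample tasks are all hypothetical stand-ins so the sketch runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Comparison:
    task: str
    sonnet_output: str
    opus_output: str


def run_audit(prompts: dict[str, str],
              complete: Callable[[str, str], str]) -> list[Comparison]:
    """Send each prompt, unchanged, to both tiers and collect the pairs."""
    return [
        Comparison(task, complete("sonnet", prompt), complete("opus", prompt))
        for task, prompt in prompts.items()
    ]


def fake_complete(tier: str, prompt: str) -> str:
    """Stand-in for a real API call; replace with your own client wrapper."""
    return f"[{tier}] response to: {prompt}"


# Three made-up weekly tasks; use real prompts from your own work.
weekly_tasks = {
    "meeting summary": "Summarize these meeting notes in five bullets.",
    "support reply": "Draft a reply to this refund request.",
    "contract review": "List the conflicting clauses in this contract.",
}

for result in run_audit(weekly_tasks, fake_complete):
    print(f"== {result.task} ==")
    print("Sonnet:", result.sonnet_output)
    print("Opus:  ", result.opus_output)
```

The judging stays manual on purpose. Read the pairs yourself, ideally without knowing which tier wrote which, and only keep a task on Opus when the difference is obvious.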
For the one task where Opus clearly wins, keep it on Opus and stop second-guessing the bill. That is what tiered usage looks like in practice: cheap default, smart escalation, no reflexive reach for the premium tier. The same logic applies one tier up, to the Fable 5 vs Opus 4.8 question, and to the Claude vs ChatGPT split: pick by job, not by marketing.
The teams paying the lowest AI bills in 2026 are not the ones using the cheapest models. They are the ones who matched the tier to the task on purpose, then stopped overpaying for capability they were not using.