Model · June 20, 2024

Anthropic ships Claude 3.5 Sonnet at the Opus price point

Claude 3.5 Sonnet beats Opus on graduate-level reasoning and code generation while priced like a mid-tier model. Anthropic frames this as 'better, cheaper, faster' all at once.

Reported by Anthropic

What happened

Anthropic released Claude 3.5 Sonnet, a mid-tier model priced at $3 per million input tokens and $15 per million output. It outperforms the previous flagship, Claude 3 Opus, on graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and HumanEval code generation. Anthropic also shipped 'Artifacts' — a side-panel canvas for rendered output.

The pricing is the most consequential part. Sonnet costs one-fifth as much as Opus while posting superior benchmark scores. Anthropic is signalling that the cost of frontier-quality inference is falling fast.
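To make the cost gap concrete, here is a minimal sketch of per-query cost at the listed rates. The prices are the published per-million-token figures from this article; the query sizes (20k tokens of retrieved context in, 1k tokens out) are illustrative assumptions, not Anthropic numbers.

```python
# Per-query cost at published rates (USD per 1M tokens).
# Query sizes below are illustrative assumptions.
PRICES = {
    "claude-3-opus": {"input": 15.00, "output": 75.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A retrieval-heavy query: 20k tokens of context in, 1k tokens out.
opus = query_cost("claude-3-opus", 20_000, 1_000)        # $0.375
sonnet = query_cost("claude-3.5-sonnet", 20_000, 1_000)  # $0.075
print(f"Opus: ${opus:.3f}  Sonnet: ${sonnet:.3f}  ratio: {opus / sonnet:.0f}x")
```

At these assumed query sizes, every answer costs 5x less on Sonnet, which is why downstream products can afford to retrieve more context per question.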

Why it matters to Shopify merchants

When the best model gets cheaper, AI-assistant surfaces become more ambitious. Claude is now cheap enough that Anthropic's own shopping-assistant experiments, and downstream products built on the Claude API, can afford deeper retrieval per query. Deeper retrieval means more citations per answer — and Shopify merchants cited in those answers sit in the funnel.

Better reasoning also raises the bar for what content makes the citation cut. Claude 3.5 Sonnet is better at discarding weak sources. Catalogs with thin descriptions, missing schema, or hallucination-prone content get filtered out more aggressively. The asymmetric payoff is to the stores whose pages are parseable, factual, and clean.
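The filtering described above can be imagined as a scoring heuristic. This toy function is purely illustrative: the signals (description density, schema presence, factual ratio) come from this article's argument, but the weights and thresholds are invented assumptions, not anything Anthropic has published.

```python
# Toy heuristic for the kinds of signals a retrieval pipeline might weigh
# when deciding whether a product page is citation-worthy.
# Weights and caps are invented for illustration.
def citation_score(description_words: int, has_schema: bool,
                   factual_claims: int, unverifiable_claims: int) -> float:
    score = 0.0
    score += min(description_words / 200, 1.0) * 0.4   # content density, capped
    score += 0.3 if has_schema else 0.0                # machine-readable facts
    total_claims = factual_claims + unverifiable_claims
    if total_claims:
        score += (factual_claims / total_claims) * 0.3  # factual ratio
    return round(score, 2)

thin_page = citation_score(40, False, factual_claims=2, unverifiable_claims=3)
rich_page = citation_score(300, True, factual_claims=12, unverifiable_claims=1)
print(thin_page, rich_page)  # the rich page scores far higher
```

The point of the sketch is the asymmetry: a page with schema and dense, verifiable copy clears a filter that a thin, inflated page does not.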

For merchants, the tactical move is to make sure every product page answers the specific questions buyers ask before buying — materials, fit, comparisons to named alternatives, return policy in one sentence. Claude 3.5 Sonnet rewards that density; thin product pages get skipped.
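One concrete way to make those facts machine-parseable is schema.org `Product` markup. The sketch below builds a JSON-LD block covering the fields named above (materials, fit, return policy, offer data). The `Product` and `Offer` types are real schema.org vocabulary; the product itself and its values are hypothetical examples.

```python
import json

# Hypothetical product page structured data as JSON-LD.
# schema.org Product/Offer types are real; all values are invented examples.
product_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Merino Wool Crew Sock",
    "material": "82% merino wool, 16% nylon, 2% spandex",
    "description": (
        "Mid-weight crew sock. Runs true to size. "
        "Free returns within 30 days."
    ),
    "offers": {
        "@type": "Offer",
        "price": "18.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Emit as a <script type="application/ld+json"> payload for the page template.
payload = json.dumps(product_jsonld, indent=2)
print(payload)
```

Embedding this alongside the prose answers gives an answer engine both a human-readable and a machine-readable version of the same facts, so neither has to be inferred.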

How we got here
  1. Mar 2024

    Claude 3 Opus ships

    Flagship reasoning model at $15/$75 per 1M input/output tokens.

  2. Jun 2024

    Claude 3.5 Sonnet ships

    Beats Opus on GPQA + HumanEval at $3/$15 per 1M tokens — 1/5 the cost.

  3. Jun 2024

    Artifacts canvas launches in preview

    Side-panel canvas for rendered output, shifting shopping UX toward inline product cards.

  4. Q3 2024

    Sonnet API adoption surge

    Third-party agentic shoppers migrate from Opus to Sonnet en masse.

Three questions, answered

Is Claude 3.5 Sonnet used by any shopping surfaces today?
Yes — Anthropic's own Claude shopping experiments use it, and third-party apps (including agentic shopping assistants) have migrated from Opus to Sonnet for cost reasons. If you already optimised for Claude Opus, your work carries over cleanly.
Does a better Claude model change what I should write in a product description?
It reinforces the same priorities. Specific details, named comparisons, honest limitations, and schema-backed facts. The model is better at punishing vague or inflated copy, so factual density pays off more than before.
What's the Surfient-specific implication?
Our GEO Audit rubric (catalog factuality + schema coverage + answer-block density) tracks the same signals that retrieval-backed models like Claude 3.5 Sonnet reward when selecting sources to cite. Scoring well on a Surfient audit correlates with showing up in Claude-powered answers.