Opus 4.8 and Workflows - One Careful Pass Is No Longer the Default

Anthropic shipped Opus 4.8 and Dynamic Workflows on the same day. Together they move the unit of agentic work from one model call to dozens of verified ones.

May 29, 2026

Anthropic shipped Opus 4.8 yesterday. The model bump alone is not the story.

The story is that two other things shipped beside it. A new Claude Code primitive called Dynamic Workflows, which lets Claude write its own orchestration scripts and fan out to dozens of parallel subagents with adversarial verifiers built in. And a 3x cut to Opus Fast pricing, from $30/$150 per million tokens down to $10/$50, roughly 2.5x faster on top of it.

Those three changes are the same change. Anthropic just repositioned the unit of agentic work, and the pricing finally allows what the tooling implies.

What’s actually in 4.8

The model card is doing the usual benchmarks-go-up dance, but the prompting and runtime changes are where the day-to-day differences land.

The biggest is the move to adaptive reasoning only. MAX_THINKING_TOKENS is now ignored, and CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING is gone. The model decides how hard to think. To pin behavior you use the new /effort slash command (or --effort flag) across low / medium / high / xhigh / max. Claude Code defaults to xhigh for coding work; claude.ai and Cowork default to high. For a one-off deep pass without raising the whole session’s effort, drop the literal word ultrathink into the prompt and that turn alone reasons harder.

Four new or refreshed slash commands round it out. /ultrareview runs a senior-engineer review pass over the diff. /simplify does a refinement pass on recently modified code. /focus hides intermediate work and shows only the final output. /fewer-permission-prompts scans the session and writes a safer allowlist into settings.json so the harness stops interrupting you on read-only bash and MCP calls.

The Max plan gets the 1M context window by default, and the Fast mode price cut to $10 in / $50 out per million tokens makes the difference between Fast and default Opus closer to a latency choice than a budget one. There is also a research-preview Auto mode behind Shift+Tab that auto-approves safe actions and pauses on risky ones, aimed at long-running tasks where you want to walk away.

None of that is revolutionary by itself. The change in posture is.

What Dynamic Workflows actually is

A Dynamic Workflow is a JavaScript orchestration script that the model writes for itself. The script does not call the model directly. It calls four primitives that the harness wires into the session: agent() spawns a subagent and returns its result, parallel() fans tasks out concurrently with a barrier, pipeline() streams items through multiple stages without barriers between them, and phase() groups subagent calls under a progress label.

Two activation modes ship with it. Explicit — you say “create a workflow to audit this codebase for X” and Claude designs and runs the script. Implicit — you flip the ultracode setting on and Claude evaluates every task as a workflow candidate, reaching for fan-out instead of single passes by default. Ultracode is off out of the box, and the docs are clear about why. It burns tokens fast.

The interesting part is what gets baked in. Schema validation through a structured-output tool means subagents return validated objects, not strings you have to parse. Workflows resume from a prior runId, so an edit to your script doesn’t re-run the agents that didn’t change. There is a 1,000-agent lifetime cap per workflow as a runaway-loop backstop. And the documented “quality patterns” — adversarial verify, judge panel, loop-until-dry, multi-modal sweep, completeness critic — show Anthropic’s own hand on what good fan-out looks like.

The most telling pattern is the adversarial verify. Spawn three independent skeptics per finding, prompt each to refute it, kill the finding if a majority succeed. Anthropic is not selling fan-out as more answers. They are selling it as more checks.

This is the part that changes how you build.

The price math is the whole story

The orchestration-first pattern has been technically possible for a year. LangGraph wired it up. CrewAI wired it up. The reason almost nobody runs it as a default is the bill. Fifty parallel subagents on Opus 4.7 Fast was $30 in and $150 out per million tokens. A serious review pass on a real pull request would burn through a tank of compute and produce something a senior engineer could have written by hand in less time.

Opus 4.8 Fast is $10 in and $50 out. The same fan-out is now a third of the cost and 2.5x faster.

That is the number that changes behavior. Iteration loops that used to be run this once, pray it found the bug become run it three times with different angles and trust the intersection. Discovery sweeps that used to ship as a single grep become five finders with different lenses, deduped at the union. Verifier panels that used to be cosplay become a default.

Fast at $10/$50 makes the fifty-agent review pass look like a Tuesday, not a stunt.

What this looks like in practice

I have been running a version of this pattern on my own homelab since February. Eight named agents probe their own state in parallel, a separate evaluator scores their last forty-eight hours of output against a scope file, safe idempotent fixes auto-apply, and only the human-decision items surface for me. The expensive part was never the orchestration. It was the cost of getting eight independent passes to cohere before I trusted the verifier.

Anthropic’s own review-changes example shows the same shape with the rough edges sanded off. Dimensions fan out: bugs, performance, security, reuse, tests. Each dimension yields findings. Each finding is handed to a panel of independent skeptics whose prompt is try to refute this. A finding survives only if a majority of skeptics fail to refute. It is the same trick a good engineering org runs at code review, ported into the model layer and budgeted in tokens instead of senior-engineer hours.

What’s oversold

Two honest caveats.

The new skill being priced is not the orchestration script. It is the verifier. A fan-out that finds fifty plausible bugs and forwards all fifty is worse than the single careful pass it replaced, because it shifts the verification burden onto a human who now has to triage noise instead of read code. The workflows post handwaves the verifier as a prompt. Most builders I know do not have a verifier discipline yet, and the tooling for one is thin.

And ultracode, the setting that makes Claude reach for a workflow on every task without being asked, ships off by default. Anthropic’s documentation flags why. Fan-out burns quota fast, and there is no governance layer that says do not orchestrate this task. The dispatch problem, deciding when a workflow is the right shape and when a single careful pass is, is unsolved. Right now it lives in your head.

Why this is worth watching anyway

Two reasons.

The agentic frameworks built between 2024 and now (LangGraph, CrewAI, Autogen) were filling the gap where the model vendor did not ship orchestration. That gap just closed. The bridge between one model call and agentic system is now a tool the model writes for itself, in JavaScript, with caching and resume baked in. Whatever those frameworks were going to charge for over the next eighteen months, the ceiling on that price just dropped.

And the orchestration-first pattern was already where AI engineering was heading. Evals are the new bottleneck because at some point the question stops being is the model good enough and starts being how confident am I that this particular answer is right. Workflows give you a vocabulary for spending compute on that second question instead of just the first. The vendor that ships the verifier primitive first sets the pattern everyone else copies.

If the price of careful goes down, the price of casual goes up. The default question stops being is the model good enough yet. It becomes who is verifying.

Around the Corner: short reviews of ideas worth watching. Opt-in section, not part of the weekly Run Data Run email. Subscribe to the main list for longer essays.

Run Data Run

Discussion about this post

Ready for more?