Anthropic just launched ultrareview, a multi-agent code reviewer for Claude Code

What actually shipped

Anthropic just released ultrareview, a new /ultrareview command inside Claude Code that runs a deep, multi-agent code review in the cloud instead of on your machine.

The core idea is simple. Instead of one pass of Claude reading your diff locally, a fleet of reviewer agents runs in a remote sandbox, each looking at the change from a different angle, and every bug they report is independently reproduced before it is surfaced to you.

It is a research preview in Claude Code v2.1.86 and later, and it is available on Pro, Max, Team, and Enterprise plans.

The basics in plain language

You run it from the CLI.

/ultrareview

That reviews the diff between your current branch and the default branch, including uncommitted and staged changes. If you want to review a specific pull request instead, you pass the number.

/ultrareview 1234

A review takes 10 to 20 minutes and runs in the background, so you keep working while it grinds away in the cloud. When it finishes, the verified findings show up as a notification in your session, each with a file location and an explanation you can hand back to Claude to fix.
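The dispatch rule is simple: no argument means branch mode, a numeric argument means PR mode. A minimal shell sketch of that rule (this is illustrative only, not how the real command is implemented; the function name is made up):

```shell
# Hypothetical sketch of ultrareview's argument dispatch:
# no argument -> review current branch vs default branch,
# a number    -> review that pull request.
ultrareview_mode() {
  if [ -z "$1" ]; then
    echo "branch"
  elif printf '%s' "$1" | grep -Eq '^[0-9]+$'; then
    echo "pr:$1"
  else
    echo "invalid argument: $1" >&2
    return 1
  fi
}

ultrareview_mode          # prints "branch"
ultrareview_mode 1234     # prints "pr:1234"
```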

Pro and Max users get three free runs as a one-time trial. After that, ultrareview bills against extra usage, not against your normal plan limits. Team and Enterprise plans have no free runs and go straight to extra usage. You have to enable extra usage in billing settings before you can launch a paid review.

Why this is a meaningful shift

Code review with AI has been mostly a single-shot thing so far. You paste a diff, the model tells you what looks off, you fix the obvious stuff and move on. That works, but it misses a lot, and it produces a fair amount of noise.

Ultrareview changes two things.

The first is parallelism. Many agents look at the change at once, each poking at different parts. That catches issues a single pass would skip.

The second is verification. Every finding gets independently reproduced in the sandbox before it is reported. Anthropic's framing is that the results focus on real bugs rather than style suggestions. That targets the actual bottleneck in AI code review today: the signal-to-noise ratio is usually what kills these tools in production.

If it holds up, this is closer to a junior engineer running your change through a real test harness than a linter with opinions.

How it stacks against the regular /review

Claude Code already has a /review command for quick in-session feedback. Ultrareview is not a replacement; it is a second tool for a different moment in the workflow.

  • /review runs locally, finishes in seconds or a few minutes, and counts toward your normal usage. Good for the tight loop while you are iterating.

  • /ultrareview runs in the cloud, takes 10 to 20 minutes, uses a fleet of verifying agents, and is meant for pre-merge confidence on substantial changes.

Think of /review as the red pen and /ultrareview as the pre-flight checklist before a big PR leaves your branch.

A few caveats worth knowing

Ultrareview needs you to be logged in with a Claude.ai account, not just an API key. If you are API-only, you run /login first.

It does not work with Claude Code through Amazon Bedrock, Google Cloud Vertex AI, or Microsoft Foundry. It is also not available to organizations that have Zero Data Retention enabled, which matters for regulated teams.

If your repo is too big to bundle, local mode will not work. You push your branch, open a draft PR, and run it in PR mode instead.
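The fallback above might look something like this in practice (a hedged sketch assuming a GitHub remote and the `gh` CLI; the branch name and PR title are placeholders):

```shell
# Repo too large to bundle for local mode: push the branch,
# open a draft PR, and review it by number instead.
git push -u origin feature/my-change
gh pr create --draft --fill

# Then, inside Claude Code, point ultrareview at the PR number
# that `gh pr create` printed:
#   /ultrareview <pr-number>
```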

These are not deal-breakers, but they do tell you something about the architecture. The sandbox needs your code in its environment to actually execute reproducible checks, which is exactly why the findings are higher quality.

What this means if you build software for a living

For solo devs and small teams

This is the first AI code review tool that feels like it respects your time both ways. It does not block you locally, and it does not flood you with nitpicks. Three free runs on Pro is enough to see whether it catches something real on a PR you actually care about. That is the right way to test it.

For agencies

If you ship client code and you are the only reviewer on your own PRs, ultrareview is basically a senior engineer you can call in before a release. The 10 to 20 minute window fits neatly between "I think this is done" and "I am pushing to production," which is the most dangerous part of any project. Cost per run is worth it the first time it catches a bug that would have become a client email.

For SaaS founders

Two things are worth flagging. First, pre-merge quality is now a cheap competitive advantage, because the tooling is catching up to the effort. If you are shipping fast with a small team, this is one of those features that quietly compounds. Second, keep an eye on cost. Ultrareview bills as extra usage, which means if you run it aggressively on every PR, the bill adds up. Use it on the changes that matter, not on typo fixes.

For larger teams

The Team and Enterprise plans do not get free runs, which tells you Anthropic is pricing this for real usage at scale. The interesting question is whether ultrareview becomes part of CI, triggered automatically on PRs, or stays a developer-triggered command. The docs are clear that it only runs when you invoke it, which is a nice guardrail for now, but the CI integration path is obvious.

The bigger picture

There is a pattern here that is worth calling out.

First we got AI autocomplete. Then chat in the editor. Then agent mode that can actually edit files. Now we are getting specialized agents for specific engineering moments: planning with ultraplan, reviewing with ultrareview, each running in the cloud and using parallelism in ways a local model cannot.

The work is moving off your laptop and into background jobs that do serious thinking while you do something else. The unit of AI help is no longer a single turn in a chat; it is a task that runs for minutes and comes back with a result.

That is a real change in how software gets built. Ultrareview is a clean example of it.

Takeaways

Ultrareview runs a cloud-based, multi-agent code review that independently verifies each finding before reporting it.

It is meant for pre-merge confidence on substantial changes, not as a replacement for the quick local /review.

Pro and Max users get three free runs, then it bills as extra usage. Team and Enterprise go straight to extra usage.

The real shift it represents is AI coding tools moving from in-editor chat to background jobs that run for minutes, in parallel, in the cloud. That direction is not going away.

Sorca Marian

Founder/CEO/CTO of SelfManager.ai & abZ.Global | Senior Software Engineer

https://SelfManager.ai