Coding · by Anthropic · ★ 4.3

Claude Code

Anthropic's terminal agent that plans, edits, runs tests, and commits - fully interactive with your repo. The coding agent with the most MCP depth.

Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-18

Maintainer: 95

Permissions: 35

Supply chain: 65

Transparency: 60

Incidents: 100

Claude Code is Anthropic's official autonomous coding agent with full filesystem and shell access. The maintainer score is excellent given it's a first-party Anthropic product. However, permissions are extremely broad: it writes arbitrary files, executes shell commands, and commits to repos without sandboxing. The supply chain score is middling because there's no public repository or package distribution - it's delivered through Anthropic's infrastructure, which is trustworthy but opaque. Transparency suffers from lack of open source code or detailed technical documentation about safety boundaries. No known incidents, but the autonomy level (multi-file edits, test execution, git operations) means a prompt injection or logic error could cause significant repo damage. Suitable for experienced developers who understand the risk surface, less so for production environments or shared codebases without careful oversight.

Green flags

Official Anthropic product with enterprise-grade maintainer
MCP-native architecture allows scoped tool integration
Interactive approval loops reduce blind automation risk
No known security incidents or credential leaks
Designed for terminal use where user oversight is natural

Red flags

No public repo or source code available for audit
Unrestricted filesystem write and shell execution without sandbox
Autonomous git commits could push breaking changes
Broad permissions with minimal documented safety boundaries
Opaque distribution model limits supply chain verification

Permissions requested

Read files
Write files
Delete files
Shell execute
Repo read
Repo write
Outbound network
Read env

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

FREEMIUM

Platforms

CLI, web

Review

Claude Code is Anthropic's terminal agent that actually stays in the loop while you work. Unlike Cursor or Aider, which hand you diffs and wait for approval, this thing plans a multi-step approach, edits files, runs your test suite, reads the failures, and tries again. The autonomy matters most when you're refactoring across a dozen files or chasing down a flaky test - tasks where context switching kills momentum.

I used it to migrate a Flask API to FastAPI. Gave it the spec, pointed it at the repo, and watched it rewrite route handlers, update imports, and fix type hints across twenty-odd files. It caught itself when a test failed due to a missing async keyword and corrected it without me stepping in. That loop - edit, test, fix, repeat - is where it earns the 'agent' label. You're supervising, not micromanaging.

The MCP integration is the differentiator here. Claude Code can pull live data from your database, hit internal APIs, or query your issue tracker mid-task because it speaks MCP natively. Cursor can't do that without you writing custom tooling. If your workflow involves more than just code - say, checking Sentry for error patterns or validating against a staging environment - this is the only coding agent with first-class support for those actions.

Failure modes: it sometimes overwrites comments you care about, and it's slower than Copilot for single-function edits. The planning phase can feel verbose when you just want a quick fix. It also assumes you're working in a repo with tests - if you don't have a suite, half the value disappears.

Compared to Cursor, Claude Code is better for multi-file work and worse for inline autocomplete. Compared to Aider, it's more autonomous but less transparent about what it's changing. If you're prototyping alone or doing deep refactors, this is the tool. If you're pair programming or want tight control over every diff, stick with something more manual.
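
For readers who want the edit, test, fix, repeat loop spelled out, here is a minimal sketch of its general shape. It is not Claude Code's implementation. Every name in it (`RunResult`, `edit_test_fix`, `run_tests`, `propose_fix`, `max_attempts`) is hypothetical, and the "repo" is a toy dictionary standing in for real files and a real test runner.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunResult:
    """Outcome of one test run (hypothetical structure for this sketch)."""
    passed: bool
    output: str


def edit_test_fix(
    run_tests: Callable[[], RunResult],
    propose_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> List[RunResult]:
    """Run the tests, hand any failure output to a fixer, and retry.

    Stops as soon as a run passes or after max_attempts runs.
    Returns the result of every run so a human can review the history.
    """
    history: List[RunResult] = []
    for _ in range(max_attempts):
        result = run_tests()
        history.append(result)
        if result.passed:
            break
        propose_fix(result.output)
    return history


def demo() -> List[RunResult]:
    # Toy stand-in for a repo: a single flag that the "tests" check.
    repo = {"handler_is_async": False}

    def run_tests() -> RunResult:
        if repo["handler_is_async"]:
            return RunResult(True, "1 passed")
        return RunResult(False, "TypeError: handler must be async")

    def propose_fix(failure_output: str) -> None:
        # A real agent would ask a model for an edit; this fix is hard-coded.
        if "async" in failure_output:
            repo["handler_is_async"] = True

    return edit_test_fix(run_tests, propose_fix)
```

The cap on attempts is the design choice worth noting: without it, a fixer that never converges would loop forever, which is one reason supervising the run still matters.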

Verdict

Pay for this if you do solo refactors, have a test suite, and want an agent that iterates without hand-holding. Skip it if you prefer inline suggestions or work in codebases without tests.

Good at

Autonomously loops through edit-test-fix cycles without human approval per step
Native MCP support lets it query live systems, databases, and APIs mid-task
Handles multi-file refactors with full repo context better than Cursor or Copilot
Plans work upfront so you can review the approach before it starts editing
Commits changes with meaningful messages after validating tests pass

Watch out

Slower than autocomplete tools for single-function edits
Planning phase can feel verbose when you just need a quick fix
Sometimes overwrites comments or formatting you wanted to keep
Value drops sharply if your codebase lacks a test suite
Less transparent than Aider about what it's changing before it does it

Use cases

Long coding sessions with full repo context
Running and fixing failing tests in a loop
Codebase refactors that span many files
Prototyping from a spec in conversation