Coding · by Google · ★ 4.3

Jules

Async coding agent by Google that clones your repo into a cloud VM, plans tasks, runs tests and opens PRs powered by Gemini.

Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-18

Maintainer: 95

Permissions: 35

Supply chain: 70

Transparency: 55

Incidents: 100
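
The page does not say how the five sub-scores combine into the overall score. As a rough sanity check only, the sketch below computes their unweighted mean; equal weighting is an assumption here, not the published method, and the small gap from the listed 72 suggests some weighting or rounding that is not documented.

```python
# Sub-scores as listed on this page.
sub_scores = {
    "Maintainer": 95,
    "Permissions": 35,
    "Supply chain": 70,
    "Transparency": 55,
    "Incidents": 100,
}

# Assumption: equal weights. The real weighting is not stated in the source.
unweighted_mean = sum(sub_scores.values()) / len(sub_scores)
print(unweighted_mean)  # 71.0
```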

Jules is Google's async coding agent powered by Gemini, offering strong maintainer credentials as a major tech vendor product. However, it presents significant permission concerns: cloning repositories into Google's cloud VMs, executing arbitrary code, running tests, and opening PRs requires extensive access to your codebase and GitHub account. The lack of a public repository severely limits transparency - users cannot inspect the agent's code or verify its behaviour. Supply chain is moderately strong given Google's infrastructure, but the closed-source nature and broad permissions create meaningful trust dependencies. The freemium model with Google AI plans provides accessibility, but the extensive automated capabilities (code execution, git operations, PR creation) demand careful consideration of what repositories you grant access to. No known security incidents, but the opacity and scope warrant caution.

Green flags

Maintained by Google - major vendor with strong security practices
Async operation reduces local resource requirements
Integrated with Google AI infrastructure and Gemini models
No known security incidents or breaches

Red flags

No public repository - completely closed source, cannot inspect code
Clones entire repo to Google cloud VM - full codebase exposure
Executes arbitrary code and tests in Google's infrastructure
Requires GitHub write access to open PRs automatically
Limited transparency into data retention and processing policies

Permissions requested

Repo read
Repo write
Shell execute
Outbound network
External LLM call
Read files
Write files

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

FREEMIUM — Free with Google AI plans

Platforms

web
cli
github

Review

Jules is Google's answer to the async coding agent problem: you point it at a GitHub issue, it spins up a cloud VM, clones your repo, writes code, runs tests, and opens a PR while you sleep. The autonomy here is real. Unlike Cursor or Copilot, which require you to sit there accepting suggestions, Jules does the entire loop unattended.

I gave it a dependency bump task on a Node project with 40,000 lines. It updated package.json, ran the test suite, caught two breaking changes in a migration guide, patched them, re-ran tests, and opened a PR with a coherent commit message. Took 18 minutes. I reviewed it over coffee the next morning.

The Gemini integration is the engine. It's genuinely good at reading stack traces and adjusting course when tests fail. I've seen it retry a fix three times before getting it right, which is table stakes for autonomy but still impressive in practice. The audio changelog feature is a gimmick, but the async nature is not. This is useful for grunt work: version bumps, linting fixes, boilerplate migrations. Anything with a clear success condition and a test suite.

Failure modes: it struggles with ambiguous requirements. I tried handing it a vague feature request and it produced code that technically worked but missed the point entirely. It also assumes your tests are comprehensive. If your suite is flaky or incomplete, Jules will confidently open a PR that breaks prod. The free tier is generous if you're already on a Google AI plan, but the freemium model means you'll hit rate limits on larger repos.

Compared to Devin, Jules is narrower but more reliable. Devin tries to be a general-purpose engineer and often overreaches. Jules knows it's a code janitor and does that job well. Compared to GitHub Copilot Workspace, Jules actually ships. Workspace is still too interactive. The CLI is solid, the GitHub integration is seamless, and the web UI is clean.

I'd reach for this when I have a backlog of low-risk, high-tedium tasks and a test suite I trust.
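
Since the review's main caveat is a flaky or incomplete suite, one cheap check before delegating work is to run the tests a few times and see whether the result changes. The sketch below is a minimal illustration, assuming your test command signals success with exit code 0; `classify_suite` is a hypothetical helper and has nothing to do with Jules itself.

```python
import subprocess
import sys


def classify_suite(cmd, runs=3):
    """Run the test command `runs` times and compare pass/fail outcomes.

    Returns "stable-pass", "stable-fail", or "flaky".
    """
    outcomes = set()
    for _ in range(runs):
        # Exit code 0 is treated as a pass (assumption about the test runner).
        result = subprocess.run(cmd, capture_output=True)
        outcomes.add(result.returncode == 0)
    if outcomes == {True}:
        return "stable-pass"
    if outcomes == {False}:
        return "stable-fail"
    return "flaky"
```

For a Node project this might be called as `classify_suite(["npm", "test"], runs=5)`. A handful of identical runs does not prove the suite is reliable or complete; it only catches the most obvious flakiness.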

Verdict

Pay for this if you maintain multiple repos with solid test coverage and a backlog of grunt work. Skip it if your codebase is under-tested or your tasks require nuanced judgment calls.

Good at

Genuinely autonomous: runs end-to-end without supervision
Excellent at dependency bumps and test-driven refactors
Gemini handles stack traces and retries intelligently
Free tier is usable for small teams on Google AI plans
CLI and GitHub integration both work without friction

Watch out

Struggles with vague or ambiguous requirements
Assumes your test suite is comprehensive and reliable
Rate limits on free tier hit fast for larger repos
Audio changelog feature feels like a gimmick
Not suitable for tasks requiring architectural judgment

Use cases

async code changes
dependency bumps
audio changelogs