Delv
Task Automation · Slow · 1mo · by Microsoft · 3.8

AutoGen

Open-source programming framework from Microsoft for building conversational multi-agent applications, now succeeded by Agent Framework.

Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-18

  • Maintainer: 95
  • Permissions: 40
  • Supply chain: 85
  • Transparency: 90
  • Incidents: 100

AutoGen is Microsoft's open-source multi-agent framework, now in maintenance mode as development shifts to Agent Framework. The maintainer score is excellent given Microsoft's backing and the project's maturity. However, the permissions score is low because AutoGen is a framework for building autonomous agents that typically require code execution, filesystem access, and network calls to function. The supply chain is solid via PyPI with proper versioning, though users must carefully vet any agents they build or deploy. Transparency is strong, with comprehensive documentation and an active community. There are no known security incidents, but the framework's power means developers must implement their own safety guardrails. The successor project (Agent Framework) may offer improved safety patterns.

Green flags

  • Backed by Microsoft with strong institutional support
  • Mature open-source project with extensive documentation
  • Active community and clear migration path to successor framework
  • Well-established PyPI distribution with semantic versioning
  • No known security incidents or malicious versions

Red flags

  • Framework enables arbitrary code execution through agent interactions
  • Now in maintenance mode, active development moved to Agent Framework
  • Safety guardrails depend entirely on developer implementation
  • Multi-agent systems can exhibit emergent behaviours difficult to predict

Permissions requested

  • Shell execute
  • Read files
  • Write files
  • Outbound network
  • Read env
  • External LLM call

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

Free · Open source

Platforms

API · CLI

Review

AutoGen was Microsoft's first serious attempt at multi-agent orchestration, and it showed real promise before being deprecated in favour of Agent Framework. The core idea remains sound: define agents with distinct roles, let them converse to solve problems, and watch emergent behaviour handle complexity you didn't explicitly programme. I used it for a research assistant pipeline where one agent scraped papers, another summarised findings, and a third synthesised conclusions. The conversational handoffs worked surprisingly well when the task had clear boundaries.

The autonomy here is conversational, not environmental. Agents don't browse the web or manipulate files without your plumbing. Instead, they debate, critique each other's outputs, and iterate towards a solution. This shines in code review workflows: one agent proposes changes, another flags edge cases, a third runs tests. The back-and-forth catches mistakes a single LLM pass would miss.

Failure modes are predictable. Agents can loop endlessly if termination conditions are vague. Token costs spiral when conversations run long. The framework assumes you understand prompt engineering well enough to define coherent roles. If your agent personas are mushy, the conversations devolve into agreement theatre where nothing meaningful happens.

Compared to CrewAI, AutoGen feels lower-level and more research-oriented. CrewAI gives you opinionated patterns for common tasks; AutoGen gives you primitives and expects you to build your own patterns. LangGraph offers finer control over state machines but requires more boilerplate. AutoGen sits in the middle: flexible enough for experimentation, structured enough that you're not writing raw API calls.

The deprecation is the elephant in the room. Microsoft has moved active development to Agent Framework, which shares DNA but isn't a drop-in replacement. If you're starting fresh, Agent Framework is the safer bet. AutoGen still works and the community remains active, but you're building on a sunset platform. For research projects or learning multi-agent patterns, it's still valuable. For production systems, the migration risk is real.

Documentation is thorough but assumes familiarity with agent concepts. The examples are academic-flavoured, which helps if you're exploring ideas but less so if you need to ship a feature by Friday.
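
To make the looping failure mode concrete, here is a minimal, framework-agnostic sketch of a propose-and-review conversation with two explicit exits: a stop word and a hard turn cap. It deliberately does not use AutoGen's API. The names `run_conversation`, `proposer`, `reviewer`, and `agreeable` are hypothetical, and the "agents" are plain functions standing in for LLM calls.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration: a message is (speaker, text),
# and an agent is any function from the transcript so far to a reply.
Message = Tuple[str, str]
Agent = Callable[[List[Message]], str]


def run_conversation(
    agents: List[Tuple[str, Agent]],
    task: str,
    max_turns: int = 6,
    stop_word: str = "APPROVED",
) -> Tuple[List[Message], str]:
    """Round-robin between agents until a stop word appears or the cap is hit."""
    transcript: List[Message] = [("user", task)]
    for turn in range(max_turns):
        name, agent = agents[turn % len(agents)]
        reply = agent(transcript)
        transcript.append((name, reply))
        if stop_word in reply:
            return transcript, "stop_word"
    # Without this cap, a reviewer that never approves would loop forever.
    return transcript, "max_turns"


def proposer(transcript: List[Message]) -> str:
    # Stub: numbers each proposal by how many it has already made.
    made = sum(1 for speaker, _ in transcript if speaker == "proposer")
    return f"Proposal v{made + 1}"


def reviewer(transcript: List[Message]) -> str:
    # Stub: requests one revision, then approves the second proposal.
    seen = sum(1 for speaker, _ in transcript if speaker == "proposer")
    return "APPROVED" if seen >= 2 else "Please handle the empty-input case."


def agreeable(transcript: List[Message]) -> str:
    # Stub for a mushy persona: never approves, never objects.
    return "Looks good, maybe one more pass?"


clear_log, clear_reason = run_conversation(
    [("proposer", proposer), ("reviewer", reviewer)], "Fix the parser bug"
)
vague_log, vague_reason = run_conversation(
    [("proposer", proposer), ("agreeable", agreeable)], "Fix the parser bug"
)
print(clear_reason, len(clear_log))
print(vague_reason, len(vague_log))
```

With the clear reviewer, the run ends on the stop word after four agent turns. With the vague persona, only the turn cap ends it, which is the "agreement theatre" case: every extra turn costs tokens and adds nothing.
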
Verdict

Best for researchers and developers learning multi-agent patterns who don't mind building on a deprecated framework. Skip it if you need production stability or prefer opinionated tooling. Agent Framework is the successor for new projects.

Good at

  • Conversational agent patterns catch errors single-pass LLMs miss
  • Lower-level primitives allow custom orchestration logic
  • Strong documentation and active community despite deprecation
  • Free and open-source with no vendor lock-in
  • Good middle ground between raw API calls and opinionated frameworks

Watch out

  • Officially deprecated in favour of Agent Framework
  • Agents can loop endlessly without careful termination logic
  • Token costs escalate quickly in multi-turn conversations
  • Requires solid prompt engineering skills to define useful roles
  • No built-in environmental actions like web browsing or file manipulation
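
A rough way to see why token costs escalate: if each turn resends the whole conversation so far as its prompt, cumulative prompt tokens grow roughly with the square of the turn count. The sketch below is back-of-envelope arithmetic under that assumption, with every message assumed to be the same size. The function name `cumulative_prompt_tokens` and the figure of 200 tokens per message are illustrative, not measurements of AutoGen.

```python
def cumulative_prompt_tokens(turns: int, tokens_per_message: int) -> int:
    """Total prompt tokens sent across a conversation.

    Assumes (for illustration only) that every turn resends the full
    history and that every message has the same token count.
    """
    total = 0
    history = tokens_per_message  # the initial task message
    for _ in range(turns):
        total += history               # prompt for this turn is the whole history
        history += tokens_per_message  # the new reply joins the history
    return total


ten_turns = cumulative_prompt_tokens(10, 200)
twenty_turns = cumulative_prompt_tokens(20, 200)
print(ten_turns, twenty_turns, twenty_turns / ten_turns)
```

Under these assumptions, doubling the conversation from 10 to 20 turns raises prompt tokens from 11,000 to 42,000, nearly four times as many, which is why a turn cap or history summarisation matters more than it first appears.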

Use cases

  • multi-agent research
  • task solving
  • chat patterns