General Assistant by Retell ★ 4.3

Retell AI

Voice agent infra with realistic latency and a polished web SDK. Common pick for embedded voice agents in customer apps.

Safety & Trust

Delv Safety Grade: C

Score 58/100 · assessed 2026-04-18

Maintainer: 65
Permissions: 60
Supply chain: 45
Transparency: 40
Incidents: 100

Retell AI is a commercial voice agent platform from a venture-backed startup, not a household-name vendor. The company appears legitimate with active development, but lacks the institutional backing of major providers. The service requires API credentials and handles real-time audio streams, so it needs network:outbound and network:inbound permissions plus env:secrets for API keys. The closed-source nature and absence of a public repository make supply chain verification impossible. Documentation exists but is gated behind signup. There are no known security incidents, but the opacity around implementation details and the requirement to stream potentially sensitive audio to third-party infrastructure warrant caution. Suitable for non-sensitive voice applications where convenience outweighs auditability, but higher-risk use cases should consider self-hosted alternatives or vendors with stronger transparency commitments.

Green flags

No known security incidents or CVEs
Legitimate commercial entity with active product development
Focused scope: voice infrastructure rather than general compute
Professional web SDK suggests engineering investment

Red flags

No public repository or source code available for audit
Closed-source platform handling real-time audio streams
Requires streaming potentially sensitive voice data to third-party servers
Limited transparency around data retention and processing practices
Smaller vendor without established enterprise security track record

Permissions requested

Outbound network
Inbound network
Access secrets
External LLM call

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

PAID

Platforms

api

Review

Retell AI is infrastructure for voice agents that need to live inside your product, not on a separate phone line. The core pitch is latency under 800ms and a web SDK that handles the messy bits of real-time audio streaming. I've used it to build a voice intake flow for a legal clinic, and the autonomy here is narrower than you might expect: Retell handles turn-taking, interruption detection, and voice synthesis, but you still define the conversation flow and hook in your own logic for anything beyond simple Q&A.

What it does well: embedding voice into existing web or mobile apps without wrestling with WebRTC plumbing. The SDK gives you callbacks for transcript events, so you can trigger actions mid-conversation (updating a form, fetching data, branching logic). The latency is genuinely good, better than stitching together Whisper and ElevenLabs yourself. Multilingual support is solid, and the voice quality is polished enough for customer-facing apps.

Where it stumbles: you're paying for infrastructure, not intelligence. Retell doesn't plan or iterate on its own; it executes the conversation graph you define. If you need an agent that autonomously decides when to escalate to a human or dynamically adjusts its strategy, you'll need to build that logic yourself. The pricing is also opaque until you talk to sales, which is frustrating for smaller teams trying to budget.

Compared to Bland AI or Vapi, Retell skews more developer-focused. Bland is faster to prototype with but harder to customise; Vapi sits somewhere in between. I'd reach for Retell when I need tight control over the conversation flow and I'm embedding voice into an app where latency matters. For simple outbound calling or internal tools, the setup overhead isn't worth it.

One concrete workflow: a voice-driven onboarding form for a healthcare app. The agent asks intake questions, validates answers in real time (checking insurance eligibility via API mid-conversation), and writes structured data to the backend. Retell handled the voice layer; I handled the business logic. That division of labour worked, but don't mistake it for full autonomy.
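
To make that division of labour a little more concrete, here is a minimal TypeScript sketch of the business-logic side of an intake flow driven by transcript events. Everything in it is an assumption for illustration: the TranscriptEvent shape, the IntakeStep type, and the createIntakeFlow function are hypothetical names and are not taken from Retell's SDK. The validators are synchronous stand-ins; a real eligibility check would likely be an asynchronous API call.

```typescript
// Hypothetical event shape for illustration only; NOT Retell's actual SDK types.
type TranscriptEvent = {
  role: "agent" | "user";
  text: string;
  isFinal: boolean; // true once the utterance is complete
};

// One intake question plus the rule its answer must satisfy.
type IntakeStep = {
  field: string;                          // key written to the structured record
  prompt: string;                         // what the agent should ask
  validate: (answer: string) => boolean;  // stand-in for a real check (e.g. an API call)
};

type IntakeResult =
  | { kind: "ask"; prompt: string }                  // agent should say this next
  | { kind: "done"; record: Record<string, string> }; // all answers collected

// Builds a transcript handler that walks the steps in order.
// Returns null for events that should not advance the flow.
function createIntakeFlow(steps: IntakeStep[]) {
  const record: Record<string, string> = {};
  let index = 0;

  return function onTranscript(event: TranscriptEvent): IntakeResult | null {
    // Ignore agent speech and partial (interim) user transcripts.
    if (event.role !== "user" || !event.isFinal) return null;
    if (index >= steps.length) return { kind: "done", record };

    const step = steps[index];
    const answer = event.text.trim();
    if (!step.validate(answer)) {
      // Invalid answer: re-ask the same question without advancing.
      return { kind: "ask", prompt: `Sorry, I didn't catch that. ${step.prompt}` };
    }

    record[step.field] = answer;
    index += 1;
    return index < steps.length
      ? { kind: "ask", prompt: steps[index].prompt }
      : { kind: "done", record };
  };
}
```

In this sketch the voice platform would own audio, turn-taking, and synthesis, while the handler owns ordering, validation, and the structured record. Wiring it up would mean passing each transcript event from whatever callback the SDK actually exposes into onTranscript, then speaking the returned prompt or persisting the record.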

Verdict

Pay for Retell if you're embedding voice into a customer-facing app and you need low latency with full control over conversation logic. Skip it if you want an out-of-the-box agent that plans autonomously, or if you're just prototyping and don't need production-grade infrastructure yet.

Good at

Sub-800ms latency, noticeably better than DIY Whisper + TTS stacks
Web SDK handles WebRTC complexity, with clean callbacks for transcript events
Multilingual support and polished voice quality for customer-facing use
Good fit for embedded voice in mobile or web apps
Tight control over conversation flow and integration points

Watch out

Not autonomous in the planning sense - you define the conversation graph
Opaque pricing until you contact sales, hard to budget for smaller teams
Requires developer effort to build logic for escalation, branching, or dynamic behaviour
Overkill for simple prototypes or internal tools
Steeper learning curve than no-code competitors like Bland AI

Use cases

Embedded voice in mobile/web apps
Voice-driven configuration flows
Voice intake for clinics and law firms
Multilingual voice agents