Retell AI
Voice agent infrastructure with low, conversational latency and a polished web SDK. A common pick for embedded voice agents in customer apps.
Delv Safety Grade: C
Score 58/100 · assessed 2026-04-18
Retell AI is a commercial voice agent platform from a venture-backed startup rather than a household-name vendor. The company appears legitimate and is under active development, but it lacks the institutional backing of major providers. The service requires API credentials and handles real-time audio streams, so it needs network:outbound and network:inbound permissions plus env:secrets for API keys. The platform is closed-source with no public repository, which makes supply chain verification impossible. Documentation exists but is gated behind signup. There are no known security incidents, but the opaque implementation and the need to stream potentially sensitive audio to third-party infrastructure warrant caution. Retell is suitable for non-sensitive voice applications where convenience outweighs auditability; higher-risk use cases should consider self-hosted alternatives or vendors with stronger transparency commitments.
Green flags
- No known security incidents or CVEs
- Legitimate commercial entity with active product development
- Focused scope: voice infrastructure rather than general compute
- Professional web SDK suggests engineering investment
Red flags
- No public repository or source code available for audit
- Closed-source platform handling real-time audio streams
- Requires streaming potentially sensitive voice data to third-party servers
- Limited transparency around data retention and processing practices
- Smaller vendor without established enterprise security track record
Permissions requested
- network:outbound
- network:inbound
- env:secrets
Review
Pay for Retell if you're embedding voice into a customer-facing app and you need low latency with full control over conversation logic. Skip it if you want an out-of-the-box agent that plans autonomously, or if you're just prototyping and don't need production-grade infrastructure yet.
Good at
- Sub-800ms latency, noticeably better than DIY Whisper + TTS stacks
- Web SDK handles WebRTC complexity, with clean callbacks for transcript events
- Multilingual support and polished voice quality for customer-facing use
- Good fit for embedded voice in mobile or web apps
- Tight control over conversation flow and integration points
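To make the "callbacks for transcript events" point concrete, here is a minimal sketch of how an app might consume such events. The `TranscriptEvent` shape and the `TranscriptBuffer` class are hypothetical illustrations, not Retell's actual SDK types; the real payload and event names should be checked against the vendor's documentation.

```typescript
// Hypothetical event shape (an assumption for illustration);
// the real SDK's transcript payload may differ.
type TranscriptEvent = {
  role: "agent" | "user";
  text: string;
  isFinal: boolean; // false while the utterance is still being revised
};

// Collects transcript events into displayable lines.
class TranscriptBuffer {
  private finalized: string[] = [];
  private pending: string | null = null;

  // Intended to be called from whatever transcript callback the SDK exposes.
  handle(event: TranscriptEvent): void {
    const line = `${event.role}: ${event.text}`;
    if (event.isFinal) {
      // A final event replaces any in-progress text for the utterance.
      this.finalized.push(line);
      this.pending = null;
    } else {
      // Interim events overwrite each other until a final one arrives.
      this.pending = line;
    }
  }

  // Returns finalized lines followed by the in-progress line, if any.
  render(): string {
    const lines =
      this.pending === null ? this.finalized : [...this.finalized, this.pending];
    return lines.join("\n");
  }
}
```

Keeping interim and final text separate means the UI can show live captions without duplicating an utterance once it is finalized.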
Watch out
- Not autonomous in the planning sense: you define the conversation graph
- Pricing is opaque until you contact sales, which makes it hard for smaller teams to budget
- Requires developer effort to build logic for escalation, branching, or dynamic behaviour
- Overkill for simple prototypes or internal tools
- Steeper learning curve than no-code competitors like Bland AI
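As a rough idea of the developer effort involved in defining a conversation graph with branching and escalation, the sketch below models one as a small state machine. Every node name, intent label, and the `MAX_MISSES` threshold is a made-up example; this is not Retell's configuration format, only an illustration of the kind of logic you would own.

```typescript
// All node names and intent labels below are hypothetical examples.
type NodeId = "greeting" | "collect_details" | "confirm" | "escalate" | "done";

interface GraphNode {
  prompt: string;
  // Maps a classified caller intent to the next node.
  next: Record<string, NodeId>;
}

const graph: Record<NodeId, GraphNode> = {
  greeting: {
    prompt: "How can I help?",
    next: { book: "collect_details", human: "escalate" },
  },
  collect_details: {
    prompt: "What day works for you?",
    next: { provided: "confirm", human: "escalate" },
  },
  confirm: {
    prompt: "Shall I book that?",
    next: { yes: "done", no: "collect_details" },
  },
  escalate: { prompt: "Transferring you to a person.", next: {} },
  done: { prompt: "You're all set.", next: {} },
};

// Escalate after this many consecutive unrecognized intents (assumed value).
const MAX_MISSES = 2;

interface State {
  node: NodeId;
  misses: number;
}

// Advances the conversation by one caller turn.
function step(state: State, intent: string): State {
  const target = graph[state.node].next[intent];
  if (target !== undefined) {
    return { node: target, misses: 0 };
  }
  // Unrecognized intent: stay on the same node, or escalate after repeated misses.
  const misses = state.misses + 1;
  return misses >= MAX_MISSES
    ? { node: "escalate", misses }
    : { node: state.node, misses };
}
```

Even in this toy form, the escalation rule and every branch are explicit decisions made by the developer, which is the trade-off the list above describes.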
Use cases
- Embedded voice in mobile/web apps
- Voice-driven configuration flows
- Voice intake for clinics and law firms
- Multilingual voice agents