VoiceMode
Voice interaction server with speech-to-text, text-to-speech, and real-time voice conversations via local mic and LiveKit.
Delv Safety Grade: C
Score 58/100 · assessed 2026-04-28
VoiceMode is a community-maintained MCP server that enables voice interaction through speech-to-text, text-to-speech, and real-time conversations via local microphone and LiveKit. It is maintained by a solo developer (Michael Bailey), and there is limited visibility into maintenance patterns. It requires OpenAI API credentials and requests significant permissions: microphone access, network connectivity to external services (OpenAI, LiveKit), and environment variable access for API keys. The supply chain is reasonably standard (uvx/PyPI distribution), though the custom installer package (voice-mode-install) adds a layer of indirection. Documentation appears adequate based on repository structure. The permission scope is broad: it combines desktop audio capture with external API calls, which presents a meaningful attack surface. There are no known security incidents, but the combination of microphone access and API key handling warrants careful consideration in sensitive environments.
Lethal Trifecta (prompt-injection exposure)
TWO OF THREE: Local microphone audio is private; outbound to the speech-to-text API.
Green flags
- Open source repository allows code inspection and community review
- Standard PyPI distribution via uvx follows Python ecosystem best practices
- No known security incidents or malicious activity reported
- Clear documentation of required API credentials upfront
Red flags
- Microphone access combined with external API calls increases data exfiltration risk
- Solo maintainer with a limited public track record, so the bus factor is one
- Custom installer package adds supply chain complexity vs direct installation
- Requires API key storage in environment variables without key rotation guidance
- LiveKit integration adds third-party service dependency with unclear data handling
Permissions requested
Install
uvx voice-mode-install
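Because the server needs an OpenAI API key in the environment, it can help to confirm the variable is set before running the installer. The following is a minimal preflight sketch, not part of VoiceMode itself: the helper name `missing_env_vars` and the `REQUIRED` list are hypothetical, and the only requirement taken from this listing is `OPENAI_API_KEY`. The check reports variable names only and never prints the key's value.

```python
import os

# Assumption: OPENAI_API_KEY is the only variable this listing names as required.
REQUIRED = ["OPENAI_API_KEY"]


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise without touching real secrets.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED)
    if missing:
        # Report names only; the key value is never echoed.
        print("Set these before installing: " + ", ".join(missing))
    else:
        print("Environment looks ready; run: uvx voice-mode-install")
```

Running the script in the same shell that will launch the installer shows whether the key is visible to that process.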
OPENAI_API_KEY
Review
Install this if you code by talking through problems or need hands-free access to Claude. Skip it if you're not already comfortable with MCP servers, or if you'd rather not add another API bill to your stack. It's a niche tool that does its job well for the people who need it.
Good at
- Real-time voice conversations, not just dictation, thanks to LiveKit integration.
- Works with both Claude Desktop and Claude Code out of the box.
- Handles local microphone input cleanly without extra hardware.
- Genuinely useful for hands-free coding and accessibility workflows.
Watch out
- Requires an OpenAI API key and burns through credits during extended conversations.
- Community project with rougher documentation than official MCP servers.
- Network-dependent for all speech processing, so latency can interrupt flow.
- Not the easiest first MCP server if you're new to the ecosystem.
Use cases
- voice conversations with Claude
- hands-free coding
- accessibility workflows
- voice-driven dictation
Getting started
Works with
Similar MCPs
- DaVinci Resolve MCP: Full coverage of the DaVinci Resolve scripting API so agents can drive timelines, edits, colour grading, and media management via Claude.
- Free Will MCP: Experimental tools that let an AI give itself prompts, ignore user requests, or go to sleep, for studying autonomy.
- Godot MCP: Interacts with the Godot game engine for scene editing, running, debugging, and project management.
- QGIS MCP: Connects QGIS Desktop to Claude for prompt-assisted project creation, layer loading, and code execution.