About
A FastAPI implementation of the Model Control Protocol for ElevenLabs' Scribe speech‑to‑text API, enabling real‑time and batch transcription with advanced context management, language detection, and event handling.
Capabilities
ElevenLabs Scribe MCP Server
The ElevenLabs Scribe MCP Server brings the power of ElevenLabs’ real‑time speech‑to‑text API into the Model Control Protocol ecosystem. By exposing a full MCP implementation, it allows AI assistants to manage transcription sessions as first‑class resources, maintaining context across multiple turns and enabling sophisticated dialogue flows that depend on live voice input.
Solving the Real‑Time Transcription Gap
Traditional speech‑to‑text solutions often treat audio as a static file, requiring separate upload steps and post‑processing. This server eliminates that friction by streaming audio directly from a microphone or other source over WebSocket, delivering incremental transcription results. Developers can therefore build assistants that listen to users in real time, adjust prompts on the fly, or trigger actions as soon as a keyword is detected—all while keeping the conversation context intact.
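As a rough illustration of the streaming idea, the sketch below splits raw audio into small fixed-duration chunks that a client could send one at a time instead of uploading a whole file. The function name `chunk_audio` and the defaults (16 kHz, 16-bit mono PCM, 100 ms chunks) are assumptions chosen for the example, not values documented for this server.

```python
from typing import Iterator


def chunk_audio(
    pcm: bytes,
    sample_rate: int = 16_000,   # assumed sample rate, not taken from the server docs
    sample_width: int = 2,       # bytes per sample (16-bit mono PCM assumed)
    chunk_ms: int = 100,         # assumed chunk duration in milliseconds
) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`, each covering about `chunk_ms` of audio."""
    bytes_per_chunk = sample_rate * sample_width * chunk_ms // 1000
    if bytes_per_chunk <= 0:
        raise ValueError("chunk size must be positive")
    for offset in range(0, len(pcm), bytes_per_chunk):
        # The last slice may be shorter than a full chunk.
        yield pcm[offset:offset + bytes_per_chunk]
```

Each yielded chunk would then be wrapped in whatever audio message the server expects and written to the WebSocket; that wrapping is left out here because the message format is not given in this description.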
Core Capabilities
- Bidirectional Streaming: Audio is sent as a continuous stream of messages, and the server responds with partial or final transcription messages without waiting for the audio to end.
- Context Management: Each session is identified by a dedicated message, allowing the assistant to preserve user intent and previous utterances across multiple requests.
- Multi‑format Support: The server automatically converts common audio formats (WAV, MP3, OGG) into the format required by ElevenLabs, simplifying client code.
- Language Detection & Confidence: Every transcription includes a language tag and confidence score, enabling downstream logic to handle multilingual scenarios or prompt for clarification.
- Event Detection: The API can flag speech vs. non‑speech segments, useful for detecting pauses or background noise.
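One way a client might consume partial and final results is sketched below. The `TranscriptEvent` fields (`text`, `is_final`, `language`, `confidence`) are hypothetical names used only for illustration; the actual message schema is not shown in this description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptEvent:
    """Hypothetical shape of one transcription result (field names are assumptions)."""
    text: str
    is_final: bool
    language: str = "en"
    confidence: float = 1.0


@dataclass
class TranscriptBuffer:
    """Keeps committed (final) segments plus the latest partial hypothesis."""
    committed: List[str] = field(default_factory=list)
    pending: str = ""

    def apply(self, event: TranscriptEvent) -> None:
        if event.is_final:
            # A final result replaces whatever partial text was pending.
            self.committed.append(event.text)
            self.pending = ""
        else:
            # A newer partial result supersedes the previous partial.
            self.pending = event.text

    def text(self) -> str:
        parts = self.committed + ([self.pending] if self.pending else [])
        return " ".join(parts)
```

Keeping partial text separate from committed text lets a user interface show provisional words immediately and overwrite them once the final segment arrives.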
Real‑World Use Cases
- Interactive Voice Assistants: Embed the server in a chatbot that can respond to spoken commands while maintaining conversational context.
- Live Captioning: Provide real‑time captions for webinars or video calls, with the ability to adjust language settings on demand.
- Transcription‑Driven Workflows: Trigger automated tasks (e.g., creating meeting notes, updating CRM records) as soon as a specific phrase is spoken.
- Multilingual Support: Detect the user’s language and route the request to the appropriate model or translate on the fly, all within a single MCP session.
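The transcription-driven workflow idea can be sketched as a small phrase-trigger registry. Everything here (`PhraseTriggers`, `normalize`, the example phrase) is hypothetical and independent of the server's actual API; it only shows how an application might react once a transcript contains a given phrase.

```python
import re
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


class PhraseTriggers:
    """Maps spoken phrases to callbacks (illustrative helper, not part of the server)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], None]] = {}

    def on(self, phrase: str, handler: Callable[[str], None]) -> None:
        self._handlers[normalize(phrase)] = handler

    def feed(self, transcript: str) -> List[str]:
        """Run every handler whose phrase occurs as whole words; return fired phrases."""
        padded = f" {normalize(transcript)} "
        fired = []
        for phrase, handler in self._handlers.items():
            if f" {phrase} " in padded:
                handler(transcript)
                fired.append(phrase)
        return fired
```

Padding with spaces keeps the match on word boundaries, so a trigger such as "note" would not fire on "notebook".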
Integration with AI Workflows
Because it follows the Model Control Protocol, the server can be invoked by any MCP‑compliant client. Its three core message types map cleanly onto a conversational state machine, allowing AI assistants to treat transcription as another tool in their arsenal. The WebSocket endpoint integrates with event‑driven frameworks, while the two REST endpoints provide a fallback for batch processing or health monitoring.
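To make the state-machine idea concrete, here is a minimal sketch of a session lifecycle driven by message types. The message names used here (`init`, `start`, `audio`, `stop`) and the three states are placeholders invented for this example; the real message names are not given in this description.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    READY = "ready"
    STREAMING = "streaming"


# Allowed transitions keyed by (current state, message type).
# The message type strings are hypothetical placeholders.
TRANSITIONS = {
    (State.IDLE, "init"): State.READY,
    (State.READY, "start"): State.STREAMING,
    (State.STREAMING, "audio"): State.STREAMING,
    (State.STREAMING, "stop"): State.READY,
}


class Session:
    """Tracks where one transcription session is in its lifecycle."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def handle(self, message_type: str) -> State:
        key = (self.state, message_type)
        if key not in TRANSITIONS:
            raise ValueError(
                f"message {message_type!r} not allowed in state {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Rejecting out-of-order messages in one place keeps both client and server logic simple: audio is only accepted while a session is streaming, and a stopped session can be started again without losing its context.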
Distinctive Advantages
- Unified Protocol: Transcription sessions are expressed entirely through MCP messages, so clients do not have to coordinate separate HTTP and WebSocket APIs while streaming.
- Low Latency: By streaming audio and returning partial results, the server minimizes the time between speaking and seeing text.
- Extensibility: The modular design (protocol, types, ElevenLabs implementation) makes it straightforward to swap in other ASR backends or add custom processing steps.
- Developer Friendly: Built on FastAPI and Uvicorn, it offers automatic OpenAPI docs and hot‑reload support, reducing the barrier to experimentation.
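The extensibility point can be illustrated with a small backend interface. The names below (`TranscriptionBackend`, `EchoBackend`, `run_pipeline`) are assumptions made for this sketch and do not reflect the project's actual module layout.

```python
from typing import Callable, Iterable, Protocol


class TranscriptionBackend(Protocol):
    """Anything that can turn audio bytes into text (hypothetical interface)."""

    def transcribe(self, audio: bytes) -> str: ...


class EchoBackend:
    """Stand-in backend for local testing; it performs no real speech recognition."""

    def transcribe(self, audio: bytes) -> str:
        return f"<{len(audio)} bytes of audio>"


def run_pipeline(
    backend: TranscriptionBackend,
    audio: bytes,
    steps: Iterable[Callable[[str], str]] = (),
) -> str:
    """Transcribe with the given backend, then apply each post-processing step in order."""
    text = backend.transcribe(audio)
    for step in steps:
        text = step(text)
    return text
```

Because the pipeline depends only on the `transcribe` method, an ElevenLabs-backed class and a test double are interchangeable, and custom processing steps are ordinary functions from string to string.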
In summary, the ElevenLabs Scribe MCP Server equips AI assistants with robust, low‑latency speech transcription that is fully integrated into the MCP framework. Its real‑time streaming, context awareness, and rich feature set make it a compelling choice for developers building conversational applications that rely on voice input.