About
A FastMCP server that lets LLMs access and transcribe online videos from YouTube and Bilibili using WhisperX models, with automatic audio extraction, format conversion, and temporary file hosting via 0x0.st.
Overview
The MCP Transcribe Online Videos server addresses a common problem for developers building AI-powered content analysis tools: turning online video into structured, timestamped text. It exposes two tools through which a language model can request a transcription of a public YouTube or Bilibili video in a single call. Behind the scenes, the server downloads the video, extracts and normalizes its audio, and forwards it to a cloud-based WhisperX transcription service. The resulting transcript includes timestamps, so it can be used directly for downstream tasks such as summarization, keyword extraction, or subtitle generation.
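As a rough illustration of the first step, a server like this needs to decide whether a URL points at a supported platform before downloading anything. The sketch below is an assumption, not the server's actual code: the function name `classify_video_url` and the host lists are hypothetical and chosen only for illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: illustrative host lists, not taken from the server's source.
YOUTUBE_HOSTS = {"youtube.com", "youtu.be"}
BILIBILI_HOSTS = {"bilibili.com", "b23.tv"}


def classify_video_url(url: str) -> Optional[str]:
    """Return "youtube", "bilibili", or None for an unsupported URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    # Strip one common subdomain prefix ("www." or "m.") before matching.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    if host in YOUTUBE_HOSTS:
        return "youtube"
    if host in BILIBILI_HOSTS:
        return "bilibili"
    return None
```

Rejecting unsupported URLs up front keeps the download and transcription steps from running on input they cannot handle.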
For AI assistants, this removes the need for external web scraping or manual download steps: the assistant asks the MCP server to fetch and transcribe a video on demand. The transcription output is returned as JSON, ready for further processing by the assistant or other services. This suits workflows in which content creators analyze large volumes of video, educators generate study notes from lectures, or researchers gather data from media archives.
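To show how timestamped JSON output might feed subtitle generation, here is a minimal sketch that converts transcript segments into SRT text. It assumes a response shaped like `{"segments": [{"start": ..., "end": ..., "text": ...}]}` with times in seconds; that schema and the function names are assumptions for illustration and may not match what this server returns.

```python
import json


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from a list of {"start", "end", "text"} dicts."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical response payload, used only to exercise the functions.
payload = '{"segments": [{"start": 0.0, "end": 1.25, "text": " Hello "}]}'
srt_text = segments_to_srt(json.loads(payload)["segments"])
```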
Key features of the server include:
- Automatic audio extraction: The server uses FFmpeg to pull the audio track from any supported video URL, converting it into a format suitable for WhisperX.
- Timestamped output: Each utterance is paired with start and end times, enabling fine‑grained alignment or subtitle generation.
- Cloud transcription: Leveraging Replicate’s WhisperX models offloads compute from the client, allowing even modest hardware to process long videos efficiently.
- Temporary file hosting: Extracted audio files are uploaded to a 0x0.st instance for temporary hosting, so the server does not have to keep long recordings in local storage.
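The audio extraction step in the list above can be sketched as an FFmpeg invocation. This is a minimal illustration under stated assumptions: the helper name `build_ffmpeg_args`, the file names, and the choice of 16 kHz mono WAV (a common input format for Whisper-family models) are not taken from the server's code.

```python
from typing import List


def build_ffmpeg_args(source: str, target: str, sample_rate: int = 16000) -> List[str]:
    """Build an FFmpeg command that extracts a mono audio track.

    -y   overwrite the target file if it exists
    -vn  drop the video stream
    -ac  number of audio channels (1 = mono)
    -ar  audio sample rate in Hz
    """
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-vn",
        "-ac", "1",
        "-ar", str(sample_rate),
        target,
    ]


# Hypothetical file names; running the command requires FFmpeg on PATH.
args = build_ffmpeg_args("video.mp4", "audio.wav")
```

The resulting list can be passed to `subprocess.run(args, check=True)`; building it as a list rather than a shell string avoids quoting problems with unusual file names.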
Typical use cases span multiple domains. In education, a virtual teaching assistant could transcribe lecture recordings from YouTube and provide instant summaries or Q&A support. Content creators might generate searchable captions for their channels, improving discoverability and accessibility. Researchers can automate the extraction of spoken data from media archives for linguistic or sociological studies. Because the MCP interface is lightweight, these tools can be chained with other AI services—such as sentiment analysis or topic modeling—to build sophisticated, end‑to‑end pipelines.
Integration with existing AI workflows is straightforward. A developer can point an MCP client at the server's URL and invoke the transcription tools as part of a larger prompt or task. The server is built on FastMCP, and its environment configuration allows swapping out file storage backends or adding new media sources. The roadmap hints at future enhancements: metadata extraction, local transcription options, and broader platform support.
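At the protocol level, invoking a tool on an MCP server is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request by hand to show its shape. The tool name `transcribe_video` and the `url` argument are hypothetical placeholders, since the actual tool names are not given here; in practice an MCP client library would construct and send this message.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and argument, for illustration only.
request = build_tool_call(
    1,
    "transcribe_video",
    {"url": "https://www.youtube.com/watch?v=abc"},
)
```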
In summary, the MCP Transcribe Online Videos server provides a turnkey solution for converting online video content into structured text, empowering AI assistants to offer richer media‑aware services without the overhead of manual data preparation.
Related Servers
n8n
Self‑hosted, code‑first workflow automation platform
FastMCP
TypeScript framework for rapid MCP server development
Activepieces
Open-source AI automation platform for building and deploying extensible workflows
MaxKB
Enterprise‑grade AI agent platform with RAG and workflow orchestration.
Filestash
Web‑based file manager for any storage backend
MCP for Beginners
Learn Model Context Protocol with hands‑on examples