Mcp Veo2 Video Generation Server

MCP Server

Generate videos from text or images using Google Veo2

About

This MCP server exposes Google’s Veo2 model, enabling clients to create short videos from textual prompts or images. It offers both stdio and SSE transports, storing results in a configurable directory.

Capabilities

Resources: Access data sources
Tools: Execute functions
Prompts: Pre-built templates
Sampling: AI model interactions

Video Generation with Veo2 MCP server

The Mcp Veo2 server exposes Google's Veo2 video generation model as an MCP service. Because the model is reachable through well-defined tools and resources, AI assistants such as Claude can create video content directly from natural-language or image inputs without leaving the MCP ecosystem. This lets developers embed short video generation into chat flows, content-creation pipelines, or interactive storytelling experiences.

At its core, the server offers two primary tools: one that generates a video from a text prompt and one that generates a video from an image. Clients supply a prompt or an image file and can optionally adjust settings such as aspect ratio, duration, and the person-generation policy. The server forwards these parameters to the Veo2 model via Google's Gemini API, then returns the video as a URL-accessible resource. Because the output is exposed as an MCP resource, downstream tools can consume or transform the video (for example, cropping it, adding captions, or uploading it to a CDN) without manual file handling.
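As a rough sketch of what a client request to the text-to-video tool might look like, the snippet below builds an MCP `tools/call` JSON-RPC message. The `tools/call` envelope follows the MCP specification, but the tool name `generate_video_from_text`, the argument names (`prompt`, `aspectRatio`, `durationSeconds`, `personGeneration`), and the allowed values are illustrative assumptions that this page does not confirm; the server's published tool schema is the authority.

```python
import json

# Assumed option values for illustration only; consult the server's tool schema.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
ALLOWED_PERSON_POLICIES = {"dont_allow", "allow_adult"}


def build_tool_call(request_id, prompt, aspect_ratio="16:9",
                    duration_seconds=5, person_generation="dont_allow"):
    """Build a JSON-RPC 2.0 `tools/call` request for a hypothetical text-to-video tool."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if person_generation not in ALLOWED_PERSON_POLICIES:
        raise ValueError(f"unsupported person generation policy: {person_generation}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "generate_video_from_text",  # hypothetical tool name
            "arguments": {
                "prompt": prompt,
                "aspectRatio": aspect_ratio,
                "durationSeconds": duration_seconds,
                "personGeneration": person_generation,
            },
        },
    }


if __name__ == "__main__":
    request = build_tool_call(1, "A paper boat drifting down a rainy street")
    print(json.dumps(request, indent=2))
```

Validating the optional settings on the client side is a design choice in this sketch: it surfaces a bad aspect ratio or policy before a paid API call is made.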

Key capabilities include:

Text‑to‑video generation, enabling creative visual storytelling from a single sentence or paragraph.
Image‑to‑video conversion, turning still photos into animated clips with realistic motion and lighting.
Configurable video properties (aspect ratio, length, prompt enhancement) that give developers fine control over the final output.
Support for both stdio and Server‑Sent Events (SSE) transports, allowing flexible integration with CLI tools or web‑based workflows.
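To make the SSE transport a little more concrete, here is a minimal sketch of how a client could split a Server-Sent Events text stream into events. It follows the general SSE framing rules (fields such as `event:` and `data:`, with a blank line ending each event) and is not taken from this server's code; the sample event name and payloads are made up for illustration.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


if __name__ == "__main__":
    sample = [
        "event: endpoint\n",
        "data: /messages?session=abc\n",
        "\n",
        ": keep-alive\n",
        'data: {"status": "done"}\n',
        "\n",
    ]
    for name, payload in parse_sse(sample):
        print(name, payload)
```

With the stdio transport no such framing is needed on the client side, since messages are exchanged over the server process's standard input and output.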

Possible use cases include a marketing chatbot that produces short product demo videos on demand, an educational assistant that animates lecture slides into visual summaries, and a game narrative engine that generates cut-scene footage from story prompts. Because the server follows MCP standards, it plugs into existing AI pipelines such as FLUJO or Smithery, letting developers compose multi-step workflows in which a text prompt triggers video generation and the result is then passed to a summarization or translation tool.

What sets Mcp Veo2 apart is its turnkey integration with Google's paid API ecosystem. Developers only need a Gemini API key; the server handles authentication, request routing, and resource management automatically. The result is a lightweight, plug-and-play MCP service that lets an AI assistant generate video with minimal setup.
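A minimal sketch of the kind of startup configuration this implies is shown below: an API key, an output directory, and a transport choice read from the environment. The variable names `GEMINI_API_KEY`, `VIDEO_OUTPUT_DIR`, and `MCP_TRANSPORT`, as well as the default directory, are assumptions for illustration and may not match the names the server actually uses.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read hypothetical server settings from environment-style variables."""
    env = os.environ if env is None else env

    # The key is the only required setting in this sketch.
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set")

    # Assumed variable name and default for the configurable results directory.
    output_dir = Path(env.get("VIDEO_OUTPUT_DIR", "./generated-videos"))

    # The page mentions two transports: stdio and SSE.
    transport = env.get("MCP_TRANSPORT", "stdio").lower()
    if transport not in {"stdio", "sse"}:
        raise ValueError(f"unsupported transport: {transport}")

    return {"api_key": api_key, "output_dir": output_dir, "transport": transport}


if __name__ == "__main__":
    config = load_config({"GEMINI_API_KEY": "example-key", "MCP_TRANSPORT": "sse"})
    print(config["transport"], config["output_dir"])
```

Passing the environment in as a dictionary keeps the function easy to test without touching the real process environment.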