By jackwotherspoon

MCP on Cloud Run

MCP Server

Deploy a scalable, secure Model Context Protocol server to Google Cloud Run

Updated Sep 17, 2025

About

This repository demonstrates how to host a remote MCP server on Cloud Run, enabling scalable, centralized access for AI tools and agents while enforcing authentication.

Capabilities

  • Resources: access data sources
  • Tools: execute functions
  • Prompts: pre-built templates
  • Sampling: AI model interactions

mcp-on-cloudrun

The MCP on Cloud Run project demonstrates how to host a remote Model Context Protocol (MCP) server on Cloud Run, Google Cloud's fully managed container platform. By moving the MCP server from a local process to a stateless container service, developers eliminate the overhead of maintaining infrastructure while still benefiting from MCP's tool-and-resource orchestration for large language models. The approach is especially valuable for teams that rely on code assistants, agentic workflows, or any application that needs to attach prompts, external APIs, and dynamic data sources to an LLM without each developer running and maintaining those connections locally.

At its core, the MCP server exposes HTTP endpoints through which clients discover and invoke context: the prompts, resources, and tools that enrich an LLM's responses. Cloud Run's automatic scaling ensures that the server can handle sudden spikes in traffic, such as a burst of code-review requests or an agentic routine that interacts with multiple APIs. Because the server is centrally hosted, all developers on a team can point their local MCP clients at the same endpoint; any updates or new tool integrations propagate instantly, eliminating version drift and simplifying collaboration.
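
To make the shape of such a server concrete, here is a minimal sketch using the FastMCP class from the official MCP Python SDK, run over the streamable HTTP transport. The service name, the add tool, and the port handling are illustrative assumptions, not code from the repository:

    import os

    from mcp.server.fastmcp import FastMCP

    # Cloud Run injects the PORT environment variable into the container;
    # default to 8080, the conventional Cloud Run port, for local runs.
    mcp = FastMCP(
        "cloud-run-demo",
        host="0.0.0.0",
        port=int(os.environ.get("PORT", 8080)),
    )

    @mcp.tool()
    def add(a: int, b: int) -> int:
        """Add two integers (an illustrative tool)."""
        return a + b

    if __name__ == "__main__":
        # Serve over the streamable HTTP transport so remote clients can connect.
        mcp.run(transport="streamable-http")

Reading the port from PORT rather than hard-coding it is what lets the same container image run unchanged on Cloud Run and on a developer's machine.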

Security is a primary concern for remote MCP deployments. Cloud Run can enforce authenticated requests through its built-in IAM controls, which require callers to hold the Cloud Run Invoker role and present a valid ID token, or through Identity-Aware Proxy (IAP), ensuring that only authorized clients can invoke LLM calls. Without this protection, the server would be publicly reachable and vulnerable to misuse or abuse of the underlying language model. The repository's documentation highlights this risk and encourages developers to enable authentication before exposing the service.
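
As an illustration of the IAM path, the sketch below mints a Google-signed ID token and attaches it to a request. The service URL is hypothetical, and the snippet assumes Application Default Credentials backed by a service account (the google-auth and requests packages supply the calls used here):

    import requests
    import google.auth.transport.requests
    import google.oauth2.id_token

    # Hypothetical Cloud Run URL; replace with your deployed service's URL.
    SERVICE_URL = "https://mcp-server-abc123-uc.a.run.app"

    # Mint an ID token whose audience is the Cloud Run service. This requires
    # service-account-backed Application Default Credentials.
    auth_request = google.auth.transport.requests.Request()
    token = google.oauth2.id_token.fetch_id_token(auth_request, SERVICE_URL)

    # Any MCP client can attach the token as a Bearer header; as a smoke test,
    # an unauthenticated request to a locked-down service is rejected with 403.
    response = requests.get(
        SERVICE_URL, headers={"Authorization": f"Bearer {token}"}
    )
    print(response.status_code)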

Key capabilities showcased in this sample include:

  • Transport flexibility: MCP's streamable HTTP and SSE transports enable real-time streaming of LLM responses over HTTP, making the remote server compatible with a wide range of client libraries.
  • Tool chaining: the server can expose custom tools (e.g., database queries, API calls) that the LLM can invoke during generation, allowing for dynamic, data-driven interactions; see the client sketch after this list.
  • Resource management: Prompt templates and external knowledge bases can be stored on the server, ensuring consistent context for all requests.
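
The client side of tool chaining might look like the following sketch, which uses the MCP Python SDK's streamable HTTP client to list and call tools on the remote server. The URL is hypothetical, the /mcp path is the SDK's default mount point for this transport, and the add tool refers back to the illustrative server above:

    import asyncio

    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    # Hypothetical endpoint; an Authorization header could be passed via the
    # headers argument of streamablehttp_client for an authenticated service.
    MCP_URL = "https://mcp-server-abc123-uc.a.run.app/mcp"

    async def main() -> None:
        async with streamablehttp_client(MCP_URL) as (read, write, _):
            async with ClientSession(read, write) as session:
                await session.initialize()
                # Discover the tools the remote server exposes.
                tools = await session.list_tools()
                print("tools:", [t.name for t in tools.tools])
                # Invoke the illustrative `add` tool from the server sketch.
                result = await session.call_tool("add", {"a": 2, "b": 3})
                print("result:", result.content)

    asyncio.run(main())

Because every developer's client talks to the same endpoint, a tool added on the server shows up in list_tools for the whole team without any local changes.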

Typical use cases involve integrating the MCP server into IDE extensions that provide on‑the‑fly code suggestions, building autonomous agents that orchestrate multiple APIs (such as GitHub, Jira, or cloud services), or deploying a shared LLM service for enterprise teams that need to enforce consistent policies and monitoring. By centralizing the MCP server on Cloud Run, organizations gain scalability, consistency, and security—all while keeping the developer experience seamless.