MCP Selenium Grid

MCP Server

Scalable browser automation via MCP

Updated 17 days ago

About

A Model Context Protocol server that creates and manages Selenium Grid browser instances in Docker or Kubernetes, enabling AI agents and automation tools to run multi‑browser tests with token‑based security.

Capabilities

Resources: access data sources

Tools: execute functions

Prompts: pre-built templates

Sampling: AI model interactions

Overview

The MCP Selenium Grid server is a Model Context Protocol (MCP) implementation that exposes a web‑based API for launching, controlling, and terminating browser instances on Docker or Kubernetes backends. Because it wraps Selenium Grid behind MCP endpoints, AI assistants and automation agents can request a fresh browser session with a single JSON payload instead of hand-written Docker commands or Kubernetes manifests. This addresses a common pain point in test pipelines and exploratory AI workflows, where a stateless agent needs to spin up isolated browsers on demand.
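To make the "single JSON payload" concrete, here is a minimal sketch of the kind of request an MCP client might send. MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method; the tool name `create_browser` and its arguments are hypothetical and are not taken from this server's documentation.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool_name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool via `tools/call`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
payload = build_tool_call(1, "create_browser", {"browser": "chrome", "count": 1})
print(json.dumps(payload, indent=2))
```

The envelope (`jsonrpc`, `id`, `method`, `params`) is fixed by the protocol; only the tool name and its arguments would change from one server to another.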

At its core, the server offers multi‑browser support for Chrome, Firefox, and Edge. Each browser instance is provisioned in a lightweight container or pod that includes the appropriate WebDriver and optional VNC access. The MCP endpoints expose standard browser actions that let agents perform navigation, DOM inspection, or visual validation without leaving the MCP context. Token‑based authentication secures these operations, ensuring that only authorized clients can manipulate browser sessions.
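The token check described above can be sketched as follows. This assumes the token travels in an `Authorization: Bearer ...` header, which is a common convention but is not confirmed by this listing; the function names are hypothetical.

```python
import hmac
import secrets


def make_auth_headers(token: str) -> dict[str, str]:
    """Client side: attach the token as a Bearer credential (assumed scheme)."""
    return {"Authorization": f"Bearer {token}"}


def is_authorized(headers: dict[str, str], expected_token: str) -> bool:
    """Server side: accept the request only if the presented token matches."""
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not presented:
        return False
    # compare_digest takes time independent of where the strings differ,
    # which avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


server_token = secrets.token_urlsafe(32)  # random token generated at startup
good = is_authorized(make_auth_headers(server_token), server_token)
bad = is_authorized(make_auth_headers("wrong-token"), server_token)
print(good, bad)  # True False
```

Using `hmac.compare_digest` rather than `==` is a design choice that matters for any shared-secret check, regardless of how the real server transports its token.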

The dual deployment mode is a standout feature. In Docker mode, the server pulls minimal base images and spins up containers on demand, making it ideal for local testing or CI environments. In Kubernetes mode (supported via K3s or any compliant cluster), the server orchestrates pods, automatically scaling the number of concurrent sessions and cleaning up resources when agents finish. This flexibility allows teams to start small on a single machine and scale out to a full cluster without changing the MCP client configuration.
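The idea that the backend can change while the client-facing behavior stays the same can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the server's implementation: `DeploymentMode`, `SessionPool`, and the session ID format are all hypothetical, and no containers or pods are actually started.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeploymentMode(Enum):
    DOCKER = "docker"
    KUBERNETES = "kubernetes"


@dataclass
class SessionPool:
    """Tracks active browser sessions and enforces a concurrency limit."""
    mode: DeploymentMode
    max_sessions: int
    active: set[str] = field(default_factory=set)
    _counter: int = 0

    def acquire(self, browser: str) -> str:
        if len(self.active) >= self.max_sessions:
            raise RuntimeError("no free browser slots")
        self._counter += 1
        # Only the unit of isolation differs between the two backends.
        unit = "container" if self.mode is DeploymentMode.DOCKER else "pod"
        session_id = f"{browser}-{unit}-{self._counter}"
        self.active.add(session_id)
        return session_id

    def release(self, session_id: str) -> None:
        # Cleanup when an agent finishes; releasing twice is harmless.
        self.active.discard(session_id)


pool = SessionPool(mode=DeploymentMode.DOCKER, max_sessions=2)
first = pool.acquire("chrome")
second = pool.acquire("firefox")
pool.release(first)            # frees a slot for the next request
third = pool.acquire("edge")
print(first, second, third, sorted(pool.active))
```

Switching `mode` to `DeploymentMode.KUBERNETES` changes only what is provisioned behind `acquire`, which mirrors the claim that clients need no reconfiguration when moving from one machine to a cluster.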

Developers using AI assistants benefit from seamless integration with existing MCP workflows. An assistant can request a new browser, perform actions, and retrieve screenshots, all through the same MCP interface that it uses for other tools. Because the server adheres to the MCP specification, any compliant client can interact with it, fostering a plug‑and‑play ecosystem. Real‑world scenarios include automated UI testing triggered by an AI bot, data extraction from dynamic web pages during knowledge‑base construction, or live debugging sessions where the assistant needs to open a browser context to validate a hypothesis.

In summary, the MCP Selenium Grid server bridges AI agents with robust, scalable browser automation. Its clean API, multi‑backend support, and secure operation make it a powerful addition to any developer’s toolchain that relies on MCP for orchestrating external capabilities.