About
A lightweight MCP server that enables programmatic control of a computer’s mouse, keyboard, and screen. It offers screenshot capture, OCR extraction, window management, and drag‑and‑drop actions, and it needs no external binaries beyond its Python packages.
Capabilities

The Computer Control MCP bridges the gap between conversational AI and real‑world desktop interaction. By exposing a rich set of tools that mimic the core functions of a human operator—mouse movement, keyboard entry, screen capture, and OCR—the server enables AI assistants to manipulate applications, automate workflows, and extract information directly from the user’s environment. This is particularly valuable for developers building AI‑powered productivity agents, remote support bots, or automated testing suites that must act on a live desktop rather than a simulated environment.
At its core, the server implements a straightforward set of actions that map cleanly onto common GUI operations. Mouse tools allow precise clicks, drags, and button state control; keyboard utilities enable typing arbitrary text or pressing individual keys. Screen tools provide full‑screen or window‑specific screenshots, while the integrated OCR engine (RapidOCR on ONNXRuntime) can pull textual data from those images, returning both the extracted string and its coordinates. Window management commands list open windows and bring a chosen window to the foreground, making it trivial for an AI agent to switch context or target a specific application.
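Because OCR returns coordinates along with text, a client can turn a recognized label into a click target. The following is a minimal sketch, not the server's own code: it assumes each OCR hit arrives as a four‑corner box, a string, and a confidence score (a shape similar to what RapidOCR reports), and the names `OcrHit`, `box_center`, and `find_text` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Point = Tuple[float, float]


class OcrHit(NamedTuple):
    """One recognized text region (assumed shape, for illustration only)."""
    box: Sequence[Point]  # four corner points of the text region
    text: str             # recognized string
    score: float          # recognition confidence in [0, 1]


def box_center(box: Sequence[Point]) -> Tuple[int, int]:
    """Return the integer pixel at the centroid of the box corners."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    return round(sum(xs) / len(xs)), round(sum(ys) / len(ys))


def find_text(
    hits: Sequence[OcrHit], query: str, min_score: float = 0.5
) -> Optional[Tuple[int, int]]:
    """Return the center of the first confident hit containing `query`."""
    needle = query.casefold()
    for hit in hits:
        if hit.score >= min_score and needle in hit.text.casefold():
            return box_center(hit.box)
    return None


# Hand-written sample data standing in for real OCR output.
sample_hits = [
    OcrHit([(10, 10), (110, 10), (110, 30), (10, 30)], "File", 0.98),
    OcrHit([(200, 400), (300, 400), (300, 440), (200, 440)], "Submit", 0.91),
]

target = find_text(sample_hits, "submit")  # (250, 420)
```

The resulting point could then be passed to a mouse tool; with PyAutoGUI directly, that would be a call such as `pyautogui.click(x, y)`.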
Developers can harness these capabilities in several real‑world scenarios. An AI assistant could navigate a spreadsheet, automatically fill out forms, or pull data from a web dashboard by first taking a screenshot and running OCR to locate the relevant fields. In testing, an agent could simulate user interactions across multiple applications, validate UI states via OCR, and report failures back to a CI pipeline. Remote support bots can guide users through complex setups by controlling the host machine, while ensuring that every action is logged and auditable.
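The testing scenario, validating UI state via OCR and reporting to CI, can be sketched as a plain text comparison. This is an illustrative assumption about how a test harness might consume OCR output, not part of the server: `missing_labels` and `exit_code` are hypothetical helpers, and the sample lines are made up.

```python
from typing import Iterable, List


def missing_labels(ocr_lines: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Return the expected labels that do not appear in the OCR text."""
    seen = " ".join(ocr_lines).casefold()
    return [label for label in expected if label.casefold() not in seen]


def exit_code(missing: List[str]) -> int:
    """Map the check result to a process exit code a CI job can act on."""
    return 1 if missing else 0


# Made-up OCR output from a screenshot of an order page.
ocr_lines = ["Order #1042", "Status: Shipped", "Total $19.99"]

missing = missing_labels(ocr_lines, ["Status: Shipped", "Refund"])
status = exit_code(missing)  # non-zero because "Refund" was not found
```

A substring match is deliberately loose; OCR output often contains small recognition errors, so a real harness may want fuzzy matching instead.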
Integration with existing MCP workflows is straightforward. The server registers its tools under standard names, so Claude or any other MCP client can discover and invoke them through the protocol's usual tool interface. Because it relies only on Python libraries (PyAutoGUI, RapidOCR, ONNXRuntime) and needs no external binaries, the server can be deployed in isolated environments or containerized setups without extra dependency management. This small dependency footprint also reduces attack surface and simplifies compliance checks.
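At the protocol level, discovery and invocation are JSON‑RPC 2.0 requests using the MCP methods `tools/list` and `tools/call`. The sketch below only builds the request payloads; the tool name `click_screen` and its `x`/`y` arguments are assumptions for illustration, so check the names the server actually reports from `tools/list`.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # simple monotonically increasing request ids


def rpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request for an MCP server."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = rpc_request("tools/list")

# Invoke one tool; the name and arguments here are hypothetical.
click = rpc_request(
    "tools/call",
    {"name": "click_screen", "arguments": {"x": 250, "y": 420}},
)
```

In practice an MCP client library handles this framing and the transport, so application code rarely builds these messages by hand.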
In summary, the Computer Control MCP turns an AI assistant into a fully functional desktop operator. Its blend of mouse, keyboard, screenshot, and OCR tools gives developers the means to automate complex GUI tasks, extract data on‑the‑fly, and build intelligent agents that interact with the real world as seamlessly as they converse.