NakaokaRei

Swift MCP GUI Server

Control macOS via SwiftAutoGUI with MCP

Updated 10 days ago

About

A Model Context Protocol server that lets you programmatically move the mouse, click, type keys, scroll, capture screenshots, and run AppleScript on macOS using SwiftAutoGUI.

Capabilities

Resources: access data sources
Tools: execute functions
Prompts: pre-built templates
Sampling: AI model interactions

Swift MCP GUI Server Overview

The Swift MCP GUI Server exposes a set of macOS GUI automation tools through the Model Context Protocol (MCP). An AI client, such as Claude or another MCP-compliant assistant, can use it to move the mouse, send keyboard shortcuts, capture the screen, and execute AppleScript code on a macOS machine. Because each low-level GUI action is exposed as a named tool with declared parameters, developers can script and automate user-interface interactions directly from conversational AI workflows.

At its core, the server provides ten tools that cover common desktop automation tasks. Clients can move the cursor to any screen coordinate, click the left or right button, send key combinations, and scroll up, down, left, or right. Beyond input simulation, the server offers tools for inspecting the screen: reading the color of a pixel, querying the screen dimensions, and capturing the full screen or a region of it. Each capture tool returns a base64-encoded JPEG image; its quality and scale parameters let the client shrink the payload and so reduce the risk of network timeouts. Screenshots can also be saved locally, optionally cropped to a region, which is useful for logging or visual validation.
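
As a rough sketch of what a client sends and receives, the snippet below builds a JSON-RPC 2.0 `tools/call` request (the method MCP uses to invoke a tool) and decodes image content items from a result. The tool name `captureScreen` and its `quality` and `scale` arguments are hypothetical placeholders, not confirmed names from this server; the server's `tools/list` response is the place to find the real names and schemas.

```python
import base64
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return a JSON-RPC 2.0 request string for MCP's tools/call method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def extract_images(result):
    """Decode every base64 image item in a tools/call result into raw bytes."""
    images = []
    for item in result.get("content", []):
        if item.get("type") == "image":
            images.append((item.get("mimeType"), base64.b64decode(item["data"])))
    return images


# Hypothetical tool name and argument names, for illustration only.
request = build_tool_call(1, "captureScreen", {"quality": 0.5, "scale": 0.5})

# A hand-made result in the shape MCP uses for image content,
# standing in for a real server reply.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
sample_result = {
    "content": [
        {
            "type": "image",
            "mimeType": "image/jpeg",
            "data": base64.b64encode(fake_jpeg).decode("ascii"),
        }
    ]
}
images = extract_images(sample_result)
```

Base64 encodes every 3 bytes of input as 4 characters of output, so an encoded screenshot is about a third larger than the JPEG itself. That overhead is one reason lowering quality or scale can matter for large captures.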

AppleScript execution is one of the server's most useful features. A tool that accepts raw AppleScript code gives clients access to native macOS capabilities, such as manipulating Finder windows, controlling System Settings, or automating workflows that would otherwise require third-party scripting frameworks. The response includes the script's return value when there is one, so the AI client gets immediate feedback on the outcome of its command.
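
A minimal sketch of such a call follows. It assumes the tool is named `executeAppleScript` and takes a single `script` string argument (both names are assumptions, not taken from the server), and that the return value comes back as MCP text content.

```python
import json


def applescript_call(request_id, script):
    """Build a tools/call request carrying raw AppleScript source."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            # Hypothetical tool and argument names.
            "name": "executeAppleScript",
            "arguments": {"script": script},
        },
    }


def text_of(response):
    """Join the text items of a tools/call response.

    Raises RuntimeError if the result is flagged with isError.
    """
    result = response["result"]
    text = "\n".join(
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(text or "tool call failed")
    return text


script = 'tell application "Finder" to get name of front window'
request = applescript_call(2, script)
wire = json.dumps(request)  # what would be written to the transport

# Hand-written response in the MCP result shape, standing in for a real reply.
sample_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [{"type": "text", "text": "Documents"}],
        "isError": False,
    },
}
value = text_of(sample_response)
```

Checking `isError` before using the text keeps a failed script from being mistaken for a legitimate return value.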

In practice, the server is a building block for automation assistants. For example, a customer support bot could guide a user through a multi-step setup by moving the mouse to specific UI elements, clicking buttons, and verifying visual states via screenshots. A QA engineer could write an MCP client that automatically tests the UI of a macOS application, capturing screenshots before and after each action and comparing them pixel by pixel. Developers building end-to-end testing pipelines can integrate the server into CI workflows, using AI to generate test scripts that interact with desktop applications without manual intervention.
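
The before-and-after screenshot comparison can be sketched as a simple per-pixel diff. The example assumes both screenshots have already been decoded into equal-sized grids of (R, G, B) tuples; decoding JPEG requires an imaging library and is left out. The `tolerance` parameter is an illustrative choice, included because JPEG compression introduces small color shifts even when nothing visible has changed.

```python
def changed_fraction(before, after, tolerance=0):
    """Return the fraction of pixels where any channel differs by more than tolerance."""
    same_shape = len(before) == len(after) and all(
        len(a) == len(b) for a, b in zip(before, after)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total = 0
    changed = 0
    for row_before, row_after in zip(before, after):
        for p, q in zip(row_before, row_after):
            total += 1
            if any(abs(c1 - c2) > tolerance for c1, c2 in zip(p, q)):
                changed += 1
    return changed / total if total else 0.0


white = (255, 255, 255)
red = (255, 0, 0)
before = [[white, white], [white, white]]
after = [[white, red], [white, (250, 255, 255)]]

strict = changed_fraction(before, after)      # counts both altered pixels
lenient = changed_fraction(before, after, 8)  # ignores the small 5-level shift
```

A test could then fail when the changed fraction exceeds a chosen threshold, rather than requiring the two images to match exactly.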

The server is written in Swift and requires macOS 15 or later, so it runs natively and integrates closely with the operating system. Because it speaks MCP, any client that supports the protocol can use these tools, which makes the server usable across different AI platforms. In summary, Swift MCP GUI Server lets AI assistants drive macOS through high-level GUI automation, orchestrated through natural language or scripted commands.