About
The Peekaboo MCP Server provides lightning‑fast screen captures, AI image analysis, and full GUI automation for macOS. It enables AI assistants to interact with any app using natural language commands and precise UI element detection.
Overview
Peekaboo is a macOS‑centric Model Context Protocol (MCP) server that transforms raw visual information into actionable AI context. It resolves a long‑standing gap in AI workflows: the ability to see what’s on the screen and then act upon it without leaving the assistant. By exposing a rich set of GUI‑related capabilities—fast screenshots, AI vision analysis, and full GUI automation—Peekaboo lets developers embed visual intelligence into their AI agents with minimal friction.
What Problem Does Peekaboo Solve?
Traditional AI assistants operate purely on textual input and output, making it difficult to interact with desktop applications that rely on visual cues. Developers often resort to brittle scripting or manual workarounds when an assistant needs to open a file, click a button, or read data from a graph. Peekaboo eliminates this friction by providing an MCP interface that offers instant, reliable access to the screen state and precise control over GUI elements. This enables agents that can, for example, parse a spreadsheet directly from the display or automatically fill out forms in a native app—all within a single conversation.
Core Capabilities and Value
- Lightning‑fast screenshot capture of windows, screens, or custom regions without disrupting focus.
- AI‑powered image analysis that supports GPT‑4.1 Vision, Claude, Grok, or local Ollama models, turning pixel data into structured text.
- Full GUI automation (v3) with click, type, scroll, and drag primitives that work on any macOS application.
- Natural‑language automation via an embedded AI agent that interprets commands like “Open TextEdit and write a poem.”
- Smart UI element detection that maps buttons, text fields, links, and menu items to coordinates, enabling zero‑click extraction of menus and shortcuts.
- Multi‑screen awareness for window placement and display management.
- Privacy‑first design with optional local inference, keeping visual data on the machine.
These features give developers a single, unified API to observe and control the desktop, dramatically reducing the boilerplate needed for visual AI tasks.
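As a rough sketch of how an MCP client invokes a capability like screenshot capture: MCP uses JSON‑RPC 2.0, and tools are called via the `tools/call` method. The tool name `image` and its arguments below are illustrative placeholders, not Peekaboo's confirmed schema.

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request, as used by MCP clients."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical capture request: grab the frontmost window as a PNG.
req = make_tool_call(1, "image", {"app_target": "frontmost", "format": "png"})
print(req)
```

The same request shape covers every tool the server exposes; only `name` and `arguments` change between observation and manipulation calls.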
Use Cases & Real‑World Scenarios
- Automated UI testing: Agents can capture screenshots of test runs, analyze error dialogs with vision models, and automatically click “Retry” or “Close.”
- Data extraction from legacy apps: A bot can read tables rendered in a proprietary desktop app, convert them to CSV, and pass the data back into a workflow.
- Assistive technology: Vision‑enabled assistants can describe screen content to visually impaired users or perform actions on their behalf.
- Developer tooling: IDE extensions like Cursor can leverage Peekaboo to let an assistant navigate the UI of external tools, install plugins, or trigger builds.
- Remote support: Agents can provide step‑by‑step guidance by capturing the current screen, analyzing it, and instructing users with precise click coordinates.
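The "precise click coordinates" idea above can be sketched as follows. The element records here use a hypothetical shape (`role`, `label`, and an `(x, y, width, height)` frame), not Peekaboo's actual output format:

```python
def click_point(element: dict) -> tuple[float, float]:
    """Return the center of an element's (x, y, width, height) bounding box."""
    x, y, w, h = element["frame"]
    return (x + w / 2, y + h / 2)

# Hypothetical detection output for an error dialog.
elements = [
    {"role": "button", "label": "Retry", "frame": (520, 380, 80, 28)},
    {"role": "button", "label": "Close", "frame": (620, 380, 80, 28)},
]

# Find the "Retry" button and compute where to click.
retry = next(e for e in elements if e["label"] == "Retry")
print(click_point(retry))  # (560.0, 394.0)
```

Mapping labels to coordinates this way is what lets an agent act on "click Retry" without the user ever specifying pixel positions.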
Integration into AI Workflows
Peekaboo is designed to plug directly into existing MCP‑compatible assistants. An agent can request a screenshot, feed it to an on‑device or cloud vision model, and then issue automated actions—all within the same conversational context. The server’s automatic session resolution ensures that commands always target the most recent window or application, removing the need for manual state tracking. Developers can chain commands into scripts, embed them in prompts, or expose them as custom tools for end‑users.
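The observe‑analyze‑act chain described above can be sketched as a minimal loop. The three helpers are stand‑ins for MCP tool calls (capture, vision analysis, click), not a real client:

```python
def capture_screen() -> bytes:
    return b"<png bytes>"           # stand-in for a screenshot tool call

def analyze(image: bytes, question: str) -> str:
    return "dialog: Save changes?"  # stand-in for a vision-model call

def click(label: str) -> str:
    return f"clicked {label}"       # stand-in for an automation tool call

def handle_dialog() -> str:
    """One observe -> analyze -> act cycle within a single conversation turn."""
    shot = capture_screen()
    description = analyze(shot, "What dialog is showing?")
    if "Save changes?" in description:
        return click("Don't Save")
    return "no action"

print(handle_dialog())  # clicked Don't Save
```

In a real agent, each helper would be a `tools/call` round trip; the control flow in between is all the client code the pattern requires.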
Unique Advantages
- Zero‑dependency on external services: All core functionality runs locally, preserving privacy and eliminating latency.
- Unified API surface: The same MCP endpoints cover both observation (screenshots, UI element lists) and manipulation (clicks, typing), simplifying client code.
- Performance‑oriented architecture: The native macOS app component delivers a 100× speed boost over pure CLI spawning, making real‑time interaction feasible.
- Extensible design: The PeekabooCore library can be reused in other projects, and new tools (e.g., additional AI models) can be added without altering the server contract.
In short, Peekaboo equips AI assistants with visual awareness and desktop control, turning passive conversation into active, context‑rich interaction on macOS.