About
A Python-based MCP server that builds a vector index of PDF (and, optionally, Markdown and plain-text) documents using OpenAI embeddings, enabling Zed's AI Assistant to retrieve relevant sections via semantic search.
Capabilities
PDF Search for Zed – MCP Server Overview
The PDF Search MCP server fills a common gap for developers building AI‑powered editing workflows: it turns static PDF documents into semantically searchable knowledge bases that can be injected directly into an AI assistant’s context. Keyword search over plainly extracted PDF text is brittle and misses paraphrased content; this server instead leverages OpenAI embeddings to convert document chunks into high‑dimensional vectors, enabling fuzzy semantic queries that surface the most relevant passages.
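To make the indexing step concrete, here is a minimal Python sketch of the chunk-and-embed pipeline. The chunk size and the text-embedding-3-small model are illustrative assumptions; the server's actual defaults are not documented here.

```python
# A minimal sketch of the indexing step. The 1,000-character chunk size and
# the text-embedding-3-small model are illustrative choices, not the server's
# documented defaults.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def chunk_text(text: str, size: int = 1000) -> list[str]:
    """Split extracted document text into fixed-size chunks."""
    return [text[i : i + size] for i in range(0, len(text), size)]


def embed_chunks(chunks: list[str]) -> list[list[float]]:
    """Embed each chunk into a high-dimensional vector with OpenAI."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=chunks,
    )
    return [item.embedding for item in response.data]
```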
At its core, the server exposes a simple “search” capability to Zed’s AI Assistant. Once a PDF (or a collection of PDFs and Markdown files) is indexed, the assistant can invoke a search against it: the server performs a vector similarity search, retrieves the top matching segments, and automatically injects them into the assistant’s prompt context. This tight integration means a developer can ask contextual questions about a design spec or legal contract while staying inside the editor, without manually copying text or opening separate tools.
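The retrieval side can be sketched as a plain cosine-similarity scan over the stored vectors. The function and variable names below are hypothetical, chosen only to illustrate the top-k lookup, not taken from the server's API.

```python
# A sketch of the retrieval step: cosine similarity between the query
# embedding and every stored chunk embedding, returning the top-k passages.
import numpy as np


def search(
    query_embedding: list[float],
    index: list[list[float]],
    chunks: list[str],
    top_k: int = 5,
) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = np.asarray(query_embedding)
    m = np.asarray(index)
    # Cosine similarity: dot products scaled by the vector norms.
    sims = (m @ q) / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))
    best = np.argsort(sims)[::-1][:top_k]
    return [chunks[i] for i in best]
```

For a modest corpus, a linear scan like this is fast enough; a dedicated vector database becomes worthwhile once the index spans many large documents.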
Key features include:
- Semantic indexing of PDFs: Documents are split into manageable chunks, embedded with OpenAI embeddings, and stored in a vector database.
- Support for multiple PDFs: A single index can contain dozens of files, allowing cross‑document queries.
- Optional support for other file formats: Markdown and plain text files can be indexed alongside PDFs, expanding the searchable corpus (see the ingestion sketch after this list).
- Customizable result size: Developers can tweak how many passages are returned to balance relevance and prompt length.
- Future‑proof architecture: Planned self‑contained embeddings and automated index building will reduce external dependencies.
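One plausible shape for the multi-format ingestion mentioned above, assuming the pypdf package for PDF text extraction; the server's actual extraction library is not stated, so treat this as a sketch.

```python
# A sketch of multi-format ingestion, assuming pypdf for PDF extraction;
# the actual server may use a different extraction library.
from pathlib import Path

from pypdf import PdfReader


def extract_text(path: Path) -> str:
    """Extract raw text from a PDF, Markdown, or plain-text file."""
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        reader = PdfReader(str(path))
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix in {".md", ".txt"}:
        return path.read_text(encoding="utf-8")
    raise ValueError(f"Unsupported file type: {path.suffix}")


def build_corpus(root: Path) -> dict[str, str]:
    """Walk a directory and collect text for every supported file."""
    return {
        str(p): extract_text(p)
        for p in root.rglob("*")
        if p.suffix.lower() in {".pdf", ".md", ".txt"}
    }
```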
Real‑world use cases abound: legal teams can query contractual clauses across thousands of PDFs; product managers can pull technical details from design documents; researchers can retrieve specific experiment results from archived reports, all without leaving the editor. By embedding search results directly into the AI’s context, the server removes friction from information retrieval and accelerates decision‑making.
Integration is straightforward: the MCP server runs as a Python process, while the Zed extension registers it in the editor’s configuration. Once configured, any AI prompt can invoke the server’s search, and the assistant behaves as if the relevant text had been typed manually. This seamless blend of local tooling, cloud embeddings, and AI context injection gives developers a powerful, editor‑native search layer that scales with their documentation needs.
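As a rough illustration of how such a server can expose its search over MCP, here is a hypothetical sketch built on the official mcp Python SDK's FastMCP helper. The tool name, parameters, and wiring are assumptions for illustration, not this server's documented interface; it reuses the embed_chunks and search functions from the sketches above.

```python
# A hypothetical sketch of exposing search as an MCP tool via the official
# `mcp` Python SDK (FastMCP). The tool name and signature are invented for
# illustration and are not this server's documented interface.
from mcp.server.fastmcp import FastMCP

# These globals would be populated by the indexing step at startup;
# embed_chunks() and search() come from the earlier sketches.
index: list[list[float]] = []
chunks: list[str] = []

mcp = FastMCP("pdf-search")


@mcp.tool()
def search_docs(query: str, top_k: int = 5) -> list[str]:
    """Embed the query and return the top_k most relevant passages."""
    query_embedding = embed_chunks([query])[0]
    return search(query_embedding, index, chunks, top_k)


if __name__ == "__main__":
    mcp.run()  # serves over stdio, so Zed can launch it as a subprocess
```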
Related Servers
MarkItDown MCP Server
Convert documents to Markdown for LLMs quickly and accurately
Context7 MCP
Real‑time, version‑specific code docs for LLMs
Playwright MCP
Browser automation via structured accessibility trees
BlenderMCP
Claude AI meets Blender for instant 3D creation
Pydantic AI
Build GenAI agents with Pydantic validation and observability
Chrome DevTools MCP
AI-powered Chrome automation and debugging
Explore More Servers
Confluence MCP Server
Seamless AI integration with Atlassian Confluence
JetBrains MCP Server Plugin
LLM integration for JetBrains IDEs via Model Context Protocol
Firebase MCP
AI-driven access to Firebase services
MCP ADR Analysis Server
AI-driven architectural decision analysis and ADR management
Bilka MCP Server
Bridging AI with public APIs effortlessly
Website Generator MCP Server
AI‑powered web page creation on demand