About
MCPDocSearch crawls websites into Markdown, chunks and embeds the content, then serves semantic search tools over MCP for agents like Cursor.
Capabilities
Overview
The MCPDocSearch server transforms static website documentation into a searchable knowledge base that can be queried directly by AI assistants such as Claude or Cursor. By crawling a target site, converting its pages to Markdown, and embedding the resulting text into vector space, the server turns any online documentation into a semantic search engine that can be accessed through the Model Context Protocol. This enables developers to let their AI agents answer questions about internal APIs, configuration guides, or product manuals without hard‑coding the knowledge into prompts.
At its core, MCPDocSearch offers two complementary components. First, a web crawler walks through the site hierarchy, filters URLs by depth or keyword, cleans extraneous HTML elements, and stitches all pages into a single Markdown file. Second, the MCP server loads these files from its output directory, splits them into meaningful chunks based on headings, and generates vector embeddings for each chunk. The resulting vectors are cached, so after the initial indexing the server can start almost instantly even for large documentation sets. The cache is refreshed automatically whenever any Markdown file changes, so updates to the source site are reflected in search results without manual intervention.
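A minimal sketch of that indexing step, assuming the sentence-transformers library, a `storage/` directory of crawled Markdown, and a pickle cache keyed by content hash; these names, the model choice, and the cache format are illustrative assumptions rather than the project's exact layout:

```python
import hashlib
import pickle
import re
from pathlib import Path

from sentence_transformers import SentenceTransformer  # assumed embedding backend

DOCS_DIR = Path("storage")           # assumption: directory holding crawled .md files
CACHE_FILE = Path("embeddings.pkl")  # assumption: on-disk embedding cache


def chunk_by_headings(markdown: str) -> list[str]:
    """Split a Markdown document into chunks at heading boundaries."""
    # Each chunk starts at a line beginning with one to six '#' characters.
    parts = re.split(r"(?m)^(?=#{1,6}\s)", markdown)
    return [p.strip() for p in parts if p.strip()]


def build_index() -> dict:
    """Embed every chunk, reusing cached vectors while a file's hash is unchanged."""
    cache = pickle.loads(CACHE_FILE.read_bytes()) if CACHE_FILE.exists() else {}
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: small general-purpose model
    index = {}
    for md_file in DOCS_DIR.glob("*.md"):
        text = md_file.read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode()).hexdigest()
        cached = cache.get(md_file.name)
        if cached and cached["digest"] == digest:
            index[md_file.name] = cached             # file unchanged: reuse embeddings
            continue
        chunks = chunk_by_headings(text)
        vectors = model.encode(chunks)               # re-embed only changed files
        index[md_file.name] = {"digest": digest, "chunks": chunks, "vectors": vectors}
    CACHE_FILE.write_bytes(pickle.dumps(index))
    return index
```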
The server exposes three intuitive MCP tools to clients (a registration sketch follows the list):
- A document-listing tool returns all crawled Markdown files, allowing an assistant to discover which documents are available.
- A document-structure tool provides the hierarchical heading outline of a selected document, useful for navigation or summarization.
- A semantic-search tool searches the embedded chunks and returns the most relevant passages along with their source context.
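Below is a hedged sketch of how tools like these could be registered with the FastMCP helper from the official MCP Python SDK; the tool names, the `storage/` path, and the embedding model are placeholders, since the project's actual identifiers are not reproduced on this page:

```python
from pathlib import Path

import numpy as np
from mcp.server.fastmcp import FastMCP
from sentence_transformers import SentenceTransformer

DOCS_DIR = Path("storage")                       # assumption: crawled .md files live here
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: same model as the indexing sketch
mcp = FastMCP("doc-search")


def load_chunks() -> list[dict]:
    """Collect heading-delimited chunks from every crawled file (no caching here)."""
    chunks = []
    for md_file in DOCS_DIR.glob("*.md"):
        current: list[str] = []
        for line in md_file.read_text(encoding="utf-8").splitlines():
            if line.startswith("#") and current:
                chunks.append({"file": md_file.name, "text": "\n".join(current)})
                current = []
            current.append(line)
        if current:
            chunks.append({"file": md_file.name, "text": "\n".join(current)})
    return chunks


CHUNKS = load_chunks()
VECTORS = model.encode([c["text"] for c in CHUNKS], normalize_embeddings=True)


@mcp.tool()
def list_documents() -> list[str]:
    """Return the names of all crawled Markdown files."""
    return sorted({c["file"] for c in CHUNKS})


@mcp.tool()
def get_document_structure(file_name: str) -> list[str]:
    """Return the heading outline of one crawled document."""
    text = (DOCS_DIR / file_name).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.startswith("#")]


@mcp.tool()
def search_documentation(query: str, top_k: int = 5) -> list[dict]:
    """Return the chunks whose embeddings are closest to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = VECTORS @ q                            # cosine similarity on normalized vectors
    best = np.argsort(scores)[::-1][:top_k]
    return [
        {"file": CHUNKS[i]["file"], "text": CHUNKS[i]["text"], "score": float(scores[i])}
        for i in best
    ]


if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport used by Cursor and similar clients
```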
These tools make it straightforward for an AI workflow to retrieve precise answers from internal docs, suggest next steps in a support conversation, or generate documentation summaries on demand. Because the server communicates over MCP's standard stdio transport, it can be launched directly within Cursor or any other MCP-compatible client, eliminating the need for a separate REST API.
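As a rough illustration of that launch path, here is a client-side sketch using the stdio client from the official MCP Python SDK; the `server.py` entry point and the `search_documentation` tool name are hypothetical:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumption: "server.py" is the script that calls mcp.run() over stdio.
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])   # discover the exposed tools
            result = await session.call_tool(
                "search_documentation",            # hypothetical tool name
                {"query": "how do I configure the authentication module?"},
            )
            print(result.content)


asyncio.run(main())
```

Cursor performs the equivalent handshake itself when the server is added to its MCP configuration.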
In practice, teams can use MCPDocSearch to keep their AI assistants up‑to‑date with the latest product documentation, support knowledge bases, or developer guides. It is especially valuable in environments where docs are frequently updated and developers want to avoid manual prompt engineering—an assistant can simply query the server for “how do I configure the authentication module?” and receive a contextual excerpt from the latest docs. The combination of automated crawling, semantic embedding, and MCP tooling provides a robust, low‑maintenance solution that scales from small internal wikis to large public documentation portals.
Related Servers
n8n
Self‑hosted, code‑first workflow automation platform
FastMCP
TypeScript framework for rapid MCP server development
Activepieces
Open-source AI automation platform for building and deploying extensible workflows
MaxKB
Enterprise‑grade AI agent platform with RAG and workflow orchestration
Filestash
Web‑based file manager for any storage backend
MCP for Beginners
Learn Model Context Protocol with hands‑on examples
Explore More Servers
Imagen3-MCP
Generate photorealistic images via Google's Imagen 3.0 through MCP
Pagefind MCP Server
Fast static site search via Pagefind integration
SPINE2D Animation MCP Server
Create Spine2D animations from PSDs with natural language
MCP Server For LLM
Fast, language-agnostic Model Context Protocol server for Claude and Cursor
Gemini MCP Integration Server
AI-powered tool orchestration with Google Gemini and MCP
Mcp Trial
Prototype MCP server for testing and experimentation