About
A lightweight MCP server that extracts text and performs OCR on PDF files, supporting page ranges and negative indexing. It integrates seamlessly with Claude Code CLI for quick content retrieval.
Capabilities
PDF Extraction MCP Server (Claude Code Fork)
The PDF Extraction MCP Server bridges the gap between AI assistants and static PDF documents. By exposing a single, well-defined tool, the server lets Claude retrieve text, tables, and other content from PDFs located on the local filesystem. This addresses a common pain point for developers who need to feed document data into language-model pipelines without building custom parsing logic or hosting heavyweight OCR services.
At its core, the server accepts a file path and an optional page specification. The page argument supports comma-separated ranges, individual page numbers, and negative indexing, which counts from the end of the document (for example, to select the last page). Internally it combines PDF-parsing libraries with optional OCR for scanned images. The result is a plain-text payload that Claude can immediately consume, annotate, or transform. Because the tool is exposed through MCP, developers can invoke it with a simple prompt, such as asking Claude to extract pages 1-3 and the last page of a given file, without leaving their conversational workflow.
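The page-selection rules described above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the server's actual implementation: the function names `parse_page_spec` and `resolve_page`, the exact spec syntax (`"1-3,-1"`), and the choice of 1-based page numbers are all hypothetical.

```python
import re
from typing import List, Optional

# One token is either a single page ("4", "-1") or a range ("1-3").
# ASSUMPTION: this syntax is illustrative; the real server may differ.
_TOKEN = re.compile(r"^(-?\d+)(?:-(-?\d+))?$")


def resolve_page(number: int, page_count: int) -> int:
    """Turn a 1-based or negative page number into a 1-based page number."""
    if number < 0:
        # -1 maps to the last page, -2 to the one before it, and so on.
        number = page_count + 1 + number
    if not 1 <= number <= page_count:
        raise ValueError(f"page out of range for a {page_count}-page document")
    return number


def parse_page_spec(spec: Optional[str], page_count: int) -> List[int]:
    """Expand a comma-separated page spec into an ordered list of pages."""
    if spec is None or not spec.strip():
        # No spec means the entire document.
        return list(range(1, page_count + 1))

    pages: List[int] = []
    for token in spec.split(","):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"cannot parse page token: {token!r}")
        start = resolve_page(int(match.group(1)), page_count)
        end = start
        if match.group(2) is not None:
            end = resolve_page(int(match.group(2)), page_count)
        if start > end:
            raise ValueError(f"range runs backwards: {token!r}")
        for page in range(start, end + 1):
            if page not in pages:  # keep first occurrence, skip duplicates
                pages.append(page)
    return pages
```

For a ten-page document, `parse_page_spec("1-3,-1", 10)` expands to pages 1, 2, 3, and 10. Resolving negative numbers against the page count before validating keeps the range check in one place.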
Key capabilities include:
- Local file access: No need to upload PDFs to cloud storage; the tool reads directly from disk, preserving privacy and reducing latency.
- Flexible page selection: Supports ranges, individual pages, or the entire document, giving fine‑grained control over extraction.
- OCR fallback: Automatically switches to OCR for scanned or image-based PDFs, so text can still be retrieved from pages that have no embedded text layer.
- CLI integration: Designed to work with the Claude Code command-line interface, so developers can add and manage the server from the terminal.
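The OCR fallback in the list above can be pictured as a per-page decision. The sketch below is an assumption about how such a fallback might work, not a description of the server's code: the function `extract_page_text`, the `min_chars` threshold, and the two injected callables are hypothetical stand-ins for a real PDF parser and OCR engine.

```python
from typing import Callable, Tuple


def extract_page_text(
    page_number: int,
    read_text_layer: Callable[[int], str],
    run_ocr: Callable[[int], str],
    min_chars: int = 20,
) -> Tuple[str, str]:
    """Return (text, source) for one page, using OCR only when needed.

    ASSUMPTION: a page is treated as scanned when its embedded text layer
    has fewer than `min_chars` non-whitespace characters. The threshold
    and the heuristic itself are illustrative.
    """
    text = read_text_layer(page_number)
    visible_chars = len("".join(text.split()))
    if visible_chars >= min_chars:
        return text, "text-layer"
    # Little or no embedded text: fall back to the (slower) OCR path.
    return run_ocr(page_number), "ocr"


# Usage with stand-in extractors; a real server would call its PDF and
# OCR libraries here instead of reading from dictionaries.
embedded = {1: "Quarterly report with plenty of embedded text.", 2: "  "}
scanned = {1: "unused", 2: "Text recovered from the page image."}

first = extract_page_text(1, embedded.__getitem__, scanned.__getitem__)
second = extract_page_text(2, embedded.__getitem__, scanned.__getitem__)
```

Passing the extractors in as callables keeps the decision logic independent of any particular PDF or OCR library, which also makes it easy to test.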
Typical use cases span several domains. In research, a scientist can ask Claude to pull specific sections from technical reports for summarization or citation extraction. Legal teams can retrieve relevant clauses from contracts, while finance professionals might extract tables from quarterly earnings PDFs for automated reporting. Because the server runs locally and is invoked through MCP, it fits into existing Claude workflows, whether in a terminal session or within a larger automation pipeline that chains multiple MCP tools together.
What sets this fork apart is its focus on reliability with Claude Code. The fork makes the package runnable as a module, and its detailed installation guidance helps developers add the server to their Claude environment with little friction. This combination of PDF handling, ease of deployment, and tight integration with the MCP ecosystem makes the PDF Extraction Server useful for developers who want to bring document content into AI conversations.
Related Servers
MCP Filesystem Server
Secure local filesystem access via MCP
Google Drive MCP Server
Access and manipulate Google Drive files via MCP
Pydantic Logfire MCP Server
Retrieve and analyze application telemetry with LLMs
Swagger MCP Server
Dynamic API Tool Generator from Swagger JSON
Rust MCP Filesystem
Fast, async Rust server for efficient filesystem operations
Goodnews MCP Server
Positive news at your fingertips
Explore More Servers
BigQuery MCP Server
Empower AI agents to explore BigQuery data effortlessly
RAG-MCP Pipeline Research Server
Local RAG and MCP integration without paid APIs
Pagos Data MCP Server
Retrieve BIN data quickly and easily
Neo4j MCP Chainlit
Chatbot interface for Neo4j with Claude LLM
CloudBrain MCP Servers
AI-Driven DevOps Automation Across Kubernetes, CI/CD, IaC, and Observability
TextIn OCR MCP
OCR and document extraction to Markdown in one go