About
An MCP server that uses Readability.js to fetch and strip web pages, returning clean text and metadata. Ideal for saving readable articles into Obsidian or other note-taking applications.
Overview
The MCP Web Extractor is a lightweight server that makes Readability.js available to Model Context Protocol workflows. It exposes a single capability that turns a public web page into a clean, structured data object that AI assistants can ingest directly. Ads, navigation bars, and other non‑essential elements are stripped out, leaving a distilled article that is ready for summarization, citation, or note creation.
Developers working with AI assistants often need to pull content from the web in a format that is both human‑readable and machine‑friendly. Traditional scraping tools return raw HTML, which requires additional parsing steps. The Web Extractor abstracts this complexity: you supply a URL, and the server returns an object containing the article's title, its content as HTML and as plain text, and accompanying metadata such as the site name. This uniform structure allows downstream services, such as knowledge‑base builders or content recommendation engines, to consume the data without custom parsing logic.
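As a rough illustration of how a downstream service might consume such an object, the sketch below decodes a JSON payload into a typed record. The field names (`title`, `content`, `textContent`, `excerpt`, `siteName`) are borrowed from the result of Readability.js's `parse()` and are only assumed to match what this server returns; the sample payload is invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedArticle:
    """Assumed result shape; field names mirror Readability.js, not the server docs."""
    title: Optional[str]
    content: Optional[str]       # article body as HTML
    text_content: Optional[str]  # article body as plain text
    excerpt: Optional[str]
    site_name: Optional[str]


def parse_article(payload: str) -> ExtractedArticle:
    """Decode a JSON payload, tolerating fields the page did not provide."""
    data = json.loads(payload)
    return ExtractedArticle(
        title=data.get("title"),
        content=data.get("content"),
        text_content=data.get("textContent"),
        excerpt=data.get("excerpt"),
        site_name=data.get("siteName"),
    )


# Invented sample payload; a real response may carry more or different fields.
sample = json.dumps({
    "title": "Example Article",
    "content": "<p>Hello, world.</p>",
    "textContent": "Hello, world.",
    "siteName": "Example Site",
})
article = parse_article(sample)
print(article.title, "-", article.site_name)
```

Using `dict.get` rather than direct indexing means a page without an excerpt or site name still yields a usable record, with the missing fields set to `None`.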
Key capabilities include:
- Ad‑free extraction – The underlying Readability algorithm removes sidebars, pop‑ups, and other distractions, ensuring that the returned text focuses on the author’s intent.
- Metadata enrichment – In addition to raw content, the server supplies contextual metadata like the article title and site name, which is invaluable for indexing or linking.
- Seamless Obsidian integration – A ready‑made integration script demonstrates how to hook the server into an Obsidian plugin, enabling users to turn a URL into a polished note with a single click.
- MCP‑ready – The server follows MCP conventions and exposes the capability at a standard endpoint, so the extraction step can be chained into larger AI workflows such as summarization or question‑answering pipelines.
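For readers unfamiliar with how an MCP capability is invoked, here is a minimal sketch of the JSON-RPC 2.0 message a client sends to call a tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the tool name `extract` and the `url` argument key are hypothetical, since the listing does not state them.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects a unique id per call.
_request_ids = count(1)


def build_tool_call(tool_name: str, url: str) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        # The "url" argument key is an assumption about this server's input schema.
        "params": {"name": tool_name, "arguments": {"url": url}},
    }
    return json.dumps(request)


# "extract" is a hypothetical tool name used only for illustration.
message = build_tool_call("extract", "https://example.com/article")
print(message)
```

In practice an MCP client library would build and transport this message for you; the sketch only shows what travels over the wire.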
Typical use cases include:
- Knowledge‑base construction – Automatically pull clean articles into a note system or database for later retrieval by an AI assistant.
- Content summarization – Feed extracted text into a summarizer model to generate concise overviews that the assistant can present to users.
- Web‑to‑PDF or Markdown conversion – Use the plain text and metadata to generate formatted documents for archiving or sharing.
- Research assistance – Quickly pull research papers or news articles into an AI‑augmented workspace, removing the need to manually copy and paste.
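The note-taking and Markdown conversion use cases above can be sketched as a small formatter that turns extracted text and metadata into a Markdown note with YAML front matter, the layout Obsidian reads as note properties. The function names, front matter keys, and file-name rules here are illustrative assumptions, not part of the server or its Obsidian integration script.

```python
import json
import re


def to_markdown_note(title: str, text: str, site_name: str, url: str) -> str:
    """Render extracted plain text and metadata as a Markdown note."""
    # json.dumps produces double-quoted strings, so colons and quotes
    # in a title do not break the YAML front matter.
    front_matter = "\n".join([
        "---",
        f"title: {json.dumps(title)}",
        f"site: {json.dumps(site_name)}",
        f"source: {json.dumps(url)}",
        "---",
    ])
    # Collapse runs of three or more newlines into a single blank line.
    body = re.sub(r"\n{3,}", "\n\n", text.strip())
    return f"{front_matter}\n\n# {title}\n\n{body}\n"


def note_filename(title: str) -> str:
    """Replace characters that are unsafe in file names or note links."""
    safe = re.sub(r'[\\/:*?"<>|#^\[\]]', "-", title).strip()
    return f"{safe or 'untitled'}.md"


note = to_markdown_note(
    'A "quoted": title',
    "Line one.\n\n\n\nLine two.",
    "Example Site",
    "https://example.com/a",
)
print(note_filename("What/Why: a test?"))
print(note)
```

Writing the returned string to a file named by `note_filename` inside a vault folder would be enough for the note to show up in Obsidian.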
By providing a consistent, noise‑free content source, the MCP Web Extractor empowers developers to build richer, more reliable AI experiences that interact with the web without wrestling with HTML intricacies.
Related Servers
MarkItDown MCP Server
Convert documents to Markdown for LLMs quickly and accurately
Context7 MCP
Real‑time, version‑specific code docs for LLMs
Playwright MCP
Browser automation via structured accessibility trees
BlenderMCP
Claude AI meets Blender for instant 3D creation
Pydantic AI
Build GenAI agents with Pydantic validation and observability
Chrome DevTools MCP
AI-powered Chrome automation and debugging
Explore More Servers
James Mcp Streamable
Remote MCP server for versatile testing scenarios
Mcp Servers Client Langgraph React Agent
Multi‑server MCP client with prebuilt ReAct agent powered by LangGraph
Pinecone Developer MCP Server
AI-powered integration with Pinecone for developers
Docs‑to‑MCP Server
Turn markdown docs into an AI‑friendly MCP API
Bio MCP Servers
Unified access to biological data agents
MCP Testing Library
CLI tool for running Model Context Protocol tests