About
A Model Context Protocol server that fetches, parses, and extracts information from web pages—supporting markdown conversion, link extraction, site crawling, broken link checking, pattern matching, and sitemap generation.
Capabilities
The MCP Webscan Server is a purpose‑built Model Context Protocol service that lets AI assistants discover and analyze web content directly from the conversation. Instead of relying on an external browser or manual scraping, an assistant can call fetch-page to retrieve a page as Markdown suitable for natural language processing, then call extract-links or check-links to list or validate the links it contains. This integration removes friction for developers who need current web data on demand, such as when a user asks for the latest news on a topic or wants to audit a website’s link structure.
At its core, the server offers six tools that cover page fetching, link analysis, site crawling, and sitemap generation:
- fetch-page – Pulls a page’s HTML and renders it as Markdown, optionally narrowing the output with a CSS selector.
- extract-links – Gathers every hyperlink on a page, returning both the URL and link text.
- crawl-site – Recursively visits pages up to a configurable depth, building a map of the site’s structure.
- check-links – Probes each link on a page to identify broken or unreachable resources.
- find-patterns – Applies a JavaScript‑compatible regular expression to discover URLs that match custom patterns (e.g., all PDF downloads).
- generate-site-map – Produces a lightweight XML sitemap by crawling the site, useful for SEO audits or content discovery.
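As a rough sketch of how a client might invoke these tools, the snippet below builds MCP `tools/call` requests (JSON-RPC 2.0) for fetch-page and find-patterns. The tool names come from the list above; the argument names `url`, `selector`, and `pattern` are assumptions for illustration and should be checked against the tool schemas the server actually advertises. The `matchUrls` helper is hypothetical and only shows locally what a "PDF downloads" pattern would select.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request for the named tool with the given arguments.
function buildToolCall(
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: `url` and `selector` are illustrative argument names.
const fetchRequest = buildToolCall("fetch-page", {
  url: "https://example.com/docs",
  selector: "main",
});

// A JavaScript-compatible pattern for URLs ending in .pdf,
// optionally followed by a query string.
const pdfPattern = "\\.pdf(\\?.*)?$";

// ASSUMPTION: `pattern` is an illustrative argument name.
const patternRequest = buildToolCall("find-patterns", {
  url: "https://example.com/docs",
  pattern: pdfPattern,
});

// Hypothetical local helper: apply the same pattern to a list of URLs.
function matchUrls(urls: string[], pattern: string): string[] {
  const re = new RegExp(pattern, "i");
  return urls.filter((u) => re.test(u));
}

const sampleUrls = [
  "https://example.com/a.pdf",
  "https://example.com/b.html",
  "https://example.com/c.PDF?dl=1",
];

console.log(JSON.stringify(fetchRequest));
console.log(JSON.stringify(patternRequest));
console.log(matchUrls(sampleUrls, pdfPattern));
```

Keeping the pattern as a plain string mirrors how it would travel inside a JSON request; the client or server would compile it with `new RegExp` before use.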
These capabilities are especially valuable when a developer wants to integrate live web data into an AI workflow without writing custom parsers. For example, a product manager could ask the assistant to “crawl the company’s support site and list all broken help articles,” or a security analyst could ask it to “list every link on this page that points to an external domain.” Because each tool returns structured JSON, the assistant can immediately feed the results into downstream prompts or other MCP services, such as summarization or sentiment analysis, forming a pipeline from raw web content to actionable insights.
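To make the "feed the results downstream" step concrete, here is a minimal sketch that takes a tool result, keeps only the broken links, and formats them for a follow-up prompt. MCP tool results carry a `content` array of typed items; the JSON payload inside the text item (`url` and `status` fields) is an assumed shape for check-links output, not one confirmed by the server's documentation.

```typescript
// Text item as found in an MCP tool result's `content` array.
interface TextContent {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextContent[];
  isError?: boolean;
}

// ASSUMPTION: hypothetical per-link record returned by check-links.
interface LinkStatus {
  url: string;
  status: number | null; // null means the request never completed
}

// Parse the text payload and keep links that failed or returned 4xx/5xx.
function brokenLinks(result: ToolResult): LinkStatus[] {
  if (result.isError) {
    throw new Error("tool call reported an error");
  }
  const text = result.content.map((item) => item.text).join("");
  const links = JSON.parse(text) as LinkStatus[];
  return links.filter((l) => l.status === null || l.status >= 400);
}

// Render the broken links as plain text for a downstream prompt.
function toPrompt(links: LinkStatus[]): string {
  if (links.length === 0) {
    return "No broken links found.";
  }
  const lines = links.map((l) => `- ${l.url} (${l.status ?? "unreachable"})`);
  return `Broken links:\n${lines.join("\n")}`;
}

// Sample result built by hand for illustration.
const sampleResult: ToolResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify([
        { url: "https://example.com/", status: 200 },
        { url: "https://example.com/old-article", status: 404 },
        { url: "https://unreachable.invalid/", status: null },
      ]),
    },
  ],
};

console.log(toPrompt(brokenLinks(sampleResult)));
```

The summary string could then be passed to a summarization step or returned to the user as-is.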
The server’s design emphasizes simplicity and portability. It runs over stdio, making it compatible with any MCP‑compliant client (Claude Desktop, Glama, or custom agents). Developers can host the service locally or deploy it in a containerized environment; the only requirement is Node.js 18+. Its modular architecture, with separate service classes for each tool, allows contributors to extend or replace individual components without affecting the overall contract.
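Running over stdio means client and server exchange JSON-RPC messages as newline-delimited JSON on the process's standard input and output. The sketch below shows one way a custom agent might frame and decode those messages; `encodeMessage` and `LineDecoder` are hypothetical helper names, and the code is not taken from this server's source.

```typescript
// Serialize one JSON-RPC message as a single line.
// JSON.stringify without an indent argument never emits raw newlines,
// so each message occupies exactly one line.
function encodeMessage(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Accumulates stdout chunks and yields complete messages.
// A chunk may contain a partial line, one line, or several lines.
class LineDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if chunk ended in \n).
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  }
}

const wire = encodeMessage({ jsonrpc: "2.0", id: 1, method: "tools/list" });

// Simulate the message arriving split across two chunks.
const decoder = new LineDecoder();
const firstHalf = decoder.push(wire.slice(0, 10));
const secondHalf = decoder.push(wire.slice(10));

console.log(firstHalf.length, secondHalf.length);
console.log(JSON.stringify(secondHalf[0]));
```

Buffering until a newline arrives matters because the operating system may deliver a single message in several reads, or several messages in one.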
In short, MCP Webscan provides a ready‑made bridge between AI assistants and the ever‑changing web. By exposing page fetching, link extraction, site crawling, link validation, pattern matching, and sitemap generation as first‑class tools, it enables developers to build richer, data‑driven conversational experiences with minimal effort.