MCP Webscan Server

MCP Server

Scan and analyze web content effortlessly

Updated Sep 3, 2025

About

A Model Context Protocol server that fetches, parses, and extracts information from web pages. It supports Markdown conversion, link extraction, site crawling, broken link checking, pattern matching, and sitemap generation.

The MCP Webscan Server is a Model Context Protocol service that lets AI assistants discover and analyze web content from within a conversation. Instead of relying on an external browser or manual scraping, an assistant can call a tool to fetch a page as Markdown suitable for language processing, then call further tools to extract or validate the links that page contains. This removes friction for developers who need current web data, such as when a user asks for the latest news on a topic or wants to audit a website's link structure.
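
As a rough sketch of what such a tool invocation might look like on the wire: MCP clients call tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport carries each message as one line of JSON. The tool name `fetch-page` comes from the description on this page; the argument names `url` and `selector` are assumptions made for illustration and may not match the server's actual input schema.

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a request for the fetch-page tool.
// ASSUMPTION: the argument names `url` and `selector` are illustrative only.
function buildFetchPageRequest(
  id: number,
  url: string,
  selector?: string,
): ToolCallRequest {
  const args: Record<string, unknown> = { url };
  if (selector !== undefined) {
    args.selector = selector; // optional CSS selector to narrow the output
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch-page", arguments: args },
  };
}

// Over stdio, each message is written as a single newline-terminated line.
function toStdioLine(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const fetchRequest = buildFetchPageRequest(1, "https://example.com", "main");
const fetchLine = toStdioLine(fetchRequest);
```

In practice an MCP client library would normally build and send this message; the sketch only makes the shape of the request visible.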

At its core, the server offers six tools that cover the main steps of a web-scanning workflow:

  • fetch-page – Pulls a page’s HTML and renders it as Markdown, optionally narrowing the output with a CSS selector.
  • extract-links – Gathers every hyperlink on a page, returning both the URL and link text.
  • crawl-site – Recursively visits pages up to a configurable depth, building a map of the site’s structure.
  • check-links – Probes each link on a page to identify broken or unreachable resources.
  • find-patterns – Applies a JavaScript‑compatible regular expression to discover URLs that match custom patterns (e.g., all PDF downloads).
  • generate-site-map – Produces a lightweight XML sitemap by crawling the site, useful for SEO audits or content discovery.
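
The depth-limited crawl behind crawl-site and generate-site-map can be illustrated with a small breadth-first traversal. This is a minimal sketch, not the server's implementation: it walks a hypothetical in-memory link graph (`siteGraph`) instead of fetching real pages, and the function name `crawl` is invented for the example. The final line applies a JavaScript regular expression of the kind find-patterns accepts to pick out PDF links.

```typescript
// Hypothetical link graph standing in for real HTTP fetches:
// each key is a page, each value lists the pages it links to.
type LinkGraph = Record<string, string[]>;

// Breadth-first crawl that records the depth at which each page is first
// seen and does not follow links from pages already at maxDepth.
function crawl(graph: LinkGraph, start: string, maxDepth: number): Map<string, number> {
  const depths = new Map<string, number>([[start, 0]]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const page = queue.shift() as string;
    const depth = depths.get(page) as number;
    if (depth >= maxDepth) continue; // depth limit reached for this branch
    for (const link of graph[page] ?? []) {
      if (!depths.has(link)) {
        // First visit: record depth and schedule the page for expansion.
        depths.set(link, depth + 1);
        queue.push(link);
      }
    }
  }
  return depths;
}

const siteGraph: LinkGraph = {
  "/": ["/docs", "/about"],
  "/docs": ["/docs/guide.pdf", "/"],
  "/about": [],
  "/docs/guide.pdf": [],
};

const shallow = crawl(siteGraph, "/", 1); // "/", "/docs", "/about"
const deeper = crawl(siteGraph, "/", 2);  // also reaches "/docs/guide.pdf"

// Pattern matching over discovered URLs, e.g. all PDF downloads.
const pdfLinks = Array.from(deeper.keys()).filter((u) => /\.pdf$/i.test(u));
```

Tracking visited pages in the `depths` map keeps the crawl from looping when pages link back to each other, as "/docs" does to "/" here.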

These capabilities are especially valuable when a developer wants to integrate live web data into an AI workflow without writing custom parsers. For example, a product manager could ask the assistant to “crawl the company’s support site and list all broken help articles,” or a security analyst could ask it to “scan this website for links to external dependencies.” Because each tool returns structured JSON, the assistant can feed the results directly into downstream prompts or other MCP services, such as summarization or sentiment analysis, forming a pipeline from raw web content to actionable findings.
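
To make the "broken help articles" scenario concrete, here is a minimal sketch of post-processing a check-links result. The result shape (`LinkCheck`, with a `url` and a nullable HTTP `status`) is a hypothetical assumption for illustration; the page does not document the server's actual output format.

```typescript
// ASSUMPTION: hypothetical result shape, not the server's documented output.
interface LinkCheck {
  url: string;
  status: number | null; // null means no response was received
}

// Treat missing responses and 4xx/5xx statuses as broken.
function isBroken(check: LinkCheck): boolean {
  return check.status === null || check.status >= 400;
}

// Turn a JSON result into a short plain-text report of broken links.
function summarizeBroken(json: string): string {
  const checks = JSON.parse(json) as LinkCheck[];
  const broken = checks.filter(isBroken);
  if (broken.length === 0) {
    return "No broken links found.";
  }
  return broken
    .map((c) => `- ${c.url} (${c.status ?? "unreachable"})`)
    .join("\n");
}

// Sample data invented for the example.
const sampleResult = JSON.stringify([
  { url: "https://example.com/help", status: 200 },
  { url: "https://example.com/old", status: 404 },
  { url: "https://example.com/down", status: null },
]);

const brokenReport = summarizeBroken(sampleResult);
```

A report like this could then be passed to a downstream prompt, for instance to draft a fix list.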

The server's design emphasizes simplicity and portability. It communicates over stdio, which makes it usable from MCP clients that support the stdio transport (Claude Desktop, Glama, or custom agents). Developers can run the service locally or deploy it in a container; the only stated requirement is Node.js 18 or later. Its modular architecture, with a separate service class for each tool, lets contributors extend or replace individual components without affecting the overall contract.
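
As an illustration of the stdio setup, the sketch below builds a client configuration in the `mcpServers` format that Claude Desktop reads, and checks a Node.js version string against the 18+ requirement. The server key `webscan`, the entry-point path, and both function names are hypothetical placeholders; substitute the actual location of the built server.

```typescript
// Shape of a stdio server entry in a Claude Desktop style configuration.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// ASSUMPTION: "webscan" and the entry-point path are placeholders.
function buildClientConfig(entryPoint: string): ClientConfig {
  return {
    mcpServers: {
      webscan: { command: "node", args: [entryPoint] },
    },
  };
}

// Check a version string such as "v18.17.0" against a minimum major version.
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  const match = /^v?(\d+)(\.|$)/.exec(version);
  return match !== null && Number(match[1]) >= minMajor;
}

const clientConfig = buildClientConfig("/path/to/mcp-webscan/build/index.js");
const configText = JSON.stringify(clientConfig, null, 2);
```

The client launches the configured command as a child process and exchanges MCP messages over its standard input and output, so no network port needs to be opened.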

In short, MCP Webscan provides a ready-made bridge between AI assistants and live web content. By exposing page fetching, link extraction, site crawling, link validation, pattern matching, and sitemap generation as MCP tools, it lets developers build data-driven conversational experiences without writing their own scraping code.