SarthakMishra

Site Cloner MCP Server

MCP Server

Clone entire websites with LLM-powered tools

Updated Jun 3, 2025

About

A Docker‑based MCP server that lets LLMs fetch, analyze, and download website assets—HTML, CSS, images, fonts—and generate sitemaps for full site cloning.

The Site Cloner MCP server lets large language models act as web scrapers and site duplicators. In many AI-driven development workflows, a user may ask an assistant to “clone this website” or “create a local copy of example.com.” An LLM has no direct access to the web, so it needs an external tool that can fetch pages, resolve relative paths, and persist assets. Site Cloner fills this gap by exposing a set of high-level tools that cover each step of the cloning process, from initial HTML retrieval to final asset download and sitemap generation.

At its core, the server offers six complementary tools. fetch_page pulls raw HTML from any reachable URL. extract_assets parses that HTML and collects links to CSS, JavaScript, images, fonts, and other resources. download_asset saves each referenced file into a structured local directory, preserving the original relative paths. parse_css_for_assets goes one level deeper by inspecting CSS files for further references, so that fonts and background images are not missed. create_site_map crawls a site to an adjustable depth and yields a navigable map of pages that can be used for further analysis or incremental cloning. Finally, analyze_page_structure provides a quick structural overview of any fetched page, which is useful for UI testing or content extraction.
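To make the extraction steps concrete, here is a minimal sketch of what tools like extract_assets and parse_css_for_assets might do internally. It is not the server's actual implementation: the function names extract_asset_urls and extract_css_urls, the ASSET_ATTRS set, and the regular expression are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import re
from html.parser import HTMLParser

# Tag/attribute pairs that commonly reference external assets.
# This set is an assumption for illustration and is not exhaustive.
ASSET_ATTRS = {
    ("link", "href"),
    ("script", "src"),
    ("img", "src"),
    ("source", "src"),
}

# Matches url(...) references in CSS, with optional single or double quotes.
CSS_URL_RE = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)")


class AssetCollector(HTMLParser):
    """Collects asset references from tags such as <link>, <script>, <img>."""

    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if value and (tag, name) in ASSET_ATTRS:
                self.assets.append(value)


def extract_asset_urls(html):
    """Return asset references found in an HTML string, in document order."""
    collector = AssetCollector()
    collector.feed(html)
    return collector.assets


def extract_css_urls(css):
    """Return url(...) references found in a CSS string, skipping data: URIs."""
    return [u for u in CSS_URL_RE.findall(css) if not u.startswith("data:")]
```

Ordinary page links (`<a href>`) are deliberately left out here, since following them is the job of a crawler such as create_site_map rather than of asset extraction.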

For developers building AI-powered tools, this server delivers several practical advantages. First, it removes the need to write custom web-scraping code for each new project; the LLM can simply invoke the predefined tools, keeping the developer’s focus on higher-level logic. Second, because the server runs in Docker and exposes a simple command interface, it can be launched on any machine that supports containers, which keeps behavior consistent across environments. Third, the asset-resolution logic handles relative URLs and CSS-embedded resources automatically, two details that are a common source of bugs in hand-written scrapers.
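The relative-URL handling mentioned above can be sketched with the standard library. This is an illustrative sketch only: resolve_asset, local_path_for, and the "clone" root directory are hypothetical names, and the server's real directory layout may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlsplit


def resolve_asset(base_url, ref):
    """Resolve an asset reference against the URL of the file that contains it.

    For references found inside a stylesheet, base_url should be the
    stylesheet's URL, not the page's URL.
    """
    return urljoin(base_url, ref)


def local_path_for(url, root="clone"):
    """Map an absolute URL to a path under a local root (hypothetical layout)."""
    parts = urlsplit(url)
    path = parts.path
    # Directory-style URLs get an explicit file name so they can be saved.
    if not path or path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(root) / parts.netloc / path.lstrip("/"))
```

For example, `resolve_asset("https://example.com/blog/post.html", "../css/site.css")` yields `https://example.com/css/site.css`, which `local_path_for` maps to `clone/example.com/css/site.css`. Resolving CSS references against the page URL instead of the stylesheet URL is the kind of mistake that this step is meant to avoid.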

Typical use cases include automated documentation generation for static sites, migration of legacy web pages to new hosting platforms, and creating offline copies for compliance audits. In a Cursor workflow, a user can configure the MCP server once and then ask Claude to clone a site; the assistant orchestrates the sequence of tool calls and returns a fully structured local copy ready for inspection or deployment. The server’s modular design also allows developers to extend it with custom tools, such as image optimization or HTML minification, without touching the core logic.
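One plausible ordering of those tool calls for a single page is sketched below. The tool names come from the description above, but everything else is an assumption: the clone_page function is hypothetical, and the call signatures and return shapes (HTML string, list of references, file body) are guesses made for illustration, not the server's documented interface.

```python
from urllib.parse import urljoin


def clone_page(url, tools):
    """Run one plausible tool sequence for a single page.

    `tools` maps tool names to callables. Their signatures and return
    values here are assumptions, not the server's real interface.
    """
    html = tools["fetch_page"](url)
    # References in the page resolve against the page URL.
    pending = [urljoin(url, ref) for ref in tools["extract_assets"](html)]
    downloaded = []
    seen = set()
    while pending:
        asset_url = pending.pop(0)
        if asset_url in seen:
            continue  # avoid downloading the same asset twice
        seen.add(asset_url)
        body = tools["download_asset"](asset_url)
        downloaded.append(asset_url)
        if asset_url.endswith(".css"):
            # References in a stylesheet resolve against the stylesheet URL.
            for ref in tools["parse_css_for_assets"](body):
                pending.append(urljoin(asset_url, ref))
    return downloaded
```

In practice the assistant, not a fixed script, decides which tool to call next; the loop only shows why stylesheets need a second pass and why already-seen assets should be skipped.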

In summary, Site Cloner is a turnkey MCP solution that transforms an LLM into a web‑cloning agent. By handling the intricacies of page fetching, asset resolution, and site mapping, it lets developers leverage AI assistants for end‑to‑end website duplication tasks with minimal overhead.