MCPSERV.CLUB
DTDucas

CHM to Markdown Converter

MCP Server

Convert Revit API CHM docs to clean, AI‑ready Markdown


About

A Python tool that extracts HTML from Compiled HTML Help files and converts them into organized, versioned Markdown documentation with syntax‑highlighted code snippets, updated links, and AI‑friendly indexing.

Capabilities

  • Resources: Access data sources
  • Tools: Execute functions
  • Prompts: Pre-built templates
  • Sampling: AI model interactions

CHM to Markdown Converter – MCP Server Overview

The CHM to Markdown converter is an MCP server that bridges the gap between legacy Compiled HTML Help (CHM) documentation and modern, AI‑friendly Markdown. Many engineering and architectural teams still ship product manuals and API references as CHM files, which are difficult to version‑control, search, or ingest into language models. By converting these archives into clean Markdown with a structured folder layout, the server enables developers to treat documentation as first‑class data—ready for indexing, semantic search, or direct consumption by AI assistants.

At its core, the server extracts HTML files from CHM archives using 7‑Zip and then applies a series of transformation rules to produce well‑formatted Markdown. Special attention is given to code snippets, tables, and internal hyperlinks: language tags are preserved for syntax highlighting, table markup is corrected to Markdown standards, and relative links are rewritten so that cross‑references remain intact. The result is a repository of Markdown files that mirror the original CHM structure but are now lightweight, diff‑friendly, and easily parsed by tooling such as static site generators or AI prompt builders.
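The extract-then-rewrite flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names (build_extract_command, extract_chm, rewrite_links) are hypothetical, the 7‑Zip executable is assumed to be on PATH as 7z, and the link rule simply maps relative .htm/.html targets to .md while leaving absolute URLs alone.

```python
import re
import subprocess

def build_extract_command(chm_path, out_dir, seven_zip="7z"):
    """Build a 7-Zip command line (hypothetical helper).

    `x` extracts with full paths, `-o<dir>` sets the output directory
    (no space after -o), and `-y` answers yes to any prompts.
    """
    return [seven_zip, "x", str(chm_path), f"-o{out_dir}", "-y"]

def extract_chm(chm_path, out_dir):
    """Run 7-Zip to unpack the CHM; assumes `7z` is installed and on PATH."""
    subprocess.run(build_extract_command(chm_path, out_dir), check=True)

# Matches the target of a Markdown link that ends in .htm or .html,
# with an optional #fragment, e.g. "](html/Wall.htm#members)".
_LINK = re.compile(r"(\]\()([^)#\s]+?)\.html?((?:#[^)\s]*)?\))", re.IGNORECASE)

def rewrite_links(markdown):
    """Point relative .htm/.html links at the converted .md files."""
    def repl(match):
        target = match.group(2)
        if "://" in target:          # absolute URL: leave untouched
            return match.group(0)
        return f"{match.group(1)}{target}.md{match.group(3)}"
    return _LINK.sub(repl, markdown)

if __name__ == "__main__":
    print(rewrite_links("See [Wall](html/Wall.htm#members) for details."))
```

Keeping the command construction separate from the subprocess call makes the extraction step easy to inspect or test on a machine where 7‑Zip is not installed.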

Key capabilities include:

  • Version‑aware processing: Supports multiple Revit API releases (2022–2026), automatically generating a separate sub‑folder for each version.
  • AI integration artifacts: Produces JSON index files and a navigational Markdown file that expose metadata, keyword lookups, and a hierarchical outline, suitable for feeding into AI search indices or knowledge‑graph builders.
  • Asynchronous, batch execution: Uses Python’s async I/O to process hundreds of files concurrently, with configurable worker counts and batch sizes to tune throughput on multi‑core machines.
  • Extensible cleaning pipeline: Developers can fine‑tune which HTML tags, classes, or IDs to strip, allowing the same tool to adapt to other CHM‑based products beyond Revit.
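As a rough illustration of the version‑aware layout and the batched, worker‑limited execution listed above, the sketch below uses only the standard library. The names output_path, chunked, and convert_all, the max_workers and batch_size parameters, and their default values are all assumptions made for this example; the tool's real options may differ.

```python
import asyncio
from pathlib import Path

def output_path(root, version, relative_html):
    """Place each converted page under a per-version sub-folder (assumed layout)."""
    return Path(root) / version / Path(relative_html).with_suffix(".md")

def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

async def convert_all(paths, convert, *, max_workers=8, batch_size=50):
    """Run `convert` over `paths` in batches, at most `max_workers` at a time.

    `convert` is any blocking callable (hypothetical here); it is pushed to a
    worker thread so the event loop stays responsive.
    """
    semaphore = asyncio.Semaphore(max_workers)

    async def guarded(path):
        async with semaphore:
            return await asyncio.to_thread(convert, path)

    results = []
    for batch in chunked(list(paths), batch_size):
        # gather() returns results in the same order as its inputs
        results.extend(await asyncio.gather(*(guarded(p) for p in batch)))
    return results

if __name__ == "__main__":
    pages = ["html/Wall.htm", "html/Floor.htm", "html/Door.htm"]
    targets = asyncio.run(
        convert_all(pages, lambda p: output_path("docs", "2024", p),
                    max_workers=2, batch_size=2)
    )
    for target in targets:
        print(target)
```

Threads help most when conversion is dominated by file I/O; for CPU‑heavy parsing, a process pool would be the more likely choice, at the cost of extra start‑up overhead.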

Typical use cases range from internal documentation pipelines, where a team regularly publishes new API versions, to external content migration projects, such as converting legacy help files into GitHub‑hosted docs or integrating them with an AI assistant that answers developer questions. Because these capabilities are exposed through MCP, the server can be invoked directly from Claude or another LLM environment, allowing an assistant to fetch, transform, and serve the latest Markdown documentation on demand.

In summary, the CHM to Markdown MCP server turns brittle, binary help files into structured, searchable text that aligns with modern AI workflows. Its emphasis on version control friendliness, automated indexing, and asynchronous performance makes it a standout tool for teams looking to modernize legacy documentation while keeping AI integration seamless.