MCPSERV.CLUB
Wooonster

HOCR MCP Agent

MCP Server

Handwritten OCR processing with a Vue front‑end and fast Python backend

Updated Aug 17, 2025

About

The HOCR MCP Agent provides a lightweight Model Context Protocol server for processing handwritten OCR data. It pairs a Vue.js client with a FastAPI backend served by Uvicorn, enabling quick deployment of OCR workflows.

Capabilities

  • Resources: Access data sources
  • Tools: Execute functions
  • Prompts: Pre-built templates
  • Sampling: AI model interactions

Overview

The HOCR MCP Agent is a specialized Model Context Protocol (MCP) server that connects AI assistants to HOCR data, a structured representation of OCR‑processed documents. By exposing a set of well‑defined resources and tools, the server lets Claude or other AI clients retrieve, parse, and manipulate HOCR data without handling low‑level parsing logic. For developers building AI‑powered document analysis pipelines, this abstraction removes the need to write custom parsers and provides a consistent API surface that follows MCP conventions.

Solving the OCR Data Integration Gap

OCR engines produce raw text and positional metadata, but turning that output into actionable information usually takes extra work. HOCR encapsulates both the textual content and its spatial layout, yet parsing it is non‑trivial. The HOCR MCP Agent addresses this with a ready‑made interface that ingests HOCR files and returns structured objects representing words, lines, paragraphs, and bounding boxes. Developers can feed these objects directly into downstream AI models or analytics tools, shortening the path from OCR output to insight.
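To make the parsing step concrete, here is a minimal sketch of turning HOCR markup into word objects with bounding boxes. It assumes the input follows the common hOCR convention of `ocrx_word` spans whose `title` attribute carries `bbox x0 y0 x1 y1`. The function name `parse_hocr_words` and the output shape are hypothetical illustrations, not the server's actual schema.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collects text and bounding box for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = [int(g) for g in match.groups()]
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: assumes word spans do not contain nested spans.
        if tag == "span" and self._bbox is not None:
            self.words.append(
                {"text": "".join(self._text).strip(), "bbox": self._bbox}
            )
            self._bbox = None


def parse_hocr_words(hocr: str):
    """Hypothetical helper: return a list of {"text", "bbox"} dicts."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


SAMPLE = """
<div class='ocr_page' title='bbox 0 0 800 600'>
  <span class='ocr_line' title='bbox 36 92 210 116'>
    <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
    <span class='ocrx_word' title='bbox 110 92 210 116; x_wconf 91'>world</span>
  </span>
</div>
"""

words = parse_hocr_words(SAMPLE)
```

A production parser would also track pages, paragraphs, and lines to rebuild the full hierarchy; this sketch keeps only the word level to stay short.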

Core Features and Capabilities

  • HOCR Parsing Service: Accepts HOCR files or URLs, parses the XML/HTML structure, and returns a JSON representation of the document hierarchy.
  • Spatial Metadata Exposure: Provides bounding box coordinates for every text element, enabling precise layout analysis or visual overlay generation.
  • Search and Retrieval Tools: Exposes query endpoints that let AI assistants locate specific words or phrases within the document, returning context and positional data.
  • Batch Processing Support: Handles multiple HOCR documents in a single request, facilitating large‑scale document ingestion workflows.
  • MCP‑Compliant Endpoints: All resources follow MCP naming conventions, making the server discoverable by AI assistants that automatically enumerate available tools.
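The search and retrieval feature listed above could look roughly like the following sketch. It assumes words have already been parsed into dicts with `text` and `bbox` keys. The function `find_phrase`, its `context` parameter, and the result fields are hypothetical and chosen only for illustration.

```python
def find_phrase(words, phrase, context=2):
    """Return each occurrence of `phrase` with a merged bbox and nearby words."""
    target = [t.lower() for t in phrase.split()]
    tokens = [w["text"].lower() for w in words]
    hits = []
    if not target:
        return hits
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != target:
            continue
        span = words[i:i + n]
        # Union of the matched words' boxes: [x0, y0, x1, y1].
        bbox = [
            min(w["bbox"][0] for w in span),
            min(w["bbox"][1] for w in span),
            max(w["bbox"][2] for w in span),
            max(w["bbox"][3] for w in span),
        ]
        lo = max(0, i - context)
        hi = min(len(words), i + n + context)
        hits.append({
            "match": " ".join(w["text"] for w in span),
            "bbox": bbox,
            "context": " ".join(w["text"] for w in words[lo:hi]),
        })
    return hits


# Made-up sample data for illustration only.
sample_words = [
    {"text": "Total", "bbox": [10, 10, 60, 30]},
    {"text": "amount", "bbox": [70, 12, 140, 30]},
    {"text": "due", "bbox": [150, 10, 190, 32]},
    {"text": "today", "bbox": [200, 10, 260, 30]},
]

hits = find_phrase(sample_words, "amount due", context=1)
```

Matching here is exact and case-insensitive on whole words; fuzzy matching for noisy handwriting output would need a different comparison step.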

Real‑World Use Cases

  • Legal Document Review: Lawyers can prompt an AI assistant to extract clauses from scanned contracts, using the server’s spatial data to preserve clause boundaries.
  • Academic Research: Researchers can feed scanned theses into the agent, then query for specific terminology or citation patterns across pages.
  • Invoice Automation: Accounting systems can parse invoices, extract line items and totals, and feed the structured data into finance software.
  • Accessibility Enhancements: Screen readers can request text and layout information to generate accurate audio descriptions of complex documents.

Integration with AI Workflows

The server’s MCP endpoints fit into existing AI assistant workflows. An AI client can call the “HOCR parse” tool, receive a structured payload, and then apply natural language understanding or summarization models to the extracted text. Because the server follows MCP’s discovery mechanism, developers can list the available tools programmatically and incorporate them into dynamic conversation flows without hardcoding URLs.
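As a rough illustration of this flow, the sketch below builds the two JSON-RPC 2.0 messages an MCP client would send: `tools/list` to discover tools and `tools/call` to invoke one. The tool name `hocr_parse`, its `url` argument, and the example URL are assumptions; the actual names should be taken from the server's `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing request ids


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict as used by MCP."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Step 1: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Step 2: call a tool by name. "hocr_parse" and "url" are hypothetical.
call_parse = jsonrpc_request(
    "tools/call",
    {
        "name": "hocr_parse",
        "arguments": {"url": "https://example.com/scan.hocr"},
    },
)

print(json.dumps(list_tools))
print(json.dumps(call_parse))
```

In practice an MCP client library handles transport and the initialization handshake; the messages are shown here only to make the discovery-then-call sequence visible.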

Unique Advantages

Unlike generic OCR APIs, the HOCR MCP Agent preserves layout information, enabling use cases that depend on spatial context, such as table extraction or visual layout analysis. Its MCP compliance makes it discoverable and composable with other MCP servers, so developers can chain multiple specialized services (e.g., image enhancement + HOCR parsing) in a single conversational loop. This combination of structured output, spatial fidelity, and protocol adherence makes the HOCR MCP Agent well suited to developers who need document intelligence within AI‑driven applications.
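To show why preserved layout matters for something like table extraction, here is a small sketch that groups words into rows by vertical position and orders each row left to right. It is one simple heuristic, not the server's method. The function `group_rows`, the `tol` parameter, and the sample data are hypothetical.

```python
def group_rows(words, tol=8):
    """Group word dicts into rows whose vertical centers lie within `tol` pixels."""

    def center_y(word):
        return (word["bbox"][1] + word["bbox"][3]) / 2

    rows = []
    for word in sorted(words, key=center_y):
        cy = center_y(word)
        # Compare against the center of the first word placed in the current row.
        if rows and abs(cy - rows[-1]["cy"]) <= tol:
            rows[-1]["words"].append(word)
        else:
            rows.append({"cy": cy, "words": [word]})
    # Within each row, order cells by their left edge.
    return [
        [w["text"] for w in sorted(row["words"], key=lambda w: w["bbox"][0])]
        for row in rows
    ]


# Made-up two-row, two-column layout for illustration.
table_words = [
    {"text": "Price", "bbox": [200, 12, 260, 30]},
    {"text": "Pen", "bbox": [10, 50, 50, 70]},
    {"text": "Item", "bbox": [10, 10, 60, 30]},
    {"text": "1.50", "bbox": [200, 48, 250, 70]},
]

rows = group_rows(table_words)
```

Plain-text OCR output would lose the coordinates this grouping depends on, which is the practical benefit of keeping bounding boxes in the parsed result.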