
MCP Server: Scalable OpenAPI Endpoint Discovery and API Request Tool


Instant semantic search for private OpenAPI endpoints

Updated Apr 10, 2025

About

A FastAPI-based MCP server that loads remote OpenAPI specs, chunks them per endpoint, and uses in-memory FAISS with MiniLM-L3 embeddings to quickly locate and return full endpoint docs for Claude Desktop clients. It also includes a tool to execute RESTful requests.
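
To make that pipeline concrete, here is a minimal sketch of the load → chunk → index flow. It assumes the sentence-transformers and faiss-cpu packages; the model name paraphrase-MiniLM-L3-v2 stands in for the project's MiniLM-L3 variant, and all function names are illustrative rather than the project's actual API:

```python
import json
import urllib.request

import faiss                      # in-memory vector index (faiss-cpu)
import numpy as np
from sentence_transformers import SentenceTransformer

# Compact MiniLM-L3 embedding model; a stand-in for the variant the server uses.
model = SentenceTransformer("paraphrase-MiniLM-L3-v2")

def load_spec(url: str) -> dict:
    """Fetch a remote OpenAPI JSON spec straight into memory."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def chunk_endpoints(spec: dict) -> list[dict]:
    """One chunk per (path, method) pair, so each endpoint keeps its full doc."""
    chunks = []
    for path, ops in spec.get("paths", {}).items():
        for method, op in ops.items():
            if method not in {"get", "post", "put", "patch", "delete"}:
                continue  # skip path-level keys such as "parameters"
            text = f"{method.upper()} {path} {op.get('summary', '')} {op.get('description', '')}"
            chunks.append({"path": path, "method": method, "doc": op, "text": text})
    return chunks

def build_index(chunks: list[dict]) -> faiss.IndexFlatIP:
    """Embed each endpoint's text and add it to a flat inner-product index."""
    vecs = model.encode([c["text"] for c in chunks], normalize_embeddings=True)
    index = faiss.IndexFlatIP(vecs.shape[1])  # normalized vectors -> cosine
    index.add(np.asarray(vecs, dtype="float32"))
    return index
```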

Capabilities

  • Resources – Access data sources
  • Tools – Execute functions
  • Prompts – Pre-built templates
  • Sampling – AI model interactions

Overview

The Baryhuang MCP Server Any OpenAPI is a lightweight, FastAPI‑based solution that turns any publicly hosted or privately stored OpenAPI specification into an instantly searchable, semantic API gateway for Claude and other MCP clients. It addresses a common pain point: large OpenAPI documents (hundreds of kilobytes or more) overwhelm Claude's built‑in MCP parsing, causing errors or incomplete endpoint discovery. By loading the spec into memory and chunking it by individual endpoint, the server preserves full per‑endpoint context while keeping lookups in the millisecond range.

What It Solves

When an assistant needs to call a REST API, it must first determine which endpoint to target and what parameters are required. Traditional MCP clients struggle with large specs, leading to failures or manual pre‑processing (e.g., converting JSON to YAML). This server sidesteps those limitations by maintaining an in‑memory vector index of all endpoints: Claude can ask for “list products” or any other natural‑language request and instantly receive the exact OpenAPI path, method, and schema. The assistant can then construct a fully validated request without additional user guidance.
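
Continuing the sketch above (reusing its model, chunker, and FAISS index), endpoint discovery reduces to a single nearest-neighbor query; the spec URL and query below are illustrative:

```python
def search(query: str, index: faiss.IndexFlatIP, chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the k endpoint chunks whose descriptions best match the query."""
    q = model.encode([query], normalize_embeddings=True)
    _scores, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in ids[0]]

# The top hit carries the endpoint's full OpenAPI doc: path, method, schema.
spec = load_spec("https://api.example.com/openapi.json")  # URL is illustrative
chunks = chunk_endpoints(spec)
index = build_index(chunks)
for hit in search("list products", index, chunks):
    print(hit["method"].upper(), hit["path"])
```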

Key Features

  • Remote Spec Loading – Pulls OpenAPI JSON from a URL, eliminating local file management and ensuring the server always reflects the latest API version.
  • Semantic Search – Uses a compact MiniLM‑L3 model (≈43 MB) to embed endpoint descriptions, enabling quick relevance ranking with FAISS.
  • Endpoint‑Level Chunking – Each endpoint is treated as a separate chunk, so the assistant receives precise parameter information and avoids context loss.
  • FastAPI & Async – The server is built on an async framework, delivering sub‑second responses even under load.
  • Multi‑Instance Configuration – A single configuration can spawn multiple MCP servers, each pointing to a different OpenAPI spec and API prefix, so several APIs can be exposed side by side (see the sketch after this list).
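
One way to picture that multi-instance setup is one server process per spec, each with its own prefix. The module name and environment-variable names below are assumptions for illustration, not the project's documented configuration keys:

```python
import os
import subprocess

# Hypothetical mapping of tool-name prefixes to OpenAPI spec URLs.
APIS = {
    "crm":     "https://crm.example.com/openapi.json",
    "billing": "https://billing.example.com/openapi.json",
}

for prefix, spec_url in APIS.items():
    # One isolated MCP server per API: same code, different spec and prefix.
    subprocess.Popen(
        ["python", "-m", "mcp_server_any_openapi"],  # module name assumed
        env={
            **os.environ,
            "OPENAPI_JSON_DOCS_URL": spec_url,       # env-var names assumed
            "MCP_API_PREFIX": prefix,
        },
    )
```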

Real‑World Use Cases

  • Private API Exposure – Companies can expose internal services to Claude without publishing the raw OpenAPI file or building custom tools.
  • Rapid Prototyping – Developers can spin up a local MCP server for any third‑party API, instantly enabling conversational exploration and testing.
  • Multi‑Tenant Environments – A single deployment can host dozens of APIs, each isolated by a unique prefix and base URL, simplifying governance.

Integration with AI Workflows

Claude or any MCP‑compatible client simply sends a natural‑language query to the server. The server returns the exact OpenAPI endpoint and its parameter schema, which the client uses to build a compliant request. The response is then streamed back through MCP, allowing the assistant to present results or prompt for additional input. Because the server handles all parsing and request execution, developers can focus on higher‑level business logic rather than API plumbing.
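
Sketching that round trip with the helpers defined earlier (the base URL and parameters are placeholders): the client first resolves the endpoint, then the server's request tool executes the call.

```python
import requests  # stands in here for the server's request-execution tool

# Step 1: semantic lookup resolves "list products" to a concrete endpoint.
hit = search("list products", index, chunks)[0]

# Step 2: the assistant fills in parameters from hit["doc"], and the server
# executes the HTTP call against the API's base URL (placeholder below).
resp = requests.request(
    hit["method"].upper(),
    "https://api.example.com" + hit["path"],
    params={"limit": 10},  # example query parameter
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```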

Unique Advantages

  • Zero Local Storage – No need to keep large OpenAPI files on disk; the server fetches and caches them in memory.
  • Fast Discovery – Semantic search with FAISS keeps endpoint lookup fast, even for specs over 100 KB.
  • Extensibility – The configuration supports arbitrary numbers of APIs, making it ideal for environments with many services or frequent spec updates.

In summary, the Baryhuang MCP Server Any OpenAPI provides a seamless bridge between Claude and complex REST APIs, turning natural‑language queries into fully validated API calls with minimal latency and no manual intervention.