About
A lightweight FastAPI server that facilitates real-time chat interactions with large language models served through backends such as Ollama or OpenAI. It manages sessions and tool approvals, and exposes configurable LLM settings via environment variables or CLI arguments.
Capabilities
Overview
The Mcp Http Host is a lightweight, FastAPI‑based server that exposes an LLM chat interface to AI assistants through the Model Context Protocol. It bridges a local or cloud‑based language model—whether an Ollama instance or an OpenAI‑compatible endpoint—with the MCP ecosystem, allowing assistants to send user messages, receive model responses, and invoke external tools in a structured manner. By abstracting the details of model invocation behind standard HTTP endpoints, developers can plug the server into any MCP‑compliant workflow without modifying their assistant code.
At its core, the server handles four key interactions: starting a new chat session, forwarding user messages to the LLM, receiving tool call suggestions from the model, and approving or denying those calls. When a user message arrives, the server forwards it to the configured LLM provider, optionally streams the reply, and returns a JSON payload that includes the assistant’s text, any proposed tool usage (with arguments), and a unique request identifier. The assistant can then decide whether to execute the suggested tool; if it chooses to, it posts an approval request back to the server, which in turn triggers the tool execution pipeline. This two‑step approval flow keeps sensitive operations under explicit user control while still enabling the model to suggest powerful actions.
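The exact routes and payload fields are not documented here, so the following Python sketch only illustrates the shape of that two-step flow; the endpoint paths (/chat, /approve), field names, and port are assumptions rather than the server's actual API.

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed host/port for a local deployment

# Step 1: forward a user message to the LLM via the host (endpoint path and
# payload fields are illustrative, not the documented API).
reply = requests.post(
    f"{BASE_URL}/chat",
    json={"session_id": "demo-session", "message": "List the files in the project root."},
    timeout=60,
).json()

print("Assistant text:", reply.get("text"))

# Step 2: if the model proposed a tool call, echo the request identifier back
# to approve it, which triggers the tool execution pipeline.
if reply.get("tool_call"):
    result = requests.post(
        f"{BASE_URL}/approve",
        json={"request_id": reply["request_id"], "approved": True},
        timeout=60,
    ).json()
    print("Tool result:", result)
```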
Key capabilities of the server include:
- Provider agnosticism: Switch between Ollama and OpenAI backends simply by setting environment variables or CLI flags.
- Dynamic configuration: All settings—model name, temperature, context window size, base URLs—are adjustable at launch time, facilitating experimentation and rapid iteration (a configuration sketch follows this list).
- Session management: Each chat session is isolated with its own working directory and state, ensuring that file‑system interactions remain contained.
- Tool integration: The server exposes a standard tool approval endpoint, allowing assistants to leverage external commands (e.g., filesystem access, code execution) while maintaining safety.
- Streaming support: When enabled, the server streams partial responses from the LLM, improving latency for long outputs.
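As a rough sketch of that configuration surface (the environment variable names and module entry point below are hypothetical, not the server's documented settings), switching backends at launch time might look like this:

```python
import os
import subprocess

# Hypothetical environment variable names; the real names are defined by the
# server's own configuration and CLI flags.
env = os.environ.copy()
env.update({
    "LLM_PROVIDER": "ollama",                  # or "openai" for an OpenAI-compatible endpoint
    "LLM_MODEL": "qwen2.5-coder:14b",          # any model available on the chosen backend
    "LLM_BASE_URL": "http://localhost:11434",  # Ollama default; point at the provider's API otherwise
    "LLM_TEMPERATURE": "0.2",
    "LLM_CONTEXT_WINDOW": "8192",
})

# Launch the host with the chosen settings; the module name is an assumption.
subprocess.run(["python", "-m", "mcp_http_host"], env=env, check=True)
```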
Typical use cases span rapid prototyping of code assistants, educational tooling where students can interact with a local LLM, and internal DevOps pipelines that need to run model‑guided scripts on a server. For example, an engineering team can deploy the MCP host locally, configure it to use a high‑capacity coder model, and let their AI assistant automatically fetch files, run tests, or generate documentation—all mediated through the MCP protocol. The server's clear separation of concerns and minimal configuration overhead make it an attractive choice for developers who want to harness LLMs in a controlled, tool‑aware environment.
Related Servers
MarkItDown MCP Server
Convert documents to Markdown for LLMs quickly and accurately
Context7 MCP
Real‑time, version‑specific code docs for LLMs
Playwright MCP
Browser automation via structured accessibility trees
BlenderMCP
Claude AI meets Blender for instant 3D creation
Pydantic AI
Build GenAI agents with Pydantic validation and observability
Chrome DevTools MCP
AI-powered Chrome automation and debugging
Explore More Servers
Aip MCP Server
Local and SSE-based Model Context Protocol server samples for quick prototyping
MCP Easy Copy
Quickly list and copy MCP services for Claude Desktop
Augments MCP Server
Real‑time framework documentation for Claude Code
Crypto Price Tracker MCP Server
Real‑time crypto watchlist with Google Sheets export
Shodan MCP
Unleash Shodan’s power via the Model Context Protocol
Supabase MCP Server on Phala Cloud
Secure Supabase integration in a TEE-enabled cloud environment