nlinhvu

SSE MCP Server Demo

MCP Server

Real‑time LLM tool execution with SSE and MCP

13 stars · 1 view
Updated Sep 17, 2025

About

A Spring Boot server exposing mathematical and date/time tools via Server‑Sent Events, implementing the Model Context Protocol for tool discovery. It enables LLM clients to call tools in real time with OAuth2 security and observability.

Capabilities

Resources: access data sources
Tools: execute functions
Prompts: pre-built templates
Sampling: AI model interactions

Tags: prometheus

Overview

The LLM SSE MCP with Observability demo shows how an AI assistant can interact with external tools in real time while remaining fully observable through a modern monitoring stack. By combining the Model Context Protocol (MCP) with Server‑Sent Events (SSE), this server enables low‑latency, event‑driven communication between a language model client and a set of lightweight utility services. Developers can therefore build chat experiences that feel instantaneous, even when the assistant must perform calculations or fetch time‑sensitive data.
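To make that client‑server flow concrete, here is a minimal sketch of an MCP client connecting over SSE, discovering the server's tools, and invoking one. It assumes the MCP Java SDK (io.modelcontextprotocol); the base URL and the "add" tool name are illustrative stand‑ins, not identifiers confirmed from the demo, and exact builder signatures may vary by SDK version.

```java
import io.modelcontextprotocol.client.McpClient;
import io.modelcontextprotocol.client.McpSyncClient;
import io.modelcontextprotocol.client.transport.HttpClientSseClientTransport;
import io.modelcontextprotocol.spec.McpSchema;

import java.time.Duration;
import java.util.Map;

public class McpSseClientDemo {
    public static void main(String[] args) {
        // Assumed server address; the demo's actual port may differ.
        var transport = HttpClientSseClientTransport.builder("http://localhost:8080").build();

        McpSyncClient client = McpClient.sync(transport)
                .requestTimeout(Duration.ofSeconds(10))
                .build();

        // MCP handshake: negotiates protocol version and capabilities.
        client.initialize();

        // Tool discovery: the server advertises its toolkit over the same channel.
        McpSchema.ListToolsResult tools = client.listTools();
        tools.tools().forEach(t -> System.out.println(t.name() + ": " + t.description()));

        // Invoke a hypothetical "add" tool with a JSON-style argument map.
        McpSchema.CallToolResult result = client.callTool(
                new McpSchema.CallToolRequest("add", Map.of("a", 2, "b", 3)));
        System.out.println(result.content());

        client.closeGracefully();
    }
}
```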

At its core, the MCP server exposes a small but powerful toolkit: arithmetic operations and date‑time helpers. The tools are discovered automatically by any MCP‑capable client and invoked through a simple JSON payload. Because the server streams responses via SSE, the language model receives partial results as they are produced, allowing for more natural, conversational flows. This is particularly valuable when the assistant needs to perform a sequence of operations or update the user progressively, such as solving multi‑step math problems or monitoring an alarm schedule.
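In a Spring AI‑based MCP server, which this demo's stack suggests, such tools can be declared as annotated bean methods and registered for discovery. The sketch below uses Spring AI's @Tool annotation and MethodToolCallbackProvider; the tool names and signatures are hypothetical placeholders for the demo's actual arithmetic and date‑time tools.

```java
import org.springframework.ai.tool.ToolCallbackProvider;
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.method.MethodToolCallbackProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

import java.time.LocalDateTime;

@Service
class MathDateTimeTools {

    // Hypothetical arithmetic tool; the demo's real tool names were not preserved.
    @Tool(description = "Add two numbers and return the sum")
    double add(double a, double b) {
        return a + b;
    }

    // Hypothetical date-time helper returning an ISO-8601 timestamp.
    @Tool(description = "Return the current date and time in ISO-8601 format")
    String currentDateTime() {
        return LocalDateTime.now().toString();
    }
}

@Configuration
class ToolConfig {

    // Registers the annotated methods so the MCP server advertises them
    // to clients during tool discovery.
    @Bean
    ToolCallbackProvider toolProvider(MathDateTimeTools tools) {
        return MethodToolCallbackProvider.builder().toolObjects(tools).build();
    }
}
```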

The demo’s architecture is split into three Spring Boot applications. An OAuth 2.0 authorization server protects the MCP endpoints, issuing bearer tokens that clients must present. The SSE MCP server hosts the tools and implements the MCP contract, while a separate web client aggregates multiple LLM providers (Claude, GPT, Gemini, Ollama) and connects to the MCP server. The client’s chat UI is built with Thymeleaf, showing how a front‑end can drive the entire workflow, from token acquisition to tool invocation and result rendering.
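On the MCP server side, the bearer‑token requirement could be enforced with a standard Spring Security resource‑server configuration along these lines. This is a sketch, not the demo's verified config; protecting all endpoints with JWT validation is an assumption about how the authorization server and MCP server are wired together.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class McpSecurityConfig {

    // Requires a valid JWT bearer token (issued by the OAuth 2.0 authorization
    // server) on every request, including the SSE and MCP message endpoints.
    @Bean
    SecurityFilterChain mcpSecurity(HttpSecurity http) throws Exception {
        http
            .authorizeHttpRequests(auth -> auth.anyRequest().authenticated())
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(Customizer.withDefaults()));
        return http.build();
    }
}
```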

Observability is a standout feature. Docker Compose brings together Prometheus, Tempo, Loki, and Grafana, giving developers instant access to metrics, traces, and logs. Monitoring the MCP server’s performance (request counts, latency, error rates) and correlating those signals with LLM usage offers deep insight into system health and user experience. This level of instrumentation is essential when deploying AI assistants in production, where latency and reliability directly affect user satisfaction.
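As one way to surface those metrics, tool invocations can be timed and counted with Micrometer, which Spring Boot Actuator exposes to Prometheus via the /actuator/prometheus endpoint when micrometer-registry-prometheus is on the classpath. The wrapper and meter names below are hypothetical; the demo's actual instrumentation may differ.

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Component;

import java.util.function.Supplier;

@Component
public class ToolMetrics {

    private final MeterRegistry registry;

    public ToolMetrics(MeterRegistry registry) {
        this.registry = registry;
    }

    // Counts each call and records its latency, tagged by tool name,
    // so Grafana can chart per-tool request rates and percentiles.
    public <T> T record(String toolName, Supplier<T> invocation) {
        registry.counter("mcp.tool.calls", "tool", toolName).increment();
        return Timer.builder("mcp.tool.latency")
                .tag("tool", toolName)
                .register(registry)
                .record(invocation);
    }
}
```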

In real‑world applications, this MCP server can power a variety of use cases: financial advisors that compute portfolio metrics on demand, scheduling assistants that set reminders and alarms, or educational tutors that solve math problems step‑by‑step. By leveraging SSE for instant feedback and MCP for discoverable tooling, developers can create AI experiences that are both responsive and maintainable, while the integrated observability stack ensures they remain reliable at scale.