by chin3

Multi-Agent Research POC Server

MCP Server

Local‑first multi‑agent research with Ollama and Brave Search

Updated Jun 3, 2025

About

A lightweight MCP server that orchestrates two agents—Searcher and Synthesizer—to perform live web research via Brave Search (API or MCP plugin) and synthesize insights, all powered by local LLMs on Ollama.

Capabilities

  • Resources: Access data sources
  • Tools: Execute functions
  • Prompts: Pre-built templates
  • Sampling: AI model interactions
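
To make these capability types concrete, here is a minimal sketch of how such a server might declare a tool, a resource, and a prompt using the official MCP Python SDK (FastMCP). The names and bodies are illustrative assumptions, not the POC's actual source:

```python
# Minimal sketch: declaring a tool, a resource, and a prompt with the official
# MCP Python SDK (FastMCP). All names and bodies here are illustrative, not
# taken from the POC's source.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("multi-agent-research-poc")

@mcp.tool()
def brave_search(query: str) -> str:
    """Run a live web search and return raw results as text."""
    return f"(stub) results for: {query}"  # a real handler would call Brave Search

@mcp.resource("research://notes/{topic}")
def get_notes(topic: str) -> str:
    """Expose stored research notes as a readable resource."""
    return f"Notes for {topic}"

@mcp.prompt()
def research_brief(topic: str) -> str:
    """Pre-built template framing a research task for the agents."""
    return f"Research the topic '{topic}' and summarize the key findings."

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```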

Overview

The Multi-Agent Research POC is a local‑first, multi‑agent framework that demonstrates how autonomous AI assistants can orchestrate research tasks by combining powerful open‑source language models with live web search. Built around the Model Context Protocol (MCP), it allows developers to expose a minimal set of tools—currently a Brave Search interface—to the agents and have them generate dynamic tool calls using a simple syntax. The system is designed to showcase the full MCP workflow: an AI client sends a prompt, receives a structured response that may include a tool call, the server dispatches that call to an external service (or another MCP endpoint), and the result is streamed back for further processing.
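
As a rough illustration of that loop, the sketch below prompts a local model through Ollama's HTTP API, scans the output for a tool-call tag, dispatches it, and feeds the result back for a final answer. The `<tool_call>` tag syntax and JSON payload shape are assumptions, not the project's documented format:

```python
# Illustrative MCP-style round trip: prompt a local model via Ollama's HTTP
# API, detect a tool call in its output, dispatch it, and feed the result
# back. The tag syntax and payload shape are assumptions for this sketch.
import json
import re
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def ask_model(prompt: str, model: str = "llama3") -> str:
    resp = requests.post(OLLAMA_URL, json={
        "model": model, "prompt": prompt, "stream": False,
    })
    resp.raise_for_status()
    return resp.json()["response"]

def run_turn(prompt: str, dispatch) -> str:
    """One prompt / tool-call / result round trip."""
    output = ask_model(prompt)
    match = TOOL_CALL_RE.search(output)
    if match is None:
        return output                      # plain answer, no tool needed
    call = json.loads(match.group(1))      # e.g. {"tool": "brave_search", "args": {...}}
    result = dispatch(call["tool"], call["args"])
    # Hand the tool result back to the model for a final answer
    return ask_model(f"{prompt}\n\nTool result:\n{result}\n\nAnswer:")
```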

What Problem Does It Solve?

In many research or data‑collection scenarios, a single large language model is insufficient because it lacks up‑to‑date knowledge or the ability to perform complex queries. The POC addresses this by splitting responsibilities across two collaborating agents—Searcher and Synthesizer. The Searcher performs real‑time web queries via Brave Search, while the Synthesizer ingests those results and produces concise summaries. This division of labor mirrors how human researchers work: first gather information, then distill it into actionable insights. By keeping all components local (Ollama for LLM inference and a self‑hosted MCP server), the system avoids network latency for inference, preserves privacy, and removes dependence on external APIs for core reasoning.
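
A minimal sketch of that division of labor might look like the following; the prompts and helper functions are hypothetical stand-ins for the POC's actual agent scripts:

```python
# Sketch of the Searcher -> Synthesizer hand-off described above. The prompt
# wording and helper parameters are hypothetical, not the POC's code.
def searcher(question: str, web_search) -> str:
    """Gather raw material: run the question as a live web query."""
    return web_search(question)

def synthesizer(question: str, raw_results: str, ask_model) -> str:
    """Distill the gathered results into a concise summary."""
    prompt = (
        f"Question: {question}\n\n"
        f"Search results:\n{raw_results}\n\n"
        "Write a concise, sourced summary answering the question."
    )
    return ask_model(prompt)

def research(question: str, web_search, ask_model) -> str:
    """First gather, then distill, mirroring a human researcher's workflow."""
    return synthesizer(question, searcher(question, web_search), ask_model)
```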

Core Features & Capabilities

  • Local LLM Integration: Uses Ollama to run open‑source models on a developer’s own hardware, ensuring low‑latency inference and full control over the model’s behavior.
  • Tool‑Call Detection: A lightweight parser identifies tool‑call tags in model output, automatically routing each call to the appropriate dispatcher.
  • Brave Search Access: Two modes are supported—direct API calls to the Brave Search endpoint or routing through a separate MCP plugin server. This flexibility lets teams choose between simplicity and full protocol compliance.
  • Agent Collaboration: The framework ships with two agents, but the architecture is agnostic to additional roles. Developers can add planners, summarizers, or domain‑specific tools without altering the core engine.
  • Extensible Tool Registry: A registry module maps tool names to handler functions, making it trivial to plug in new services such as CrunchbaseSearch or TwitterTrends (see the sketch after this list).
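
As referenced above, a registry of this kind can be as simple as a dictionary from tool names to handler functions. The sketch below assumes Brave's public web-search endpoint and its `X-Subscription-Token` header; the registry layout itself is illustrative, not the POC's actual module:

```python
# Sketch: an extensible tool registry plus a direct Brave Search handler.
# Assumes Brave's public web-search endpoint and X-Subscription-Token header.
import requests

BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"

def brave_search(args: dict) -> str:
    resp = requests.get(
        BRAVE_ENDPOINT,
        params={"q": args["query"]},
        headers={"X-Subscription-Token": args["api_key"]},
    )
    resp.raise_for_status()
    results = resp.json().get("web", {}).get("results", [])
    return "\n".join(f"{r['title']}: {r['url']}" for r in results[:5])

# Mapping tool names to handlers; registering a new tool is one line.
TOOL_REGISTRY = {
    "brave_search": brave_search,
    # "crunchbase_search": crunchbase_search,  # hypothetical future tool
}

def dispatch(tool: str, args: dict) -> str:
    handler = TOOL_REGISTRY.get(tool)
    if handler is None:
        raise KeyError(f"Unknown tool: {tool}")
    return handler(args)
```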

Use Cases & Real‑World Scenarios

  • Rapid Market Intelligence: A startup can query the latest AI startups in a region, automatically retrieve search results, and generate a concise briefing for stakeholders.
  • Academic Literature Review: Researchers can instruct the Searcher to fetch recent papers, then have the Synthesizer produce a literature map highlighting key findings.
  • Internal Knowledge Base Creation: Companies can feed queries to the system, have it pull current web data, and store the synthesized summaries in an internal wiki or Obsidian vault.
  • Chatbot Enhancement: Embedding the MCP server into a conversational agent allows the bot to answer up‑to‑date questions by internally invoking web search before responding.

Integration with AI Workflows

Developers can integrate this server into existing MCP‑compatible pipelines. The agent scripts expose a simple API: send a prompt, receive a structured JSON response with optional tool calls. Because the server follows MCP conventions, any client that supports tool invocation—such as Claude or other model‑hosting platforms—can interact with it seamlessly. Moreover, the modular design permits wrapping the entire system as a local REST API with FastAPI, or exposing it through a front end such as Chainlit or a Discord bot.
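
For example, a local REST wrapper with FastAPI might look like the following sketch, where `run_research` is a placeholder for the agent pipeline rather than a function the project actually exports:

```python
# Sketch: wrapping the pipeline as a local REST API with FastAPI. The
# run_research function is a placeholder for the agent pipeline above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Multi-Agent Research POC")

class ResearchRequest(BaseModel):
    question: str

class ResearchResponse(BaseModel):
    summary: str

def run_research(question: str) -> str:
    # Placeholder: call the Searcher/Synthesizer pipeline sketched earlier.
    return f"(stub) summary for: {question}"

@app.post("/research", response_model=ResearchResponse)
def research_endpoint(req: ResearchRequest) -> ResearchResponse:
    return ResearchResponse(summary=run_research(req.question))

# Run locally with: uvicorn server:app --reload
```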

Unique Advantages

  • Local‑First Design: Eliminates external network dependencies for inference, providing faster response times and stronger data privacy guarantees.
  • Protocol‑First Approach: By adhering strictly to MCP, the server can interoperate with any future tools or agents that adopt the same standard without code changes.
  • Rapid Extensibility: Adding a new tool is as simple as inserting a function and updating the registry, encouraging experimentation with diverse data sources.
  • Hackathon‑Ready: Originally conceived for the Microsoft AI Agents Hackathon, it showcases a production‑grade architecture that can be expanded into commercial or enterprise deployments with minimal friction.

Overall, the Multi-Agent Research POC demonstrates how a small, well‑structured MCP server can empower developers to build sophisticated, privacy‑preserving research assistants that combine local LLM inference with live web data, all while maintaining a clean, protocol‑driven architecture.