
Higress AI-Search MCP Server


Real‑time web and academic search for LLMs via Higress

Updated May 7, 2025

About

A Model Context Protocol server that injects up-to‑date web and academic search results into LLM responses using Higress’s ai-search plugin, supporting Google, Bing, Quark, arXiv, and internal knowledge bases.

Capabilities

  • Resources: access data sources
  • Tools: execute functions
  • Prompts: pre-built templates
  • Sampling: AI model interactions

Higress AI-Search MCP Server Demo

Overview

The Higress AI‑Search MCP Server bridges large language models (LLMs) with real‑time web and academic search through the Higress platform. It exposes an MCP tool that queries multiple search engines (Google, Bing, and Quark for general web data; arXiv for scholarly literature), letting AI assistants enrich their responses with up‑to‑date, authoritative information. This mitigates the “knowledge cutoff” problem that offline models face and gives developers a seamless way to surface fresh content directly within conversational flows.
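For example, a client can discover and invoke the server’s search tool with the official MCP Python SDK. This is a minimal sketch: the launch command and the tool name ai_search are illustrative assumptions, so check the project’s README for the exact values.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server over stdio. The command and package name below are
# assumptions for illustration, not values confirmed by this page.
server_params = StdioServerParameters(
    command="uvx",
    args=["higress-ai-search-mcp-server"],
)

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover the tools the server exposes.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])
            # Invoke the search tool ("ai_search" is a hypothetical name)
            # with a plain-text query.
            result = await session.call_tool(
                "ai_search",
                arguments={"query": "latest research on retrieval-augmented generation"},
            )
            print(result.content)

asyncio.run(main())
```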

Developers can deploy the server behind Higress, a cloud‑native API gateway that supports WebAssembly extensions. Once configured, the MCP tool routes search requests to Higress’s ai-search and ai-proxy plugins, which handle query execution and LLM inference respectively. The result is a single HTTP endpoint that accepts structured search prompts, forwards them to the chosen engine, and streams back concise answers or citations. Because the tool is part of the MCP ecosystem, it can be invoked by any client that understands the protocol (Claude, Gemini, or custom assistants) without bespoke integration code.
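As a rough illustration of that flow, the sketch below forwards a query to an assumed Higress gateway in OpenAI‑compatible chat format. The URL, port, and default model name are placeholders, not values taken from the project.

```python
import httpx

# Hypothetical forwarding step: relay the user's query to the Higress gateway
# as an OpenAI-compatible chat completion request. Higress's ai-search plugin
# can then enrich the prompt with live search results before the ai-proxy
# plugin hands it to the configured LLM.
HIGRESS_URL = "http://localhost:8080/v1/chat/completions"  # assumed address

def search_via_higress(query: str, model: str = "qwen-turbo") -> str:
    response = httpx.post(
        HIGRESS_URL,
        json={
            "model": model,
            "messages": [{"role": "user", "content": query}],
        },
        timeout=60.0,
    )
    response.raise_for_status()
    # Return the synthesized answer from the first completion choice.
    return response.json()["choices"][0]["message"]["content"]

print(search_via_higress("What changed in the latest Kubernetes release?"))
```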

Key features include:

  • Multi‑engine search: Switch between Google, Bing, and Quark for general queries, or use arXiv for academic research.
  • Internal knowledge integration: Optionally supply company handbooks, policy documents, or other proprietary data sets to keep internal answers consistent and compliant.
  • Configurable LLM backend: Choose any supported model (e.g., Qwen‑Turbo) via an environment variable, letting teams balance cost and performance; see the configuration sketch after this list.
  • Zero‑code client integration: Once the MCP server is registered, clients can call the tool with a simple JSON payload; no SDKs or custom adapters are required.
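The snippet below sketches the kind of environment-driven configuration such a deployment might use. The variable names (HIGRESS_URL, MODEL, INTERNAL_KNOWLEDGE_BASES) are assumptions for illustration and are not confirmed by this page.

```python
import os

# Illustrative configuration sketch; consult the server's README for the
# authoritative variable names and defaults.
config = {
    # Address of the Higress gateway's OpenAI-compatible endpoint (assumed).
    "higress_url": os.environ.get(
        "HIGRESS_URL", "http://localhost:8080/v1/chat/completions"
    ),
    # LLM backend used to synthesize answers (e.g., Qwen-Turbo).
    "model": os.environ.get("MODEL", "qwen-turbo"),
    # Optional comma-separated list of internal knowledge bases to consult.
    "knowledge_bases": [
        kb.strip()
        for kb in os.environ.get("INTERNAL_KNOWLEDGE_BASES", "").split(",")
        if kb.strip()
    ],
}

print(config)
```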

Typical use cases range from customer support bots that need to pull the latest product documentation, to research assistants that fetch recent papers, to internal help desks that consult policy manuals. In each scenario the MCP server handles the heavy lifting of search, ranking, and LLM synthesis, letting developers focus on conversational design rather than data retrieval logic. The result is a robust, extensible workflow in which AI assistants deliver accurate, up‑to‑date information in real time.