MCPSERV.CLUB
sarathsp06

SourceSage

MCP Server

Language-agnostic code memory for LLMs

Stale (50) · 7 stars · 1 view
Updated Sep 1, 2025

About

SourceSage is an MCP server that stores and retrieves code entities, relationships, patterns, and style conventions in a token‑efficient graph, enabling LLMs to quickly access language‑agnostic code knowledge.

Capabilities

  • Resources – Access data sources
  • Tools – Execute functions
  • Prompts – Pre-built templates
  • Sampling – AI model interactions


SourceSage is a Model Context Protocol (MCP) server engineered to give large language models a persistent, structured memory of codebases. By turning source files into a token‑efficient knowledge graph, it lets an LLM “remember” the architecture, patterns, and style conventions of a project across languages without bloating its context window. This solves the perennial problem of scaling code understanding: developers can query a vast repository from within an assistant, receive precise insights about function signatures or inheritance hierarchies, and keep that knowledge up‑to‑date as the code evolves.

The server’s core value lies in its language‑agnostic design. Rather than hard‑coding parsers for each language, SourceSage delegates analysis to the LLM itself. The model parses source files, extracts entities (classes, functions, modules), relationships (calls, imports, inheritance), patterns (common idioms), and style conventions. These are then registered through MCP tools into a compact graph structure that consumes far fewer tokens than raw source. Developers can incrementally update the graph whenever files change, ensuring that the assistant’s memory reflects the current state without re‑ingesting entire codebases.

Key capabilities include:

  • Token‑Efficient Storage – The graph representation stores only essential metadata, allowing thousands of entities to fit within a modest token budget.
  • Fast Retrieval – Queries against the graph return relevant entities or patterns in milliseconds, enabling real‑time assistance during coding sessions.
  • Incremental Updates – New or modified files trigger targeted re‑analysis, preventing redundant data and keeping the graph lightweight.
  • Rich Toolset – Dedicated MCP tools for registering entities, relationships, patterns, and style conventions give developers fine‑grained control over what is remembered.
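The incremental‑update idea can be illustrated with content hashing: only files whose hash differs from the last snapshot need re‑analysis. This is a sketch of the general technique, not SourceSage's internal mechanism.

```python
import hashlib


def file_digest(text: str) -> str:
    """Content hash used to detect whether a file changed."""
    return hashlib.sha256(text.encode()).hexdigest()


def changed_files(snapshot: dict, current: dict) -> list:
    """Return paths whose content differs from the stored snapshot,
    so only those files are re-analyzed and re-registered."""
    return [path for path, text in current.items()
            if snapshot.get(path) != file_digest(text)]


# 'a.py' was analyzed before; it has since changed, and 'b.py' is new.
snapshot = {"a.py": file_digest("def f(): pass")}
current = {"a.py": "def f(): return 1", "b.py": "x = 2"}
print(changed_files(snapshot, current))  # ['a.py', 'b.py']
```

Unchanged files are skipped entirely, which keeps both analysis cost and graph size proportional to what actually changed.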

In practice, SourceSage shines in scenarios where an AI assistant must navigate large, multi‑language codebases. For example, a developer can ask the assistant to list all classes that implement a particular interface across Python and JavaScript, or to suggest refactorings that align with established style conventions. The assistant can also surface patterns—like a common factory method implementation—so that newcomers quickly grasp idiomatic usage. Because the server is tightly integrated with MCP, it plugs seamlessly into existing AI workflows: a client like Claude can launch the server via configuration, then invoke tools to populate or query the memory during a conversation.
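MCP clients such as Claude Desktop typically launch servers from a JSON configuration file. The fragment below is a sketch of that pattern only: the `command` and `args` values are assumptions, so check the project's README for the actual invocation.

```json
{
  "mcpServers": {
    "sourcesage": {
      "command": "uvx",
      "args": ["sourcesage"]
    }
  }
}
```

Once registered this way, the client starts the server automatically and its tools become available inside the conversation.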

What sets SourceSage apart is its lightweight, LLM‑driven approach to code memory. By leveraging the model’s own understanding of syntax and semantics, it sidesteps the need for language‑specific parsers while still delivering precise, actionable knowledge. This makes it an attractive choice for teams looking to embed deep code awareness into their AI assistants without incurring the overhead of traditional static analysis pipelines.