meGPT

MCP Server

Personalized LLM built from an author’s own content

278 stars

1 view

Updated 21 days ago

About

meGPT aggregates an author’s books, blogs, social media archives, podcasts, videos and other materials to train a language model that can answer questions and generate summaries in the author’s voice. It supports bulk extraction from YouTube, Twitter, Medium, Mastodon and more.

Capabilities

Resources: Access data sources
Tools: Execute functions
Prompts: Pre-built templates
Sampling: AI model interactions

Overview of the meGPT MCP Server

The meGPT server tackles a common pain point for creators and experts who wish to turn their extensive personal archives into an AI‑powered conversational agent. Rather than relying on commercial persona builders, meGPT offers a self‑hosted, open‑source solution that ingests an author’s own books, blog posts, social media archives, podcasts, videos, and slide decks. By compiling this data into a structured knowledge base, the server trains or fine‑tunes a large language model that can answer questions and generate summaries in the author’s distinctive voice. This empowers individuals to monetize their expertise, preserve their legacy, or provide a personalized support tool without ceding control of their content to third‑party services.

At its core, the server exposes an MCP interface that supplies a suite of resources and tools to AI assistants. The resource layer contains the curated corpus: text from PDFs, RSS feeds, GitHub projects, and transcriptions of audio and video content. The tool layer offers functions such as search, summarize, and cite that an assistant can invoke to retrieve relevant passages, produce concise overviews, or reference the original source. A prompt template guides the assistant to respond in the author’s tone, while a sampling configuration controls creativity and factuality. Together, these components let the assistant work directly with the author’s data, so developers can embed a highly personalized chatbot into applications ranging from customer support to educational platforms.
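To make the split between the resource layer and the tool layer concrete, here is a minimal sketch in plain Python. It is not the server's actual code: the `Passage` schema, the word-overlap ranking, the sample corpus, and the `search` and `cite` signatures are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """One chunk of the author's corpus (hypothetical schema)."""
    source: str  # e.g. a blog path or a book chapter
    kind: str    # e.g. "blog", "podcast", "pdf"
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def search(corpus: list[Passage], query: str, limit: int = 3) -> list[Passage]:
    """Rank passages by the number of query words they share; drop non-matches."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(p.text)), p) for p in corpus]
    matches = [item for item in scored if item[0] > 0]
    ranked = sorted(matches, key=lambda item: item[0], reverse=True)
    return [p for _, p in ranked[:limit]]


def cite(passage: Passage) -> str:
    """Format a short reference back to the original source."""
    return f"[{passage.kind}] {passage.source}"


# Illustrative corpus; a real one would come from the ingested archives.
CORPUS = [
    Passage("blog/cloud-costs", "blog", "Cloud costs fall when teams measure utilization."),
    Passage("podcast/ep-12", "podcast", "We talked about measuring carbon in cloud workloads."),
    Passage("book/chapter-3", "pdf", "Performance tuning begins with good instrumentation."),
]

hits = search(CORPUS, "cloud costs")
citations = [cite(p) for p in hits]
```

A real deployment would likely replace the word-overlap score with an embedding or full-text index, but the shape stays the same: the corpus is data, and the tools are small functions the assistant calls over it.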

Key capabilities include:

Automated content ingestion: Scripts extract and normalize data from Twitter archives, Medium RSS feeds, YouTube URLs (individual videos, playlists, channels), and local PDFs or slide decks.
Multimodal processing: Audio from podcasts and YouTube videos is transcribed, providing rich Q&A material; slide images are converted to text for contextual understanding.
Open‑source licensing: The repository is released under a Creative Commons Attribution-ShareAlike license, encouraging community contributions and reuse for other authors.
Low‑friction development: The codebase was generated by AI assistants (ChatGPT 4, Claude Sonnet) and is intentionally simple for non‑Python experts to modify.
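The ingestion step in the list above amounts to mapping differently shaped records onto one common schema. The sketch below shows that idea only; the `Document` fields and the input keys (`url`, `full_text`, `title`, `summary`, `link`) are assumptions for illustration, not the repository's actual format.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    """Common record that every source is normalized into (hypothetical)."""
    source_type: str
    url: str
    text: str


def normalize_text(raw: str) -> str:
    """Strip HTML-like tags and collapse runs of whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", without_tags).strip()


def from_tweet(record: dict) -> Document:
    """Convert one archive entry; the key names are assumed."""
    return Document("twitter", record["url"], normalize_text(record["full_text"]))


def from_feed_entry(record: dict) -> Document:
    """Convert one RSS entry, joining its title and summary; key names are assumed."""
    body = f'{record["title"]}. {record["summary"]}'
    return Document("medium", record["link"], normalize_text(body))


tweet_doc = from_tweet(
    {"url": "https://example.com/t/1", "full_text": "Hello   world\n#cloud"}
)
feed_doc = from_feed_entry(
    {
        "title": "My post",
        "summary": "<p>First  paragraph.</p>",
        "link": "https://example.com/p/1",
    }
)
```

Keeping one converter per source makes it easy to add a new archive type without touching the code that consumes `Document` records.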

Real‑world use cases span from an author creating a virtual Q&A companion for book readers, to a technical speaker offering on‑demand explanations of their talks, to an educator building a domain‑specific tutor that cites original research. In each scenario, the MCP server’s structured data and toolset allow an AI assistant to pull authoritative answers directly from the author’s own material, ensuring authenticity and consistency.

Integrating meGPT into an AI workflow is straightforward: developers register the server’s MCP endpoint, configure the assistant’s tool set to include search and summarize, and optionally supply a prompt that mimics the author’s voice. The assistant can then query the server, retrieve relevant snippets, and compose responses that reference the original sources. Grounding answers in retrieved passages reduces the risk of hallucination and makes explanations traceable, which matters in domains where accuracy is paramount.
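The last step of that workflow, composing a response that references its sources, can be sketched as a prompt builder that numbers each retrieved snippet. The function name `build_prompt`, the prompt wording, and the `(source, text)` snippet shape are all hypothetical; the actual template shipped with the server may differ.

```python
def build_prompt(voice_hint: str, question: str, snippets: list[tuple[str, str]]) -> str:
    """Assemble a prompt from a voice hint, numbered snippets, and the question.

    Each snippet is a (source, text) pair; the layout is illustrative only.
    """
    lines = [voice_hint, "", "Sources:"]
    for number, (source, text) in enumerate(snippets, start=1):
        lines.append(f"[{number}] {source}: {text}")
    lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Answer using only the sources above and cite them by number.")
    return "\n".join(lines)


prompt = build_prompt(
    "Answer in the author's voice.",
    "How do I cut cloud costs?",
    [("blog/cloud-costs", "Cloud costs fall when teams measure utilization.")],
)
```

Numbering the snippets gives the model a stable handle for citations, so a reader can trace each claim in the answer back to a specific passage.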