MCPSERV.CLUB
pingcap

TiDB Python AI SDK

MCP Server

Unified vector, text, and image search for AI apps

Updated 15 days ago

About

The TiDB Python AI SDK provides a seamless interface to TiDB’s vector, full‑text, and hybrid search capabilities. It automatically embeds data, supports multi‑modal storage, advanced filtering, reranking, and full transaction management for AI-driven applications.

Capabilities

  • Resources: Access data sources
  • Tools: Execute functions
  • Prompts: Pre-built templates
  • Sampling: AI model interactions

Overview

The TiDB Python AI SDK bridges the TiDB distributed database and modern AI workloads. By exposing vector, full‑text, and hybrid search capabilities through a unified Python interface, it allows developers to treat structured data as a knowledge base that can be queried with natural language or multimodal prompts. The SDK automatically handles embedding generation, vector storage, and optional reranking, so developers do not have to manage embeddings, index creation, or model deployment themselves.

The core value proposition is semantic search over relational data. Traditional SQL queries match on exact values or keyword patterns and often need manual joins, whereas the SDK lets a user submit a free‑text question or an image as the query. The SDK embeds the query, performs a nearest‑neighbor search against the stored vectors, and returns records ranked by relevance. This makes it possible to build conversational agents that answer questions about internal databases, recommend products based on user preferences, or surface the most relevant documents in a knowledge base, all without writing custom inference pipelines.
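The embed, search, and rank flow described above can be sketched in plain Python. This is a conceptual illustration only, not the SDK's API: `toy_embed` is a hypothetical bag‑of‑words stand‑in for a real embedding model, and `build_index` and `semantic_search` are illustrative names. In practice the SDK and TiDB would handle embedding and vector indexing.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(rows):
    """Store each row together with its vector, as the SDK is described to do."""
    return [(row, toy_embed(row["text"])) for row in rows]


def semantic_search(query: str, index, k: int = 2):
    """Embed the query, score every stored vector, and return the top-k rows."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), row) for row, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


ROWS = [
    {"id": 1, "text": "How to reset your router password"},
    {"id": 2, "text": "Shipping times for international orders"},
    {"id": 3, "text": "Router keeps dropping the wifi connection"},
]

index = build_index(ROWS)
results = semantic_search("router password reset", index, k=2)
for score, row in results:
    print(f"{score:.3f}  {row['id']}  {row['text']}")
```

This sketch scans every row, which only works for small collections; a real deployment would rely on a vector index for approximate nearest‑neighbor search.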

Key features include:

  • Automatic embedding: Text fields are automatically converted to high‑dimensional vectors using configurable provider models (e.g., OpenAI embeddings). The vector field is stored alongside the original data, enabling instant semantic queries.
  • Multimodal support: Beyond text, the SDK can embed images and other modalities, allowing image‑to‑image or text‑to‑image retrieval directly within the database.
  • Advanced filtering & reranking: After a vector search, developers can apply SQL‑style filters and optionally rerank results with a dedicated reranker model for finer relevance control.
  • Transactional consistency: Insert, update, delete, and search operations all run inside TiDB transactions, so ACID guarantees hold even when embeddings are updated or new data is ingested.
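To make the filtering and reranking feature concrete, here is a minimal sketch of a two‑stage pipeline in plain Python. It does not use the SDK: the `Hit` fields, `apply_filter`, and `toy_rerank_score` are all hypothetical, and the term‑overlap scorer is only a stand‑in for a real reranker model.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    id: int
    text: str
    category: str
    price: float
    distance: float  # vector distance from the first-stage search; lower is closer


def apply_filter(hits, category: str, max_price: float):
    """Equivalent in spirit to: WHERE category = ? AND price <= ?"""
    return [h for h in hits if h.category == category and h.price <= max_price]


def toy_rerank_score(query: str, text: str) -> float:
    """Hypothetical reranker stand-in: fraction of query terms found in the text."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def filter_and_rerank(query: str, hits, category: str, max_price: float, top_n: int = 2):
    kept = apply_filter(hits, category, max_price)
    # Highest rerank score first; vector distance breaks ties.
    kept.sort(key=lambda h: (-toy_rerank_score(query, h.text), h.distance))
    return kept[:top_n]


# First-stage results, already ordered by vector distance.
HITS = [
    Hit(2, "wired studio headphones", "audio", 89.0, 0.10),
    Hit(1, "wireless noise cancelling headphones", "audio", 199.0, 0.12),
    Hit(3, "wireless gaming mouse", "peripherals", 49.0, 0.18),
    Hit(4, "wireless earbuds with charging case", "audio", 59.0, 0.21),
]

top = filter_and_rerank("wireless headphones", HITS, category="audio", max_price=200.0)
for h in top:
    print(h.id, h.text)
```

Here the filter drops the non‑audio item, and the reranker moves the hit that matches both query terms ahead of one that was closer by vector distance alone.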

Real‑world use cases span knowledge base retrieval, recommendation engines, anomaly detection in logs, and multimodal content discovery. For example, a support chatbot can query the product database with natural language to fetch the most relevant troubleshooting articles, while an e‑commerce platform can surface images that match a user’s search intent. In data science workflows, analysts can quickly prototype semantic similarity searches without setting up separate vector stores.

Integration into AI pipelines is straightforward: the SDK's operations can be exposed as tools through MCP, so Claude or another assistant can issue a tool call that returns structured results. Because the SDK exposes a standard set of SQL‑like operations, it fits naturally into existing ETL or data‑exploration workflows. Its main advantage lies in combining the scalability and reliability of TiDB with the flexibility of vector search, all managed through a single Python library that hides the underlying complexity.
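As a rough illustration of what such a tool call looks like on the wire, the sketch below builds an MCP `tools/call` request (a JSON‑RPC 2.0 message) and reads text content out of a response. The tool name `search_table`, its arguments, and the sample response payload are assumptions made up for this example; the server's actual tool names and schemas may differ.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_text(response: dict) -> list:
    """Return the text items from a tool result, raising if the tool reported an error."""
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool call failed")
    return [item["text"] for item in result["content"] if item.get("type") == "text"]


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "search_table",
    {"table": "articles", "query": "reset router password", "limit": 3},
)
wire = json.dumps(request)

# Hand-written sample response in the MCP tool-result shape (not real server output).
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": json.dumps({"id": 1, "title": "Reset your router"})}
        ],
        "isError": False,
    },
}

rows = [json.loads(text) for text in extract_text(sample_response)]
print(wire)
print(rows)
```

An MCP client library would normally build and send these messages for you; the sketch only shows the message shape an assistant's tool call takes.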