
Databricks MCP Server


MCP-powered bridge to Databricks APIs


About

A Model Context Protocol (MCP) server that exposes Databricks REST API functionality as MCP tools, enabling LLMs to manage clusters, jobs, notebooks, and more, with async support throughout.

Capabilities

- Resources: access data sources
- Tools: execute functions
- Prompts: pre-built templates
- Sampling: AI model interactions

Overview

The JusttryAI Databricks MCP Server is a specialized Model Context Protocol (MCP) server that bridges large language models with the full breadth of Databricks functionality. By exposing Databricks REST APIs as MCP tools, it lets AI assistants such as Claude, or any other MCP-capable LLM, directly manage clusters, orchestrate jobs, and manipulate notebooks without leaving the conversational context. This eliminates the need for manual API calls or separate dashboards, so developers can embed data engineering workflows into chat-based interfaces or automated pipelines.
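To make that concrete, here is a minimal sketch of what one such tool could look like, written with the official MCP Python SDK (FastMCP) and httpx. The tool name, environment variable names, and module layout are illustrative assumptions rather than this project's actual code; the `/api/2.0/clusters/list` endpoint itself is the documented Databricks Clusters API.

```python
import os

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("databricks")

# Workspace URL and personal access token come from the environment;
# these variable names are illustrative, not necessarily the project's own.
HOST = os.environ["DATABRICKS_HOST"]  # e.g. https://<workspace>.cloud.databricks.com
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}


@mcp.tool()
async def list_clusters() -> list[dict]:
    """Return id, name, and state for every cluster in the workspace."""
    async with httpx.AsyncClient(base_url=HOST, headers=HEADERS) as client:
        resp = await client.get("/api/2.0/clusters/list")
        resp.raise_for_status()
        return [
            {
                "cluster_id": c["cluster_id"],
                "cluster_name": c["cluster_name"],
                "state": c["state"],
            }
            for c in resp.json().get("clusters", [])
        ]


if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```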

What problem does it solve? Many organizations rely on Databricks for data processing, machine learning, and analytics. However, interacting with the platform typically requires knowledge of its REST API, authentication tokens, or the Databricks UI. The MCP server abstracts these details away: an LLM can issue a high‑level command like “run the sales forecasting job” or “export the marketing notebook to DBFS,” and the server translates that into precise API calls. This reduces friction for data scientists, analysts, and DevOps teams who want to prototype or automate workflows through natural language rather than code.
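As a rough illustration of that translation step, a request like "run the sales forecasting job" reduces to two documented Jobs API 2.1 calls: resolve the job by name, then trigger a run. The helper below is a sketch under that assumption, not code from this repository; it expects an httpx client already configured with the workspace URL and auth header as in the earlier example.

```python
import httpx


async def run_job_by_name(client: httpx.AsyncClient, name: str) -> int:
    """Resolve a job by its display name, start a run, and return the run_id."""
    # Jobs API 2.1 supports filtering the job list by name.
    resp = await client.get("/api/2.1/jobs/list", params={"name": name})
    resp.raise_for_status()
    jobs = resp.json().get("jobs", [])
    if not jobs:
        raise ValueError(f"no job named {name!r}")

    # Trigger a one-off run of the first match.
    resp = await client.post("/api/2.1/jobs/run-now", json={"job_id": jobs[0]["job_id"]})
    resp.raise_for_status()
    return resp.json()["run_id"]
```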

Key capabilities are delivered as a set of intuitive tools. Developers can list, create, start, or terminate clusters; enumerate jobs and notebooks; execute SQL queries; and explore DBFS paths, all through simple, well-defined tool calls. Each tool is fully asynchronous, leveraging Python's asyncio to keep the server responsive even under heavy load. The integration with FastAPI provides a lightweight, testable web interface that can be deployed behind any HTTPS gateway or serverless platform.
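The SQL capability illustrates that async pattern. The sketch below submits a statement through Databricks' documented SQL Statement Execution API and polls for completion without blocking the event loop; the polling interval and warehouse handling are simplifying assumptions, not the project's actual implementation.

```python
import asyncio

import httpx


async def execute_sql(client: httpx.AsyncClient, warehouse_id: str, statement: str) -> dict:
    """Submit a SQL statement and poll until it reaches a terminal state."""
    # wait_timeout="0s" makes the API return immediately so we can poll asynchronously.
    resp = await client.post(
        "/api/2.0/sql/statements",
        json={"warehouse_id": warehouse_id, "statement": statement, "wait_timeout": "0s"},
    )
    resp.raise_for_status()
    stmt = resp.json()

    while stmt["status"]["state"] in ("PENDING", "RUNNING"):
        await asyncio.sleep(2)  # yield to the event loop between polls
        resp = await client.get(f"/api/2.0/sql/statements/{stmt['statement_id']}")
        resp.raise_for_status()
        stmt = resp.json()

    return stmt  # includes final status and, on success, a result manifest
```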

Real‑world scenarios abound. A data analyst can ask an AI assistant to “spin up a new cluster with 4 workers” and immediately begin running exploratory notebooks, all while the assistant keeps track of the cluster’s lifecycle. A data engineer might instruct the model to “run job 1234 and export the results to S3,” orchestrating complex pipelines without writing scripts. In continuous‑integration contexts, an LLM can trigger Databricks jobs as part of a build pipeline, logging results back to the chat for instant feedback. Because the server exposes tools rather than raw APIs, developers can compose higher‑level workflows—such as “if the job fails, create a new cluster and retry”—in a declarative manner that aligns with AI‑first design principles.
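For instance, the retry portion of that "if the job fails, retry" workflow could be composed from the same Jobs API endpoints the server wraps. The sketch below is hypothetical orchestration logic built on the documented run-now and runs/get endpoints; in practice this composition would live in the LLM's tool-calling loop or a thin client, not in the server itself.

```python
import asyncio

import httpx

TERMINAL = ("TERMINATED", "SKIPPED", "INTERNAL_ERROR")


async def run_with_retry(client: httpx.AsyncClient, job_id: int, retries: int = 1) -> str:
    """Run a job; if the run does not succeed, retry up to `retries` times."""
    for _ in range(retries + 1):
        resp = await client.post("/api/2.1/jobs/run-now", json={"job_id": job_id})
        resp.raise_for_status()
        run_id = resp.json()["run_id"]

        # Poll the run until it reaches a terminal lifecycle state.
        while True:
            resp = await client.get("/api/2.1/jobs/runs/get", params={"run_id": run_id})
            resp.raise_for_status()
            state = resp.json()["state"]
            if state["life_cycle_state"] in TERMINAL:
                break
            await asyncio.sleep(5)

        if state.get("result_state") == "SUCCESS":
            return f"run {run_id} succeeded"

    return f"job {job_id} still failing after {retries + 1} attempts"
```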

Unique advantages include its tight coupling to the MCP protocol, which is emerging as a standard for AI-tool interaction. This ensures compatibility with any LLM client that supports MCP, enabling plug-and-play integration across platforms. The server's async architecture keeps latency low under concurrent load, making it suitable for production workloads where many tool calls may be in flight at once. Finally, the open-source nature of the project allows teams to audit, extend, or adapt the tool set to their own Databricks customizations, such as adding support for new job parameters or notebook metadata, without vendor lock-in.