Nice to see the proposal for an official MCP server for Airflow. It's been
really magical to see what a simple LLM can do with an Airflow MCP server
built just from the APIs.

A few things that I noticed in my experience:
- The number of tools that the OpenAPI spec generates is huge. Most
MCP clients (*Claude, VS Code with GitHub Copilot, Cursor, Windsurf*) limit
the number of tools to 100. (*The read-only mode creates fewer tools in
comparison*.)
- MCP servers are not just tools. There are other things as well, like
resources and prompts. Prompts are super helpful for debugging, for
example. They are a way of teaching the LLM about Airflow. Say I want to
have a failing task investigated. A prompt can give the LLM a step-by-step
process for carrying out the investigation.
- Where do you run the MCP server? I wouldn't want my laptop to do the
heavy processing, which is an argument for SSE instead of stdio.
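
To make the first point concrete, here is a rough sketch of how one could count the tools an OpenAPI spec would produce and trim them to a read-only subset under a client limit. Everything in it is an assumption for illustration: the toy spec, the `list_operations` and `select_tools` helpers, and the one-tool-per-operation mapping are hypothetical, and this is not how any particular Airflow MCP server does it.

```python
from typing import Any

# HTTP verbs that count as operations in an OpenAPI path item.
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(spec: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, operationId) for every operation in the spec."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS and isinstance(op, dict):
                name = op.get("operationId", f"{method}_{path}")
                ops.append((method.upper(), path, name))
    return ops


def select_tools(
    spec: dict[str, Any], read_only: bool = True, max_tools: int = 100
) -> list[tuple[str, str, str]]:
    """Keep only GET operations in read-only mode, then cap the result."""
    ops = list_operations(spec)
    if read_only:
        ops = [op for op in ops if op[0] == "GET"]
    return ops[:max_tools]


# Toy spec (hypothetical, loosely shaped like a REST API for DAGs).
spec = {
    "paths": {
        "/dags": {"get": {"operationId": "get_dags"}},
        "/dags/{dag_id}": {
            "parameters": [{"name": "dag_id", "in": "path"}],
            "get": {"operationId": "get_dag"},
            "patch": {"operationId": "patch_dag"},
            "delete": {"operationId": "delete_dag"},
        },
        "/dags/{dag_id}/dagRuns": {
            "get": {"operationId": "get_dag_runs"},
            "post": {"operationId": "trigger_dag_run"},
        },
    }
}

all_ops = list_operations(spec)
read_only_ops = select_tools(spec)
print(len(all_ops), "operations in total,", len(read_only_ops), "in read-only mode")
```

Truncating at `max_tools` is the crudest possible policy; in practice one would probably pick tools by tag or by use case rather than by position in the spec.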

This is why I chose two different paths for using an MCP server with
Airflow, which I intend to talk about at the summit.

1. AI-Augmented Airflow - This helped me add a chat interface inside
Airflow using a plugin to talk to an Airflow instance (read only mode).

2. Airflow-Powered AI - Experimenting with this has been totally magical:
it shows how powerful AI can become when it has access to Airflow. Give it
a directory structure to maintain the DAGs as well, and it can write DAGs
on the fly. I totally see LLMs eventually needing a scheduler, although a
complete Airflow just for an LLM might seem a bit overkill to the rest of
the community.

I chose to build this on top of the OpenAPI spec because that was the only
way to get proper RBAC enabled.

I have so many points to discuss. Would love to hear from the community and
then take it forward.

Thanks,
Avi



On Wed, May 28, 2025 at 6:32 PM Aritra Basu <aritrabasu1...@gmail.com>
wrote:

> I definitely think there's potential to interact with an airflow MCP
> server. Though I think I'd be interested to see how many and how frequently
> people are making use of MCP servers in the wild before investing effort in
> building and maintaining one for airflow. I'm sure the data is available
> out there, just needs finding.
> --
> Regards,
> Aritra Basu
>
> On Wed, 28 May 2025, 11:18 pm Julian LaNeve, <jul...@astronomer.io.invalid
> >
> wrote:
>
> > I think this would be interesting now that the Streamable HTTP spec <
> >
> https://modelcontextprotocol.io/specification/2025-03-26/basic/transports>
> > is out. I think in theory we could publish this first as an Airflow
> > provider that installs a plugin to expose an MCP endpoint, as a PoC -
> this
> > becomes a much nicer experience than a local stdio one.
> > --
> > Julian LaNeve
> > CTO
> >
> > Email: jul...@astronomer.io
> >  <mailto:jul...@astronomer.io>Mobile: 330 509 5792
> >
> > > On May 28, 2025, at 1:25 PM, Shahar Epstein <sha...@apache.org> wrote:
> > >
> > > Dear community,
> > >
> > > Following the thread on Slack [1], initiated by Jason Sebastian Kusuma,
> > I'd
> > > like to start an effort to officially support MCP in Airflow's
> codebase.
> > >
> > > *Some background *
> > > Model Context Protocol (MCP) is an open standard, open-source framework
> > > that standardizes the way AI models like LLM integrate and share data
> > with
> > > external tools, systems and data sources. Think of it as a "USB-C for
> > AI" -
> > > a universal connector that simplifies and standardizes AI
> integrations. A
> > > notable example of an MCP server is GitHub's official implementation
> > [3], which
> > > allows LLMs such as Claude, Copilot, and OpenAI (or: "MCP clients") to
> > > fetch pull request details, analyze code changes, and generate review
> > > summaries.
> > >
> > > *How could an MCP server be useful in Airflow?*
> > > Imagine the possibilities when LLMs can seamlessly interact with
> > Airflow’s
> > > API: triggering DAGs using natural language, retrieving DAG run
> history,
> > > enabling smart debugging, and more. This kind of integration opens the
> > door
> > > to a more intuitive, conversational interface for workflow
> orchestration.
> > >
> > > *Why do we need to support it officially?*
> > > Quid pro quo - LLMs become an integral part of the modern development
> > > experience, while Airflow evolves into the go-to for orchestrating AI
> > > workflows. By officially supporting it, we’ll enable multiple users to
> > > interact with Airflow through their LLMs, streamlining automation and
> > > improving accessibility across diverse workflows. All of that is viable
> > > with relatively small development effort (see next paragraph).
> > >
> > > *How should it be implemented?*
> > > As of today, there have been several implementations of MCP servers for
> > > Airflow API, the most visible one [4] made by Abhishek Bhakat from
> > > Astronomer.
> > > The efforts of implementing it and maintaining it in our codebase
> > shouldn't
> > > be too cumbersome (at least in theory), as we could utilize packages
> like
> > > fastmcp to auto-generate the server using the existing OpenAPI specs.
> I'd
> > > be very happy if Abhishek could share his experience in this thread.
> > >
> > > *Where else could we utilize MCP?*
> > > Beyond the scope of the public API, I could also imagine using it to
> > > communicate with Breeze.
> > >
> > > *How do we proceed from here?*
> > > Feel free to share your thoughts here in this discussion.
> > > If there are no objections, I'll be happy to start working on an AIP.
> > >
> > >
> > > Sincerely,
> > > Shahar Epstein
> > >
> > >
> > > *References:*
> > > [1] Slack discussion,
> > >
> https://apache-airflow.slack.com/archives/C06K9Q5G2UA/p1746121916951569
> > > [2] Introducing the model context protocol,
> > > https://www.anthropic.com/news/model-context-protocol
> > > [3] GitHub Official MCP server,
> > https://github.com/github/github-mcp-server
> > > [4] Unofficial MCP Server made by Abhishek Bhakat,
> > > https://github.com/abhishekbhakat/airflow-mcp-server
> >
> >
>
