Re: [DISCUSS] Apache Airflow official MCP Server

Bryan Corder Thu, 29 May 2025 18:51:09 -0700

In order to bring value, we might want to think beyond just wrapping the
API. As Kaxil just showed, it's easy to create something with 10 lines of
code and FastMCP.


However, the Airflow API was made for Airflow operators' consumption, not
necessarily for LLM consumption. An endpoint called "Delete DAG" with the
description "Delete a specific DAG" is very easy to understand for any user
who has already navigated to the Airflow API spec, but it is maybe not the
best tool description for an LLM. I think we'd want to either exclude it or
add additional context so the LLM knows it's destructive.

In addition, LLMs can struggle with tool selection when you give them 80
tools to work with. Things in the middle sometimes get lost in the context.
There are ways to customize FastMCP (
https://gofastmcp.com/servers/openapi#custom-route-maps) to cut down the
list of options, should you choose.
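
A minimal sketch of the trimming idea, done on the OpenAPI spec itself
before it is handed to FastMCP. It does not use FastMCP's route maps; the
function name, the default method lists and the warning text are all
assumptions made up for illustration.

```python
import copy

# HTTP verbs that can appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def trim_spec_for_llm(spec, drop_methods=("delete",), warn_methods=("post", "patch", "put")):
    """Return a copy of an OpenAPI spec with fewer, better-labelled operations.

    Hypothetical helper: operations in `drop_methods` are removed entirely,
    operations in `warn_methods` get a warning prepended to their description.
    """
    trimmed = copy.deepcopy(spec)  # leave the caller's spec untouched
    for path, item in list(trimmed.get("paths", {}).items()):
        for method in list(item):
            if method not in HTTP_METHODS:
                continue  # keep non-operation keys such as "parameters"
            if method in drop_methods:
                del item[method]  # never expose this operation as a tool
            elif method in warn_methods:
                description = item[method].get("description", "")
                item[method]["description"] = (
                    "WARNING: this operation changes Airflow state. " + description
                ).strip()
        # Drop paths that have no operation left after trimming.
        if not any(key in HTTP_METHODS for key in item):
            del trimmed["paths"][path]
    return trimmed
```

The trimmed dict could then be passed as `openapi_spec` to
`FastMCP.from_openapi(...)` in place of the raw spec from Kaxil's snippet.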

However, it may be better to create something more tailored to LLMs.
Thinking about the use case of getting LLM assistance with debugging a
failed run, one of the things my teams do is put the "run book" for prod
support in the doc_md notes right with the DAG, so if a file never shows up
they know exactly what to do in that situation (potentially, do nothing).
We also include other information like, "xx task can be flaky. If you get
this error, rerunning it will usually resolve it." The goal is for any
engineer armed with the stack trace and the run book to be able to solve
any error. My team has all that information right in the UI. To get that
information, the LLM would need to know to hit the DAG Details endpoint
just for doc_md, one minor attribute among several, and then get the correct
dag id, run id, task id and try number to grab the stack trace from the
failed run. It would then need to go elsewhere to find the DAG code to debug. I
think it would be better to just create a "debug_failed_task" tool an LLM
could call from an MCP server that would string those calls together and
serve them up to the LLM on a silver platter. The LLM could focus all its
"reasoning" efforts on solving the problem instead of figuring out how to
get the information it needs to even begin.
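
A rough sketch of what such a "debug_failed_task" tool could look like.
Everything here is an assumption for illustration: `fetch` is a
hypothetical callable that takes an API path and returns the parsed
response, and the three paths are only modeled on the Airflow REST API, so
they should be checked against the spec of the Airflow version in use.

```python
def debug_failed_task(fetch, dag_id, run_id, task_id, try_number):
    """Collect what is needed to debug one failed task into a single dict.

    `fetch` is a hypothetical callable: path in, parsed response out.
    """
    # ASSUMPTION: endpoint paths approximate the Airflow REST API.
    details = fetch(f"/dags/{dag_id}/details")
    log = fetch(
        f"/dags/{dag_id}/dagRuns/{run_id}/taskInstances/{task_id}/logs/{try_number}"
    )
    source = fetch(f"/dagSources/{dag_id}")

    return {
        "dag_id": dag_id,
        "run_id": run_id,
        "task_id": task_id,
        "try_number": try_number,
        # The "run book" the team keeps in doc_md, if there is one.
        "run_book": details.get("doc_md") or "(no doc_md set on this DAG)",
        "log": log,            # stack trace of the failed try
        "dag_source": source,  # DAG code to debug against
    }
```

Registered as a single MCP tool, this would hand the LLM the run book, the
stack trace and the DAG code in one call instead of three separate lookups.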

Again, if we just want to wrap the API in FastMCP, we can share Kaxil's 10
lines of code in a Medium article and be done. I think the real value is in
providing an implementation of a limited set of more complex base tools
like debug_failed_task (described above), pause_all_active_DAGs (because
I'm about to upgrade!), describe_DAG (grabs only the description,
dependencies, converts cron schedule to human readable if applicable, etc)
and giving people a way to extend the server.
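
As an illustration of one of these base tools, here is a minimal sketch of
pause_all_active_DAGs. The two callables are hypothetical stand-ins for
whatever client the server would use, and the `dag_id` / `is_paused` keys
are assumed to match the fields the Airflow API returns for a DAG.

```python
def pause_all_active_dags(list_dags, set_paused):
    """Pause every DAG that is currently unpaused.

    `list_dags()` is assumed to return an iterable of dicts with "dag_id"
    and "is_paused"; `set_paused(dag_id, paused)` is assumed to apply the
    change. Both are hypothetical. Returns the ids that were paused by
    this call.
    """
    paused_now = []
    for dag in list_dags():
        if not dag.get("is_paused", False):
            set_paused(dag["dag_id"], True)
            paused_now.append(dag["dag_id"])
    return paused_now
```

Returning the list of ids that were changed means the same DAGs, and only
those, can be unpaused again once the upgrade is done.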

The above is tool focused. As Avi pointed out, there are also resources and
prompts, but I've only personally worked with tools and have nothing to add
there.

With all the LLM tools quickly advancing on the development side (e.g. code
generation/review), it's great to see the community working on building
tools to help with the operational side.

Bryan


On Thu, May 29, 2025, 4:50 PM Kaxil Naik <kaxiln...@gmail.com> wrote:

> One more comment: MCP SDKs have advanced quite a bit and I was able to get
> an Airflow MCP Server working with just the following code block. I was
> successfully able to pause/unpause a DAG from Claude and other MCP clients
> as an example. So as much as possible we should utilize higher level
> abstractions like FastMCP, which allows creating a server from an OpenAPI spec
> <https://gofastmcp.com/servers/openapi#openapi-integration>:
>
>     import os
>
>     import httpx
>     from fastmcp import FastMCP
>
>     token = os.environ.get("AF_ACCESS_TOKEN")
>     client = httpx.AsyncClient(
>         base_url="http://localhost:28080",
>         headers={"Authorization": f"Bearer {token}"},
>     )
>
>     openapi_spec = httpx.get("http://localhost:28080/openapi.json").json()
>
>     mcp = FastMCP.from_openapi(
>         openapi_spec=openapi_spec,
>         client=client,
>         name="Airflow 3.0 API Server"
>     )
>
>     if __name__ == "__main__":
>         mcp.run()
>
>
>
> On Thu, 29 May 2025 at 20:32, Avi <a...@astronomer.io.invalid> wrote:
>
> > @Shahar -- Yes. Definitely. Feel free to reach out if you need anything.
> >
> > I totally agree that it should live as a separate repo.
> >
> > - Avi
> >
> > On Thu, May 29, 2025 at 12:50 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > > @Shahar -- Absolutely, I think you are driving it with this email. So I
> > > think you can lead it from here and whoever wants to join can co-lead
> or
> > > join in development.
> > >
> > > Please feel free to drive :)
> > >
> > > On Thu, 29 May 2025 at 17:07, Aaron Dantley <aarondant...@gmail.com>
> > > wrote:
> > >
> > > > Hey All!
> > > >
> > > > I’d be grateful to be included in the AIP discussions to help if
> > possible
> > > > too! Like Shahar, I’ve never worked on any of these items so it’d be
> > > great
> > > > to see how work gets assigned and goes through a whole development
> > cycle!
> > > >
> > > > Looking forward to it!
> > > > Aaron
> > > >
> > > > On Thu, May 29, 2025 at 7:32 AM Shahar Epstein <sha...@apache.org>
> > > wrote:
> > > >
> > > > > If it's ok, I would like to lead the AIP effort (or at least
> > co-lead),
> > > as
> > > > > I've never written an AIP before. I could start drafting it during
> > the
> > > > next
> > > > > week.
> > > > > Avi - please let me know if it works for you.
> > > > >
> > > > >
> > > > > Shahar
> > > > >
> > > > >
> > > > > On Thu, May 29, 2025, 13:09 Kaxil Naik <kaxiln...@gmail.com>
> wrote:
> > > > >
> > > > > > Yes separate repo, please and we would need someone to lead this
> > > effort
> > > > > on
> > > > > > the proposal & development too. Avi - you are probably well
> > equipped
> > > to
> > > > > > > lead it and I am sure more folks like Aaron would be eager to
> work
> > > on
> > > > > its
> > > > > > development and on-going maintenance.
> > > > > >
> > > > > > Regards,
> > > > > > Kaxil
> > > > > >
> > > > > > On Thu, 29 May 2025 at 15:25, Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > > > > >
> > > > > > > Yep. Having MCP is cool and drawing our implementation from
> > > > experiences
> > > > > > and
> > > > > > > usage of other MCP servers out there is even cooler (especially
> > > that
> > > > we
> > > > > > can
> > > > > > > have some insights how people already use them with Airflow) -
> if
> > > we
> > > > > can
> > > > > > > bring together a few of those, put some nice, relevant Airflow
> > > > prompts.
> > > > > > > Ideally we could have some examples of how MCP can be used
> taken
> > > from
> > > > > > those
> > > > > > > who are using airflow (the debugging example by Avi is cool)
> > > > > > >
> > > > > > > I am not sure implementing it as provider is really "the way"
> > > though
> > > > -
> > > > > I
> > > > > > > > would rather see `apache-airflow-mcp` separate repo - it's so
> > > > different
> > > > > > and
> > > > > > > distinct from airflow it does not really require any of Airflow
> > > > > internals
> > > > > > > and code to be implemented - it makes very little sense to be
> the
> > > > part
> > > > > of
> > > > > > > airflow "workspace" where we would develop it together with
> > > airflow -
> > > > > > > because if it will talk over the REST api, all we need is the
> > > > `client`
> > > > > > that
> > > > > > > might be just a dependency. And there is even no reason for MCP
> > and
> > > > > > airflow
> > > > > > > to be installed and developed together (that's the main reason
> > why
> > > we
> > > > > > want
> > > > > > > > providers to be kept in monorepo).
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 29, 2025 at 8:37 AM Amogh Desai <
> > > > amoghdesai....@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Seems like a promising area to invest in given the benefits
> it
> > > can
> > > > > > > provide
> > > > > > > > to
> > > > > > > > the users as mentioned by Shahar and Abhishek.
> > > > > > > >
> > > > > > > > > Abhishek also has a promising talk submitted which I am looking
> > > > > > > > > forward to this year at the summit.
> > > > > > > >
> > > > > > > > In any case, this seems to be one of the first of the very
> few
> > > > > > > > implementations of trying
> > > > > > > > to integrate Airflow officially / unofficially with an MCP
> > > server.
> > > > > > > >
> > > > > > > > Thanks & Regards,
> > > > > > > > Amogh Desai
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, May 29, 2025 at 2:56 AM Aaron Dantley <
> > > > > aarondant...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey!
> > > > > > > > >
> > > > > > > > > I also think this is a great idea!
> > > > > > > > >
> > > > > > > > > Would it be possible to be included in the development
> > process?
> > > > > > > > >
> > > > > > > > > Sorry I’m new to this group, but would appreciate any
> > > suggestions
> > > > > on
> > > > > > > how
> > > > > > > > to
> > > > > > > > > contribute to the MCP server development!
> > > > > > > > >
> > > > > > > > > Regards!
> > > > > > > > > Aaron
> > > > > > > > >
> > > > > > > > > On Wed, May 28, 2025 at 2:57 PM Avi
> > <a...@astronomer.io.invalid
> > > >
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Nice to see the idea to incorporate an official MCP
> server
> > > for
> > > > > > > > > > Airflow. It's been really magical to see what a simple
> LLM
> > > can
> > > > do
> > > > > > > with
> > > > > > > > an
> > > > > > > > > > Airflow MCP server built just from APIs.
> > > > > > > > > >
> > > > > > > > > > A few things that I noticed in my experience:
> > > > > > > > > > > - The number of tools that the OpenAPI spec generates is
> > > > > > > > > > > quite huge. Most tools (*Claude, VS Code with GitHub
> > > > > > > > > > > Copilot, Cursor, Windsurf*) that use mcp-client limit it
> > > > > > > > > > > to 100 tools. (*The read-only mode creates fewer tools in
> > > > > > > > > > > comparison*.)
> > > > > > > > > > > - MCP servers are not just tools. There are other things
> as
> > > > well,
> > > > > > like
> > > > > > > > > > resources and prompts. Prompts are super helpful in case
> of
> > > > > > debugging
> > > > > > > > for
> > > > > > > > > > example. It is a way of teaching LLM about Airflow. Say I
> > > want
> > > > to
> > > > > > > have
> > > > > > > > a
> > > > > > > > > > failing task investigated. A prompt can be helpful in
> > letting
> > > > LLM
> > > > > > > know
> > > > > > > > a
> > > > > > > > > > step-by-step process of carrying out the investigation.
> > > > > > > > > > - Where do you run the MCP server? I wouldn't want my
> > laptop
> > > to
> > > > > do
> > > > > > > the
> > > > > > > > > > heavy processing, which would want us to go for the SSE
> > > instead
> > > > > of
> > > > > > > > stdio.
> > > > > > > > > >
> > > > > > > > > > > This is why I chose two different paths of using mcp
> server
> > > with
> > > > > > > > airflow,
> > > > > > > > > > which I intend to talk about at the summit.
> > > > > > > > > >
> > > > > > > > > > 1. AI-Augmented Airflow - This helped me add a chat
> > interface
> > > > > > inside
> > > > > > > > > > Airflow using a plugin to talk to an Airflow instance
> (read
> > > > only
> > > > > > > mode).
> > > > > > > > > >
> > > > > > > > > > 2. Airflow-Powered AI - Experimenting with this has been
> > > > totally
> > > > > > > > magical,
> > > > > > > > > > how powerful AI can become when it has access to airflow.
> > > > Also, a
> > > > > > > > > directory
> > > > > > > > > > structure to maintain the DAGs, and it can write DAGs on
> > the
> > > > > fly. I
> > > > > > > > > totally
> > > > > > > > > > see a need where LLMs eventually will need a scheduler,
> > > > although
> > > > > a
> > > > > > > > > complete
> > > > > > > > > > airflow just for an LLM might seem a bit overkill to the
> > rest
> > > > of
> > > > > > the
> > > > > > > > > > community.
> > > > > > > > > >
> > > > > > > > > > > I chose to build this on top of OpenAPI because that
> > was
> > > > the
> > > > > > only
> > > > > > > > way
> > > > > > > > > > to get proper RBAC enabled.
> > > > > > > > > >
> > > > > > > > > > I have so many points to discuss. Would love to hear from
> > the
> > > > > > > community
> > > > > > > > > and
> > > > > > > > > > then take it forward.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Avi
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, May 28, 2025 at 6:32 PM Aritra Basu <
> > > > > > > aritrabasu1...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I definitely think there's potential to interact with
> an
> > > > > airflow
> > > > > > > MCP
> > > > > > > > > > > server. Though I think I'd be interested to see how
> many
> > > and
> > > > > how
> > > > > > > > > > frequently
> > > > > > > > > > > people are making use of MCP servers in the wild before
> > > > > investing
> > > > > > > > > effort
> > > > > > > > > > in
> > > > > > > > > > > building and maintaining one for airflow. I'm sure the
> > data
> > > > is
> > > > > > > > > available
> > > > > > > > > > > out there, just needs finding.
> > > > > > > > > > > --
> > > > > > > > > > > Regards,
> > > > > > > > > > > Aritra Basu
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 28 May 2025, 11:18 pm Julian LaNeve,
> > > > > > > > > > <jul...@astronomer.io.invalid
> > > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > I think this would be interesting now that the Streamable HTTP spec
> > > > > > > > > > > > > <https://modelcontextprotocol.io/specification/2025-03-26/basic/transports>
> > > > > > > > > > > > > is out. I think in theory we could publish this first as an Airflow
> > > > > > > > > > > > > provider that installs a plugin to expose an MCP endpoint, as a PoC - this
> > > > > > > > > > > > > becomes a much nicer experience than a local stdio one.
> > > > > > > > > > > > --
> > > > > > > > > > > > Julian LaNeve
> > > > > > > > > > > > CTO
> > > > > > > > > > > >
> > > > > > > > > > > > Email: jul...@astronomer.io
> > > > > > > > > > > >  <mailto:jul...@astronomer.io>Mobile: 330 509 5792
> > > > > > > > > > > >
> > > > > > > > > > > > > On May 28, 2025, at 1:25 PM, Shahar Epstein <
> > > > > > sha...@apache.org
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Dear community,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Following the thread on Slack [1], initiated by
> Jason
> > > > > > Sebastian
> > > > > > > > > > Kusuma,
> > > > > > > > > > > > I'd
> > > > > > > > > > > > > like to start an effort to officially support MCP
> in
> > > > > > Airflow's
> > > > > > > > > > > codebase.
> > > > > > > > > > > > >
> > > > > > > > > > > > > *Some background *
> > > > > > > > > > > > > Model Context Protocol (MCP) is an open standard,
> > > > > open-source
> > > > > > > > > > framework
> > > > > > > > > > > > > > that standardizes the way AI models like LLMs
> > integrate
> > > > and
> > > > > > > share
> > > > > > > > > data
> > > > > > > > > > > > with
> > > > > > > > > > > > > external tools, systems and data sources. Think of
> it
> > > as
> > > > a
> > > > > > > "USB-C
> > > > > > > > > for
> > > > > > > > > > > > AI" -
> > > > > > > > > > > > > a universal connector that simplifies and
> > standardizes
> > > AI
> > > > > > > > > > > integrations. A
> > > > > > > > > > > > > notable example of an MCP server is GitHub's
> official
> > > > > > > > > implementation
> > > > > > > > > > > > [3], which
> > > > > > > > > > > > > allows LLMs such as Claude, Copilot, and OpenAI
> (or:
> > > "MCP
> > > > > > > > clients")
> > > > > > > > > > to
> > > > > > > > > > > > > fetch pull request details, analyze code changes,
> and
> > > > > > generate
> > > > > > > > > review
> > > > > > > > > > > > > summaries.
> > > > > > > > > > > > >
> > > > > > > > > > > > > *How could an MCP server be useful in Airflow?*
> > > > > > > > > > > > > Imagine the possibilities when LLMs can seamlessly
> > > > interact
> > > > > > > with
> > > > > > > > > > > > Airflow’s
> > > > > > > > > > > > > API: triggering DAGs using natural language,
> > retrieving
> > > > DAG
> > > > > > run
> > > > > > > > > > > history,
> > > > > > > > > > > > > enabling smart debugging, and more. This kind of
> > > > > integration
> > > > > > > > opens
> > > > > > > > > > the
> > > > > > > > > > > > door
> > > > > > > > > > > > > to a more intuitive, conversational interface for
> > > > workflow
> > > > > > > > > > > orchestration.
> > > > > > > > > > > > >
> > > > > > > > > > > > > *Why do we need to support it officially?*
> > > > > > > > > > > > > Quid pro quo - LLMs become an integral part of the
> > > modern
> > > > > > > > > development
> > > > > > > > > > > > > experience, while Airflow evolves into the go-to
> for
> > > > > > > > orchestrating
> > > > > > > > > AI
> > > > > > > > > > > > > workflows. By officially supporting it, we’ll
> enable
> > > > > multiple
> > > > > > > > users
> > > > > > > > > > to
> > > > > > > > > > > > > interact with Airflow through their LLMs,
> > streamlining
> > > > > > > automation
> > > > > > > > > and
> > > > > > > > > > > > > improving accessibility across diverse workflows.
> All
> > > of
> > > > > that
> > > > > > > is
> > > > > > > > > > viable
> > > > > > > > > > > > > with relatively small development effort (see next
> > > > > > paragraph).
> > > > > > > > > > > > >
> > > > > > > > > > > > > *How should it be implemented?*
> > > > > > > > > > > > > As of today, there have been several
> implementations
> > of
> > > > MCP
> > > > > > > > servers
> > > > > > > > > > for
> > > > > > > > > > > > > Airflow API, the most visible one [4] made by
> > Abhishek
> > > > > Bhakat
> > > > > > > > from
> > > > > > > > > > > > > Astronomer.
> > > > > > > > > > > > > The efforts of implementing it and maintaining it
> in
> > > our
> > > > > > > codebase
> > > > > > > > > > > > shouldn't
> > > > > > > > > > > > > be too cumbersome (at least in theory), as we could
> > > > utilize
> > > > > > > > > packages
> > > > > > > > > > > like
> > > > > > > > > > > > > fastmcp to auto-generate the server using the
> > existing
> > > > > > OpenAPI
> > > > > > > > > specs.
> > > > > > > > > > > I'd
> > > > > > > > > > > > > be very happy if Abhishek could share his
> experience
> > in
> > > > > this
> > > > > > > > > thread.
> > > > > > > > > > > > >
> > > > > > > > > > > > > *Where else could we utilize MCP?*
> > > > > > > > > > > > > Beyond the scope of the public API, I could also
> > > imagine
> > > > > > using
> > > > > > > it
> > > > > > > > > to
> > > > > > > > > > > > > communicate with Breeze.
> > > > > > > > > > > > >
> > > > > > > > > > > > > *How do we proceed from here?*
> > > > > > > > > > > > > Feel free to share your thoughts here in this
> > > discussion.
> > > > > > > > > > > > > If there are no objections, I'll be happy to start
> > > > working
> > > > > on
> > > > > > > an
> > > > > > > > > AIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sincerely,
> > > > > > > > > > > > > Shahar Epstein
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > *References:*
> > > > > > > > > > > > > > [1] Slack discussion,
> > > > > > > > > > > > > > https://apache-airflow.slack.com/archives/C06K9Q5G2UA/p1746121916951569
> > > > > > > > > > > > > > [2] Introducing the model context protocol,
> > > > > > > > > > > > > > https://www.anthropic.com/news/model-context-protocol
> > > > > > > > > > > > > > [3] GitHub Official MCP server,
> > > > > > > > > > > > > > https://github.com/github/github-mcp-server
> > > > > > > > > > > > > > [4] Unofficial MCP Server made by Abhishek Bhakat,
> > > > > > > > > > > > > > https://github.com/abhishekbhakat/airflow-mcp-server
