At first glance I tend to agree with Jens and Niko.

I understand the request, but I agree that this addresses CI/CD and testing
issues that should probably remain outside Airflow.

On Mon, Apr 27, 2026 at 7:43 PM Oliveira, Niko <[email protected]> wrote:

> Hey folks!
>
> > P.S. In my opinion, what can be done in/around git, should be done
> > there. Recreation of CI/CD in any form inside of Airflow itself is
> > something which should not be done.
>
> I'm glad we agree on this :) I suppose we just disagree on what is
> possible outside of Airflow :p
>
> But at this point I will bow out of the conversation and let others weigh
> in. I'm not fully convinced any of these requested behaviours require
> changes to Airflow (I think that's just masking some dev ops work). But
> also I'm not completely opposed to the change either, I'm more on the
> fence, so if others love the feature by all means implement it! :)
>
> Cheers,
> Niko
> ________________________________
> From: Przemysław Mirowski <[email protected]>
> Sent: Thursday, April 23, 2026 3:06 PM
> To: [email protected] <[email protected]>
> Subject: RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating
> (Building on AIP-63)
>
> CAUTION: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
>
> Hi,
>
> I think that CI/CD and version pinning are two somewhat different things
> here. In use cases involving critical systems, a Dag switching to the
> latest version without any way to determine exactly when that will happen
> (CI/CD will take some amount of time to deploy the change, and the same
> goes for Dag Processor parsing time) is rather hard to manage, and in some
> systems it can make deploying changes harder and less safe. Of course, the
> ideal solution would be a proper non-prod environment that is fully
> representative of production (in some cases exposing non-prod to prod
> data/traffic/etc. is simply not an option, e.g. for security reasons), but
> that is not always possible for various reasons such as costs, licenses,
> space and/or vendors. I agree especially with point 5 of Piyush's latest
> message. With the above in mind, I think that version pinning would be a
> nice addition to the Dag Versioning feature, on the assumption that it is
> for critical Airflow Dags where full control over when the Dag version
> changes is required (maybe there is also another way to achieve that).
>
> P.S. In my opinion, what can be done in/around git, should be done there.
> Recreation of CI/CD in any form inside of Airflow itself is something which
> should not be done.
> ________________________________
> From: Oliveira, Niko <[email protected]>
> Sent: 23 April 2026 01:50
> To: [email protected] <[email protected]>
> Subject: Re: [DISCUSS] DAG Version Pinning for Deployment Gating (Building
> on AIP-63)
>
> Hey Piyush,
>
> Thanks for your reply, I do love how clearly it is written and I see
> exactly the problem you're trying to solve!
>
> I'm still just not convinced this needs to be done in Airflow, at least
> not with a first class feature. As interesting as I think your microservice
> analogy is, Airflow is not a microservice component, it is a (very, very)
> fancy cron scheduler. And I'm not sure the complexity is worth the use
> case, since any new code added to Airflow must be maintained by this
> community and we must be cautious that any new piece serves enough use
> cases/users to make it worth it.
> To me this should either be managed outside of an individual Airflow
> environment e.g. you have an entirely separate staging/gamma/dev Airflow
> environment, which is exposed to some level of production traffic (to
> borrow your microservice analogy) until it can graduate to the production
> environment. And if you really need on the fly toggling of a version, as
> you say, Airflow does this quite responsively, if you deploy a new version
> of your dags it will parse and start using that new version immediately
> (the problem you're trying to solve can be a benefit here). You can even
> have multiple versions of your dags deployed at once and use configuration
> to control which dag directory Airflow reads from (or move/symlink Dags in
> and out of the Dags directory as needed from a known good or pinned
> source). Or use variables or some other parameter store to control other
> pieces of runtime behaviour inside the Dags themselves. Between CI/CD, dev
> ops and making use of existing Airflow primitives I think you can achieve
> what you're looking for.
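
To make the symlink idea above concrete, here is a minimal sketch of switching an "active" Dags directory between releases. It assumes a POSIX filesystem and that the Dags folder Airflow reads is itself a symlink; the function name `activate_release` and the directory layout are hypothetical and not part of Airflow.

```python
import os
from pathlib import Path


def activate_release(release_dir: Path, active_link: Path) -> None:
    """Point active_link at release_dir (hypothetical helper, not an Airflow API).

    The new symlink is created under a temporary name and then renamed over
    the old one, so readers never observe a missing link (POSIX rename).
    """
    tmp_link = active_link.with_name(active_link.name + ".tmp")
    # Clean up a leftover temporary link from an interrupted earlier switch.
    if os.path.lexists(tmp_link):
        tmp_link.unlink()
    os.symlink(release_dir, tmp_link, target_is_directory=True)
    # os.replace renames the symlink itself, replacing any existing link.
    os.replace(tmp_link, active_link)
```

Rolling back is then the same call with the previous release directory. How quickly Airflow picks up the switch still depends on the Dag Processor's parsing cycle.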
>
> But as always, this is open and community based software, so I'm happy to
> disagree and commit if the rest of the community thinks this is a valuable
> feature :)
>
> Cheers,
> Niko
> ________________________________
> From: Piyush Maheshwari <[email protected]>
> Sent: Tuesday, April 21, 2026 10:46 PM
> To: [email protected] <[email protected]>
> Subject: RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating
> (Building on AIP-63)
>
> Hi Ephraim, Jarek, Jens, and Niko,
>
> Thank you for the candid feedback. I want to clarify a few things, as I
> completely agree with Jens and Niko that "testing in production" is an
> anti-pattern. That is absolutely not the intention here.
>
> 1. I view this as bringing standard microservice-like deployment maturity
> to DAGs.
> Before service deployments in our org, code is tested locally, in a dev
> environment, and via strict unit/e2e integration tests before it ever makes
> it to main. But even after merging and passing those CI pipelines, we still
> use load tests, pre-prod soak times, shadow traffic, and gated production
> rollouts with automated rollback triggers. Having deployment gates for the
> production environment doesn't mean the pre-merge checks weren't strict or
> that the change wasn't tested beforehand -- it just allows us to place
> additional safety gates for the code to take effect, exactly like in the
> service world.
>
> 2. The core issue we are trying to solve is that Airflow currently
> inseparably links Code Distribution (a file arriving on the dag-processor
> and being parsed) with Release Activation (the scheduler executing that
> code).
> To extend the microservices analogy, I can think of the DAG processor
> parsing all files as "building the artifact(s)," while the scheduler and
> executor acting on the DAG versions created thereafter as "deploying" or
> running the changed code.
> We simply want to decouple the build from the deployment. This does not
> mean that the code arriving on the dag-processor will be tested for the
> first time straight in production. It should've already passed a set of
> checks in the CI pipeline.
>
> 3. It is also worth calling out that Airflow already supports this
> decoupled behavior at the run level for task re-runs and mid-execution DAG
> version bumps (by pinning the version for the rest of the execution or the
> rerun). We are simply trying to expose this existing capability at the DAG
> level so users can govern which version new scheduled runs are created
> with.
>
> 4. I also agree that Airflow itself should not be aware of our CI/CD
> pipeline, nor would it manage the deployment orchestration or testing.
> For our requirements, I just need Airflow to expose APIs to deploy (pin) a
> DAG version, and to remove the pin (to restore/enable the default
> "auto-deploy latest" behavior).
> Beyond that, we intend to use an external release orchestrator that can
> explicitly tell Airflow when a parsed version is actually allowed to run.
> Until that API call is made, the previously pinned version remains active.
> This ensures we don't introduce assumptions or awareness of the presence of
> any external gating mechanisms to Airflow.
> Also note that the intention is to keep the default auto-deploy behavior
> unless a user (or a system on their behalf) explicitly asks Airflow to pin
> a DAG to a specific version.
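
The pin/unpin semantics described in point 4 can be sketched roughly as below. This is only an illustration under stated assumptions: `DagState` and its methods are hypothetical names, not Airflow's DagModel or any existing Airflow API, and versions are modelled as plain integers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DagState:
    """Hypothetical stand-in for one Dag's version bookkeeping."""

    parsed_versions: List[int] = field(default_factory=list)
    pinned_version: Optional[int] = None  # None means "auto-deploy latest"

    def register_parsed(self, version: int) -> None:
        # Parsing keeps registering new versions whether or not a pin is set.
        self.parsed_versions.append(version)

    def pin(self, version: int) -> None:
        # Only a version that has already been parsed can be pinned.
        if version not in self.parsed_versions:
            raise ValueError(f"version {version} has not been parsed")
        self.pinned_version = version

    def unpin(self) -> None:
        # Removing the pin restores the default behaviour.
        self.pinned_version = None

    def version_for_new_run(self) -> int:
        # The pinned version wins; otherwise use the latest parsed version.
        if not self.parsed_versions:
            raise LookupError("no parsed versions available")
        if self.pinned_version is not None:
            return self.pinned_version
        return self.parsed_versions[-1]
```

In this sketch an external release orchestrator would call `pin` and `unpin`, while the lookup for new runs stays unaware of any gating logic.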
>
> 5. Most importantly, this feature provides an incident response "rollback"
> behavior. If a bad DAG version slips through CI/CD into production, either
> an on-call engineer or a rollback-trigger (airflow-external) can instantly
> roll back to the previous pinned version via the API/UI to mitigate.
> Without this, users have to revert the code in Git and wait for the entire
> CI/CD pipeline and file-sync process to run, which is often too slow during
> an outage.
>
> 6. Jarek - You are right, database schema changes can be discussed later.
> My intention was only to share a very brief summary of how I deemed it to
> be technically feasible for early feedback. I did briefly share the
> high-level use cases ("Safe Deployment Gating" and "Instant Rollbacks") in
> the original mail, but I completely agree that aligning on the UX first
> would be a good next step.
>
> If there are no major remaining concerns after this response, I can draft
> and share an AIP to detail the UX, followed by a high-level proposal,
> caveats and next steps.
>
> Thanks for your time.
> Regards,
> Piyush
>
> On Tue, Apr 21, 2026 at 5:59 PM Oliveira, Niko <[email protected]>
> wrote:
>
> > I am with Jens on this one. I think we're complicating Airflow to get
> > around a bad practice. If stability of your Dags is critical and they are
> > highly versioned, then I think, as Jens suggested, you should run them
> > through a pipeline that first deploys them to a dev or gamma environment
> > which verifies that the quality of the Dags is what you expect. If
> > something slips through, then it's just normal software practice: either
> > reverting and rolling back, or rolling forward with a fix pushed through
> > the pipeline. I don't think Airflow should be aware of that process or
> > opinionated about it.
> >
> > Cheers,
> > Niko
> > ------------------------------
> > *From:* Jens Scheffler <[email protected]>
> > *Sent:* Monday, April 20, 2026 11:17 AM
> > *To:* [email protected] <[email protected]>
> > *Subject:* RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating
> > (Building on AIP-63)
> >
> > Hi,
> >
> > I am still quite sceptical. Yes, if such pinning is made, then per Dag a
> > change needs to be possible via UI and API. But I still see it as
> > chicken-and-egg - so you want to run a pinned version, but then how do
> > you test the changes (w/o moving a version pin)? Then again some test
> > mode is needed, or per run you need to make a "test run" with another
> > version. Smells a bit like mis-using a production system for testing.
> >
> > On the other hand, yes, if all Dags share the same Git repo then merging
> > a branch into some other will switch all Dags at the same time. Still, you
> > could utilize standard Git tools and cherry-pick individual changes, so
> > there is no need to always make a full rollout. At least 80% is possible
> > with standard CI/CD tools and Git.
> >
> > TLDR I see the danger that instead of a proper CI/CD and test system
> > such a feature might feel like you can easily test on a production
> > system. Effectively, you would need to be able to start a Dag with any
> > version, and also to jump back to an earlier one as a reversion. Even
> > though, yes, agreed, all of this is technically possible.
> >
> > Jens
> >
> > On 20.04.26 16:40, Jarek Potiuk wrote:
> > > +1 to what Ephraim wrote. I think that was a natural next step we
> > > discussed, but it needs significant refinement, starting with the
> > > actual use cases it should serve and the UX for user interaction. I
> > > think related database changes are pretty secondary. Use cases cover
> > > runs, re-runs, backfills, CI testing, rollbacks, etc. Following the
> > > "documentation first" approach discussed in a separate thread,
> > > describing the context and intention of what we want to achieve is much
> > > more important than DB schema changes. Once we know which use cases we
> > > want to serve, the DB schema changes and other related items will
> > > emerge naturally.
> > >
> > > On Mon, Apr 20, 2026 at 3:15 PM Ephraim Anierobi <[email protected]>
> > > wrote:
> > >
> > >> Hi Piyush, thanks for starting this discussion.
> > >>
> > >> I like the proposal. We can introduce an active execution version for
> > >> "versioned bundles" and make the scheduler/API resolve through it. The
> > >> hard part of this is making Airflow able to distinguish the latest
> > >> parsed dagmodel's metadata from active scheduling metadata. I suggest
> > >> you draft this in a Google Doc and share it for further discussion.
> > >>
> > >> Regards
> > >> - Ephraim
> > >>
> > >> On Mon, 20 Apr 2026 at 01:31, Piyush Maheshwari <[email protected]>
> > >> wrote:
> > >>
> > >>> Thanks for sharing your thoughts, Jens.
> > >>>
> > >>>> be able to test it? … a QA/Testing environment to be able to
> > >>>> sign-off changes.
> > >>> Yes, we've built an isolated Airflow environment to run regression
> > >>> checks before promoting to production.
> > >>>
> > >>> As you suggested, we're already running both generic and DAG-custom
> > >>> static checks in a CI job as a required step to merge to the main
> > >>> branch.
> > >>>
> > >>>> But then the "main" branch might be best suited if
> > >>>> implemented on the test system
> > >>> In this case, problematic commits on "main" can choke other unrelated
> > >>> changes.
> > >>> So the other option would be to revert the problematic commits and
> > >>> deploy forward.
> > >>>
> > >>> However, a key limitation that remains with this approach is that a
> > >>> commit affecting multiple DAGs goes live for either all DAGs or none.
> > >>>
> > >>> The second important feature we get with this is instant DAG-level
> > >>> rollback without waiting for a revert commit to merge and be picked
> > >>> up by Airflow.
> > >>>
> > >>> I think DAG-level version pinning can also unlock a lot of
> > >>> flexibility for deployments, including tiered rollouts, auto-rollback
> > >>> triggers, timed deployment windows and so on.
> > >>>
> > >>> Looking forward to hearing your thoughts.
> > >>> Regards,
> > >>> Piyush
> > >>>
> > >>> On Sun, 19 Apr 2026 at 3:12 PM, Jens Scheffler <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Thanks Piyush for dropping the discussion!
> > >>>>
> > >>>> I think in general QA processes are important and a valid use case.
> > >>>> So a kind of pinning Dag versions really is important.
> > >>>>
> > >>>> Thinking about it, if you pin the version ... how would you then be
> > >>>> able to test it? I assume you would need (and should have or invest
> > >>>> into) a QA/Testing environment to be able to sign-off changes. Both
> > >>>> in infrastructure but also for Dag changes.
> > >>>>
> > >>>> If you are changing Dags, first of all static checks on Dag code are
> > >>>> very much recommended, and you can also implement tests for your
> > >>>> Dags and logic. As with software, a CI/CD system will be a good
> > >>>> setup. Besides that, Dag changes also include logical changes that
> > >>>> mostly can only be tested in a live system and not with static
> > >>>> checks.
> > >>>>
> > >>>> Have you considered using Git and a set of branches for implementing
> > >>>> such staging? E.g. you have a git repo and you plan to make changes.
> > >>>> Then you would open a PR for the change and merge it to the "main"
> > >>>> branch - and there in your CI/CD you can run all sorts of static
> > >>>> checks and tests. But then the "main" branch might be best suited if
> > >>>> implemented on the test system. Once you validate the changes
> > >>>> end-to-end you could make another PR, for example to a "prod"
> > >>>> branch. And if your production system is only pulling Dags from the
> > >>>> "prod" branch then you can have this merging strategy as a staging
> > >>>> setup.
> > >>>>
> > >>>> Would this resolve your PIN problem? Or which other detail in the
> > >>>> use case would require a PIN on top of a staging strategy?
> > >>>>
> > >>>> Jens
> > >>>>
> > >>>> P.S.: I have enabled your Confluence account, which had been
> > >>>> created, so that you can write to Confluence. Sorry, it's a typical
> > >>>> pitfall: after account creation, permissions were not set. Now it
> > >>>> should work. Let me know if not.
> > >>>>
> > >>>> On 19.04.26 01:40, Piyush Maheshwari wrote:
> > >>>>> Hi everyone,
> > >>>>> I'm a new contributor to Airflow. I'd like to propose a new feature
> > >>>>> for Airflow: DAG Version Pinning.
> > >>>>> Building on the foundation introduced by AIP-63: DAG Versioning (
> > >>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-63%3A+DAG+Versioning
> > >>>>> ), this proposal aims to extend Airflow's capabilities to support
> > >>>>> true continuous deployment (CD) gating and safer release cycles.
> > >>>>> The Problem & Use Cases
> > >>>>> Currently, the scheduler always creates DagRuns using the latest
> > >>>>> parsed DagVersion. This means that the updated DAG code is deployed
> > >>>>> (takes effect) right after the dag-processor processes it. While
> > >>>>> this is great for rapid development, teams running business-critical
> > >>>>> pipelines often need stricter deployment mechanisms. Specifically:
> > >>>>>     * Safe Deployment Gating: The ability to pin a DAG to its last
> > >>>>>       known stable version while new code is parsed in the
> > >>>>>       background. This allows the new version to be held back until
> > >>>>>       it passes automated regression tests or receives explicit
> > >>>>>       manual approval.
> > >>>>>     * Instant Rollbacks: If an issue is detected in a newly promoted
> > >>>>>       DAG version, users need the capability to instantly roll back
> > >>>>>       to a previous version via the UI/API, without having to revert
> > >>>>>       the underlying code and wait for the repository sync and DAG
> > >>>>>       processing cycle.
> > >>>>> High-Level Proposed Solution
> > >>>>> Introduce an optional active_dag_version_id to the DagModel. This
> > >>>>> field can be used to pin a DAG version for scheduling and execution,
> > >>>>> while the dag-processor can continue to parse and register newer DAG
> > >>>>> versions.
> > >>>>>     * When this pin is set, the scheduler and API will respect the
> > >>>>>       pinned version for creating runs and executing tasks,
> > >>>>>       separating the parsing of new code from the execution of new
> > >>>>>       code.
> > >>>>>     * If the pin is NULL, the system defaults to the current
> > >>>>>       behavior (always executing the latest parsed version). This
> > >>>>>       way, we can maintain complete backwards compatibility.
> > >>>>> I have put together some detailed notes covering the data model
> > >>>>> changes, database migrations, and edge cases with this approach. If
> > >>>>> there is general alignment that this fits the vision for Airflow, I
> > >>>>> would like to take this proposal through the formal AIP review
> > >>>>> process.
> > >>>>> But I would love to get the community's feedback on the feature and
> > >>>>> the high-level approach.
> > >>>>> I'll also need someone to grant me access to create content on the
> > >>>>> Airflow Confluence wiki.
> > >>>>> Thanks for your time!
> > >>>>> Regards,
> > >>>>> Piyush
> > >>>>>
> > >>>>
> ---------------------------------------------------------------------
> > >>>> To unsubscribe, e-mail: [email protected]
> > >>>> For additional commands, e-mail: [email protected]
> > >>>>
> > >>>>
> >
>
