At first glance I tend to agree with Jens and Niko. I understand the request, but I agree that this addresses CI/CD and testing issues that should probably remain outside Airflow.
On Mon, Apr 27, 2026 at 7:43 PM Oliveira, Niko <[email protected]> wrote:

> Hey folks!
>
> > P.S. In my opinion, what can be done in/around git, should be done there. Recreation of CI/CD in any form inside of Airflow itself is something which should not be done.
>
> I'm glad we agree on this :) I suppose we just disagree on what is possible outside of Airflow :p
>
> But at this point I will bow out of the conversation and let others weigh in. I'm not fully convinced any of these requested behaviours require changes to Airflow (I think that's just masking some dev ops work). But also I'm not completely opposed to the change either, I'm more on the fence, so if others love the feature by all means implement it! :)
>
> Cheers,
> Niko
> ________________________________
> From: Przemysław Mirowski <[email protected]>
> Sent: Thursday, April 23, 2026 3:06 PM
> To: [email protected] <[email protected]>
> Subject: RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating (Building on AIP-63)
>
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
> WARNING: This email comes from an external sender. Do not click any links or open any attachments unless you can confirm the sender's identity and are certain that the content poses no risk.
>
> Hi,
>
> I think that CI/CD and version pinning are two somewhat different things here. In use cases that involve critical systems, a Dag switching to the latest version with no way to determine exactly when that will happen (CI/CD takes a more or less variable time to deploy the change, and the same goes for Dag Processor parsing time) is hard to manage, and in some systems it can make deploying changes harder and less safe.
> Of course, the ideal solution would be a proper non-prod environment that is fully representative of production (in some cases exposing non-prod to prod data/traffic/etc. is simply not an option, e.g. for security reasons), but that is not always possible for reasons such as costs, licenses, space and/or vendors. I agree especially with point 5 of Piyush's latest message. With the above in mind, I think that version pinning would be a nice addition to the Dag Versioning feature, on the assumption that it is meant for critical Airflow Dags where full control over when a Dag's version changes is required (maybe there is also another way to achieve that).
>
> P.S. In my opinion, what can be done in/around git, should be done there. Recreation of CI/CD in any form inside of Airflow itself is something which should not be done.
> ________________________________
> From: Oliveira, Niko <[email protected]>
> Sent: 23 April 2026 01:50
> To: [email protected] <[email protected]>
> Subject: Re: [DISCUSS] DAG Version Pinning for Deployment Gating (Building on AIP-63)
>
> Hey Piyush,
>
> Thanks for your reply, I do love how clearly it is written and I see exactly the problem you're trying to solve!
>
> I'm still just not convinced this needs to be done in Airflow, at least not with a first class feature. As interesting as I think your microservice analogy is, Airflow is not a microservice component, it is a (very, very) fancy cron scheduler. And I'm not sure the complexity is worth the use case. Any new code added to Airflow must be maintained by this community, and we must be cautious that any new piece serves enough use cases/users to make it worth it.
>
> To me this should either be managed outside of an individual Airflow environment e.g.
> you have an entirely separate staging/gamma/dev Airflow environment, which is exposed to some level of production traffic (to borrow your microservice analogy) until it can graduate to the production environment. And if you really need on the fly toggling of a version, as you say, Airflow does this quite responsively, if you deploy a new version of your dags it will parse and start using that new version immediately (the problem you're trying to solve can be a benefit here). You can even have multiple versions of your dags deployed at once and use configuration to control which dag directory Airflow reads from (or move/symlink Dags in and out of the Dags directory as needed from a known good or pinned source). Or use variables or some other parameter store to control other pieces of runtime behaviour inside the Dags themselves. Between CI/CD, dev ops and making use of existing Airflow primitives I think you can achieve what you're looking for.
>
> But as always, this is open and community based software, so I'm happy to disagree and commit if the rest of the community thinks this is a valuable feature :)
>
> Cheers,
> Niko
> ________________________________
> From: Piyush Maheshwari <[email protected]>
> Sent: Tuesday, April 21, 2026 10:46 PM
> To: [email protected] <[email protected]>
> Subject: RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating (Building on AIP-63)
>
> Hi Ephraim, Jarek, Jens, and Niko,
>
> Thank you for the candid feedback.
> I want to clarify a few things, as I completely agree with Jens and Niko that "testing in production" is an anti-pattern. That is absolutely not the intention here.
>
> 1. I view this as bringing standard microservice-like deployment maturity to DAGs.
> Before service deployments in our org, code is tested locally, in a dev environment, and via strict unit/e2e integration tests before it ever makes it to main. But even after merging and passing those CI pipelines, we still use load tests, pre-prod soak times, shadow traffic, and gated production rollouts with automated rollback triggers. Having deployment gates for the production environment doesn't mean the pre-merge checks weren't strict or that the change wasn't tested beforehand -- it just allows us to place additional safety gates for the code to take effect, exactly like in the service world.
>
> 2. The core issue we are trying to solve is that Airflow currently inseparably links Code Distribution (a file arriving on the dag-processor and being parsed) with Release Activation (the scheduler executing that code).
> To extend the microservices analogy, I think of the DAG processor parsing all files as "building the artifact(s)," and of the scheduler and executor acting on the DAG versions created thereafter as "deploying" or running the changed code.
> We simply want to decouple the build from the deployment. This does not mean that the code arriving on the dag-processor will be tested for the first time straight in production. It should've already passed a set of checks in the CI pipeline.
>
> 3. It is also worth calling out that Airflow already supports this decoupled behavior at the run level for task re-runs and mid-execution DAG version bumps (by pinning the version for the rest of the execution or the rerun). We are simply trying to expose this existing capability at the DAG level so users can govern which version new scheduled runs are created with.
>
> 4. I also agree that Airflow itself should not be aware of our CI/CD pipeline, nor would it manage the deployment orchestration or testing.
> For our requirements, I just need Airflow to expose APIs to deploy (pin) a DAG version, and to remove the pin (to restore/enable the default "auto-deploy latest" behavior).
> Beyond that, we intend to use an external release orchestrator that can explicitly tell Airflow when a parsed version is actually allowed to run. Until that API call is made, the previously pinned version remains active. This ensures we don't introduce assumptions or awareness of the presence of any external gating mechanisms to Airflow.
> Also note that the intention is to keep the default auto-deploy behavior unless a user (or a system on their behalf) explicitly asks Airflow to pin a DAG to a specific version.
>
> 5. Most importantly, this feature provides an incident response "rollback" behavior. If a bad DAG version slips through CI/CD into production, either an on-call engineer or a rollback-trigger (airflow-external) can instantly roll back to the previous pinned version via the API/UI to mitigate. Without this, users have to revert the code in Git and wait for the entire CI/CD pipeline and file-sync process to run, which is often too slow during an outage.
>
> 6. Jarek - You are right, database schema changes can be discussed later. My intention was only to share a very brief summary of how I deemed it to be technically feasible for early feedback. I did briefly share the high-level use cases ("Safe Deployment Gating" and "Instant Rollbacks") in the original mail, but I completely agree that aligning on the UX first would be a good next step.
>
> If there are no major remaining concerns after this response, I can draft and share an AIP to detail the UX, followed by a high-level proposal, caveats and next steps.
>
> Thanks for your time.
> Regards,
> Piyush
>
> On Tue, Apr 21, 2026 at 5:59 PM Oliveira, Niko <[email protected]> wrote:
>
> > I am with Jens on this one. I think we're complicating Airflow to get around a bad practice. If stability of your Dags is critical and they are highly versioned then I think, as Jens suggested, you should run them through a pipeline that first deploys them to a dev or gamma environment, which verifies that the quality of the Dags is what you expect. If something slips through, then it's just normal software practices of either reverting and rolling back or rolling forward with a fix pushed through the pipeline. I don't think Airflow should be aware of that process or opinionated about it.
> >
> > Cheers,
> > Niko
> > ------------------------------
> > *From:* Jens Scheffler <[email protected]>
> > *Sent:* Monday, April 20, 2026 11:17 AM
> > *To:* [email protected] <[email protected]>
> > *Subject:* RE: [EXT] [DISCUSS] DAG Version Pinning for Deployment Gating (Building on AIP-63)
> >
> > Hi,
> >
> > I am still quite sceptical. Yes, if such pinning is introduced, then a per-Dag change needs to be possible via UI and API. But I still see it as a chicken-and-egg problem: you want to run a pinned version, but then how do you test the changes (w/o moving the version pin)? Then again some test mode is needed, or per run you need to make a "test run" with another version. Smells a bit like misusing a production system for testing.
> >
> > On the other hand, yes, if all Dags share the same Git repo then merging a branch into another will switch all Dags at the same time. Still, you could use standard Git tools to cherry-pick individual changes, and you are not forced to always make a full rollout. At least 80% is possible with standard CI/CD tools and Git.
> >
> > TLDR I see the danger that instead of a proper CI/CD and test system such a feature might feel like you can easily test on a production system. Effectively, it would need to allow starting a Dag with any version, so that you can also jump back as a reversion. Even though, yes, agreed, all of this is technically possible.
> >
> > Jens
> >
> > On 20.04.26 16:40, Jarek Potiuk wrote:
> > > +1 to what Ephraim wrote. I think that was a natural next step we discussed, but it needs significant refinement, starting with the actual use cases it should serve and the UX for user interaction. I think related database changes are pretty secondary. Use cases cover runs, re-runs, backfills, CI testing, rollbacks, etc. Following the "documentation first" approach discussed in a separate thread, describing the context and intention of what we want to achieve is much more important than DB schema changes. Once we know which use cases we want to serve, the DB schema changes and other related items will emerge naturally.
> > >
> > > On Mon, Apr 20, 2026 at 3:15 PM Ephraim Anierobi <[email protected]> wrote:
> > >
> > >> Hi Piyush, thanks for starting this discussion.
> > >>
> > >> I like the proposal. We can introduce an active execution version for "versioned bundles" and make scheduler/API resolve through it. The hard part of this is making Airflow able to distinguish the latest parsed dagmodel's metadata from active scheduling metadata. I suggest you draft this in a Google Doc and share it for further discussion.
> > >>
> > >> Regards
> > >> - Ephraim
> > >>
> > >> On Mon, 20 Apr 2026 at 01:31, Piyush Maheshwari <[email protected]> wrote:
> > >>
> > >>> Thanks for sharing your thoughts Jens.
> > >>>
> > >>>> be able to test it? … a Q&A/Testing environment to be able to sign-off changes.
> > >>> Yes, we’ve built an isolated Airflow environment to run regression checks before promoting to production.
> > >>>
> > >>> As you suggested, we’re already running both generic and DAG-custom static checks in a CI job as a required step to merge to the main branch.
> > >>>
> > >>>> But then the "main" branch might be best suited if implemented on the test system
> > >>> In this case, problematic commits on “main” can choke other unrelated changes.
> > >>> So the other option would be to revert the problematic commits and deploy forward.
> > >>>
> > >>> However, a key limitation that remains with this approach is that a commit affecting multiple DAGs goes live for either all DAGs or none.
> > >>>
> > >>> A second important feature we get with this is instant DAG-level rollback without waiting for a revert commit to merge and be picked up by Airflow.
> > >>>
> > >>> I think DAG-level version pinning can also unlock a lot of flexibility for deployments including tiered rollouts, auto-rollback triggers, timed deployment windows and so on.
> > >>>
> > >>> Looking forward to hearing your thoughts.
> > >>> Regards,
> > >>> Piyush
> > >>>
> > >>> On Sun, 19 Apr 2026 at 3:12 PM, Jens Scheffler <[email protected]> wrote:
> > >>>
> > >>>> Thanks Piyush for dropping the discussion!
> > >>>>
> > >>>> I think in general QA processes are important and a valid use case. So a kind of pinning Dag versions really is important.
> > >>>>
> > >>>> Thinking about it, if you pin the version ... how would you then be able to test it?
> > >>>> I assume you would need (and should have or invest into) a Q&A/Testing environment to be able to sign-off changes. Both in infrastructure but also for Dag changes.
> > >>>>
> > >>>> If you are changing Dags, first of all static checks on Dag code are very much recommended, and you can also implement tests for your Dags and logic. As with software, a CI/CD system will be a good setup. Dag changes also include logical changes that can mostly only be tested in a live system and not with static checks.
> > >>>>
> > >>>> Have you considered using Git and a set of branches for implementing such staging? E.g. you have a git repo and you plan to make changes. Then you would open a PR for the change and merge it to the "main" branch - and there in your CI/CD you can check all sorts of static checks and tests. But then the "main" branch might be best suited if implemented on the test system. Once you validate the changes end-to-end you could make another PR for example to a "prod" branch. And if your production system is only pulling Dags from the "prod" branch then you can have this merging strategy as a staging setup.
> > >>>>
> > >>>> Would this resolve your PIN problem? Or which other detail in the use case would require a PIN on top of a staging strategy?
> > >>>>
> > >>>> Jens
> > >>>>
> > >>>> P.S.: I have enabled your Confluence account for writing after it was created. Sorry, typical pitfall: permissions were not set after account creation. Now it should work. Let me know if not.
> > >>>>
> > >>>> On 19.04.26 01:40, Piyush Maheshwari wrote:
> > >>>>> Hi everyone,
> > >>>>> I'm a new contributor to Airflow. I'd like to propose a new feature for Airflow: DAG Version Pinning.
> > >>>>> Building on the foundation introduced by AIP-63: DAG Versioning (https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-63%3A+DAG+Versioning), this proposal aims to extend Airflow's capabilities to support true continuous deployment (CD) gating and safer release cycles.
> > >>>>>
> > >>>>> The Problem & Use Cases
> > >>>>> Currently, the scheduler always creates DagRuns using the latest parsed DagVersion. This means that the updated DAG code is deployed (takes effect) right after the dag-processor processes it. While this is great for rapid development, teams running business-critical pipelines often need stricter deployment mechanisms. Specifically:
> > >>>>> * Safe Deployment Gating: The ability to pin a DAG to its last known stable version while new code is parsed in the background. This allows the new version to be held back until it passes automated regression tests or receives explicit manual approval.
> > >>>>> * Instant Rollbacks: If an issue is detected in a newly promoted DAG version, users need the capability to instantly roll back to a previous version via the UI/API, without having to revert the underlying code and wait for the repository sync and DAG processing cycle.
> > >>>>>
> > >>>>> High-Level Proposed Solution
> > >>>>> Introduce an optional active_dag_version_id to the DagModel. This field can be used to pin a DAG version for scheduling and execution, while the dag-processor can continue to parse and register newer DAG versions.
> > >>>>> * When this pin is set, the scheduler and API will respect the pinned version for creating runs and executing tasks, separating the parsing of new code from the execution of new code.
> > >>>>> * If the pin is NULL, the system defaults to the current behavior (always executing the latest parsed version). This way, we can maintain complete backwards compatibility.
> > >>>>>
> > >>>>> I have put together some detailed notes covering the data model changes, database migrations, and edge cases with this approach. If there is general alignment that this fits the vision for Airflow, I would like to take this proposal through the formal AIP review process.
> > >>>>> But I would love to get the community's feedback on the feature and the high-level approach.
> > >>>>> I'll also need someone to grant me access to create content on the Airflow Confluence wiki.
> > >>>>> Thanks for your time!
> > >>>>> Regards,
> > >>>>> Piyush
> > >>>>>
> > >>>> ---------------------------------------------------------------------
> > >>>> To unsubscribe, e-mail: [email protected]
> > >>>> For additional commands, e-mail: [email protected]
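
For anyone following the thread, the pin semantics in the original proposal (an optional active_dag_version_id, where NULL means "use the latest parsed version") can be sketched in plain Python. This is only an illustration under stated assumptions: DagVersion, DagRecord, register_parsed_version, pin, unpin and version_for_new_run are hypothetical names made up for this sketch. They are not Airflow models or APIs, and nothing here reflects how the scheduler is actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DagVersion:
    """Hypothetical stand-in for one parsed version of a DAG."""
    version_id: int


@dataclass
class DagRecord:
    """Hypothetical stand-in for a DAG's metadata plus its parsed versions."""
    dag_id: str
    versions: List[DagVersion] = field(default_factory=list)
    # Field name taken from the proposal; None plays the role of NULL (no pin).
    active_dag_version_id: Optional[int] = None

    def register_parsed_version(self, version: DagVersion) -> None:
        # Parsing keeps registering new versions whether or not a pin is set.
        self.versions.append(version)

    def pin(self, version_id: int) -> None:
        # Only an already parsed version can be pinned.
        if all(v.version_id != version_id for v in self.versions):
            raise ValueError(f"unknown version {version_id} for {self.dag_id}")
        self.active_dag_version_id = version_id

    def unpin(self) -> None:
        # Restores the default "latest parsed version" behaviour.
        self.active_dag_version_id = None

    def version_for_new_run(self) -> DagVersion:
        """Return the version a newly created run would use."""
        if not self.versions:
            raise LookupError(f"no parsed versions for {self.dag_id}")
        if self.active_dag_version_id is None:
            return self.versions[-1]
        for v in self.versions:
            if v.version_id == self.active_dag_version_id:
                return v
        raise LookupError("pinned version is no longer available")


dag = DagRecord(dag_id="critical_etl")
dag.register_parsed_version(DagVersion(1))
dag.pin(1)                                  # gate: hold new runs on version 1
dag.register_parsed_version(DagVersion(2))  # new code is parsed in the background
print(dag.version_for_new_run().version_id)  # 1, the pin wins
dag.unpin()                                 # release: back to "latest"
print(dag.version_for_new_run().version_id)  # 2
```

In this sketch a rollback is simply another pin call with an older version id, which is the "instant rollback" behaviour argued for in point 5 of Piyush's reply. Whether that belongs in Airflow or in tooling around git is exactly what the thread is debating.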
