Hi everyone!

First of all, I’d like to thank all the participants in this discussion.

Based on what I’ve read, am I understanding correctly that in order for
this PR to be merged into the main Airflow codebase, I need to:

– add usage examples to the Airflow documentation,

– add a display of this variable to the UI (the main question being where
it should go)?
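
A documentation example could look roughly like the following. It is only a minimal, self-contained sketch of how the configured order decides which value wins. `resolve`, `LOOKUPS`, and the in-memory dictionaries are hypothetical stand-ins and are not taken from the PR's code. The labels `metastore` and `environment_variable` follow the ones used in the test quoted further down.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for real backends: each maps a connection id to a value.
ENV_VALUES = {"my_conn": "postgres://env-host/db"}
DB_VALUES = {"my_conn": "postgres://db-host/db", "db_only": "mysql://db-host/x"}

# Illustrative registry of lookup functions keyed by backend label.
LOOKUPS: Dict[str, Callable[[str], Optional[str]]] = {
    "environment_variable": ENV_VALUES.get,
    "metastore": DB_VALUES.get,
}


def resolve(conn_id: str, order: List[str]) -> Optional[Tuple[str, str]]:
    """Return (backend_label, value) from the first backend that knows conn_id."""
    for label in order:
        value = LOOKUPS[label](conn_id)
        if value is not None:
            return label, value
    return None


# Same connection id, different winner depending on the order.
print(resolve("my_conn", ["environment_variable", "metastore"]))
print(resolve("my_conn", ["metastore", "environment_variable"]))
```

With the environment variable first, the DB copy of `my_conn` is never consulted. With the metastore first, the DB copy wins and the other backend is only reached for ids the DB does not have.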

I also have a question: should the UI only show the current state of
backend_order without allowing it to be edited?
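
To make the read-only option concrete, here is a rough sketch of the text the UI could render. It assumes the option is a comma-separated string. `parse_backend_order`, `describe_order`, and the `SUPPORTED` tuple are hypothetical names, and the real set of accepted labels is whatever the PR ends up defining.

```python
from typing import List

# Assumed labels for illustration only; the PR may accept a different set.
SUPPORTED = ("environment_variable", "metastore")


def parse_backend_order(raw: str) -> List[str]:
    """Split a comma-separated order string and reject unknown or repeated labels."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in SUPPORTED]
    if unknown:
        raise ValueError("Unsupported backend(s) in order: " + ", ".join(unknown))
    if len(set(names)) != len(names):
        raise ValueError("Duplicate backend in order")
    return names


def describe_order(raw: str) -> str:
    """Build a one-line, read-only description of the lookup sequence."""
    names = parse_backend_order(raw)
    return " -> ".join(f"{i}. {name}" for i, name in enumerate(names, start=1))


print(describe_order("metastore, environment_variable"))
```

A line like this could sit on the connection/variable screen without offering any way to edit the value there.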

At the moment, I’m maintaining this PR and I’m ready to make the necessary
improvements.

Best regards,
Anton Nitochkin

On Mon, 7 Jul 2025 at 11:19, Amogh Desai <amoghdesai....@gmail.com> wrote:

> I also think it would be beneficial for users / someone editing / accessing
> connections or variables
> from the UI to know "where" they are editing it.
>
> Right now it's the metadata DB but with the proposed PR that probably could
> change (for cases when
> DB is highest priority / middle priority?)
>
> But generally speaking, DAG authors at any point need not know where they
> are getting connections / variables
> from in a happy path scenario, but things will change if something starts
> to fail and it really depends on who is
> debugging the failure :)! The deployment manager can go and run the
> `airflow config get-value` command, but I am guessing
> most DAG authors wouldn't / shouldn't be able to do that.
>
> So in short, the idea makes sense theoretically to me, but it needs much
> more work, mainly in terms of:
> - Doc clarification
> - Debugging assistance (how to know the order?): it's a more general
> problem not due to the task but
> similar / related to this
> - Considering the worker backend angle
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Sun, Jul 6, 2025 at 11:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > I think the only real "behavioural" change you might expect from a user
> > who "knows" the sequence is at the connection / variable UI. This is
> > where a user (with connection/variable editing or viewing capabilities)
> > might actually make a different decision or draw a different conclusion.
> > So my proposal would be to explain the sequence, in some concise way, at
> > the connection/variable screen.
> >
> > And that seems both natural and obvious.
> >
> > Is that "enough" for you? Or do you think other places need "surfacing"?
> > What other behaviour of users (different actors) do you see being
> > impacted by the lack / presence of this information?
> >
> > J.
> >
> >
> > On Sun, Jul 6, 2025 at 5:42 PM Elad Kalif <elad...@apache.org> wrote:
> >
> > > > This seems like organisation-wide policy that simply all DAG authors
> > > > in the organization should be made aware of
> > >
> > > That is one more thing among several that the admin expects users to
> > > remember. We should reduce that list, not extend it.
> > > From my point of view this setting adds a blind spot, and I am not happy
> > > with that.
> > > I have similar feelings towards cluster policies: yet another blind spot
> > > that DAG authors should be aware of, with no actual tools provided to
> > > see the override on their side.
> > >
> > > I initially shared my thoughts on 31 March in
> > > https://github.com/apache/airflow/pull/45931#discussion_r2021018760
> > > So far I haven't seen any comments that explain why we can't implement
> > > such a mechanism. Is it technically complicated? Is it high effort? Or
> > > is the assumption that it serves little value?
> > >
> > >
> > > On Sun, Jul 6, 2025 at 3:12 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > > I am missing the part of how can DAG Author be aware of the backend
> > > > > order the cluster admin chooses?
> > > > > This is a crucial part
> > > >
> > > > I am not sure there is a special need for it. This seems like an
> > > > organisation-wide policy that simply all DAG authors in the
> > > > organization should be made aware of - it has zero impact on the way
> > > > DAGs are written. If it were different for different DAGs you'd surely
> > > > need to communicate this, but I am not sure any other indication is
> > > > needed. It's largely transparent for `DAG authors` if you ask me - they
> > > > want a connection by id and the "organizational policy" decides how
> > > > this happens.
> > > >
> > > > J.
> > > >
> > > >
> > > > On Sun, Jul 6, 2025 at 2:06 PM Elad Kalif <elad...@apache.org> wrote:
> > > >
> > > > > I am missing the part of how can DAG Author be aware of the backend
> > > > > order the cluster admin chooses?
> > > > > This is a crucial part.
> > > > >
> > > > > On Thu, Jul 3, 2025 at 12:14 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >
> > > > > > Sorry for typos - that was my mobile autocomplete... I hope it is
> > > > > > understandable anyway
> > > > > >
> > > > > > On Thu, 3 Jul 2025 at 11:13, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > > >
> > > > > > >
> > > > > > > On Thu, 3 Jul 2025 at 10:14, Amogh Desai <amoghdesai....@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for that angle, Jarek.
> > > > > > >>
> > > > > > >> Let's say DB lookup has higher precedence than, say, the ENV
> > > > > > >> backend. Wouldn't this be shooting ourselves in the foot by
> > > > > > >> compromising performance here? A DB lookup will be more
> > > > > > >> expensive than an ENV lookup.
> > > > > > >>
> > > > > > >>
> > > > > > > Oh absolutely. I think if we have this possibility of managing
> > > > > > > the order, those kinds of scenarios should also be explained in
> > > > > > > the docs so that users do not shoot themselves in the foot.
> > > > > > >
> > > > > > > Also, following my mail about multi-team: I started to think
> > > > > > > recently - looking at some other OSS software - that we
> > > > > > > sometimes take too much responsibility for our users and then
> > > > > > > suffer because we have to defend our opinionated choices when
> > > > > > > there are use cases that our choices do not enable.
> > > > > > >
> > > > > > > This is the reason why we have so many 'options' and config
> > > > > > > values: sometimes we do not want to make decisions for our
> > > > > > > users, but where we can, we make it an option and
> > > > > > > configuration and clearly explain it to our users (and mostly I
> > > > > > > am talking about the Deployment Manager role from our security
> > > > > > > model). It's their responsibility to read all the information
> > > > > > > we provide and follow it when they make decisions on how to
> > > > > > > configure Airflow - knowing the consequences. And we should be
> > > > > > > 'harsh' with them - in the sense that if they did not read the
> > > > > > > docs and did not understand them, any time they ask us about
> > > > > > > something not working that is explained in the docs, we should
> > > > > > > send them to the docs with 'Read The Friendly Manual' advice -
> > > > > > > simply because this is the only job they have. And we should
> > > > > > > not do the job for them.
> > > > > > >
> > > > > > > Similarly, having options like that allows our managed service
> > > > > > > providers to make their opinionated choices and make some
> > > > > > > configuration options possible, some selected for their users
> > > > > > > in the context of the service managed. But again - that's their
> > > > > > > responsibility to manage and understand what the options are
> > > > > > > and what they mean. Same as individual deployment managers -
> > > > > > > they can make their own decisions - and if it does not cost us
> > > > > > > a lot we should make it possible for them to make those choices
> > > > > > > (and take responsibility for their choices).
> > > > > > >
> > > > > > > With great powers (of choice) you also have great
> > > > > > > responsibilities (of consequences of your choices) - and as
> > > > > > > long as we are aware of those consequences and communicate them
> > > > > > > to deployment managers - it's on their shoulders to make the
> > > > > > > choices and bear the consequences.
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >> There could also be a few more side effects that we will have to
> > > > > > >> fully uncover and come up
> > > > > > >> with a detailed plan to allow this to be configurable.
> > > > > > >>
> > > > > > >> Thanks & Regards,
> > > > > > >> Amogh Desai
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, Jul 2, 2025 at 6:43 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > > > >>
> > > > > > >> > I think this is a good idea - but as Ash mentioned, it has to be
> > > > > > >> > executed well, with a lot of bells and whistles, so that users
> > > > > > >> > will not shoot themselves in the foot. For example, we recently
> > > > > > >> > had discussions on the new UI about whether/how to explain to
> > > > > > >> > users that their connections in the UI and API **only** show the
> > > > > > >> > DB connections (for good reasons) - and that is already
> > > > > > >> > difficult to explain to users. Now this change will also make it
> > > > > > >> > behave differently. For example, currently when you edit a
> > > > > > >> > connection via the UI it might **not** take effect if you have
> > > > > > >> > the same connection defined in the secret/env var. But if you
> > > > > > >> > make the DB first, this changes, and there are a few edge cases
> > > > > > >> > where it might have some unexpected effect.
> > > > > > >> >
> > > > > > >> > But there is one inevitable benefit of this approach that I
> > > > > > >> > like - the ability to turn the Airflow DB into an effective
> > > > > > >> > "shield" for secret usage. The big drawback of the current
> > > > > > >> > "sequence" is that Airflow generates a LOT of queries to the
> > > > > > >> > secrets manager, even if your connection is defined in the DB -
> > > > > > >> > because it will query secrets first. So currently it is not
> > > > > > >> > possible to say "for this highly frequently used connection I
> > > > > > >> > want to keep it in the DB to save on the secrets manager
> > > > > > >> > queries" - both performance and cost wise - because defining a
> > > > > > >> > connection in the DB does not limit the number of secrets
> > > > > > >> > manager queries. So in a number of scenarios, being able to
> > > > > > >> > reverse it and query the DB first might be very good for cost
> > > > > > >> > and network optimisation.
> > > > > > >> >
> > > > > > >> > I think if we describe it well in the docs (as Ash wrote),
> > > > > > >> > explain those scenarios, and also clearly communicate it in the
> > > > > > >> > UI of Airflow (we likely need some way of explaining to users
> > > > > > >> > what their currently configured sequence is and what they should
> > > > > > >> > expect to happen if they remove/add a connection) - then I see
> > > > > > >> > it as a really useful feature.
> > > > > > >> >
> > > > > > >> > J.
> > > > > > >> >
> > > > > > >> > On Wed, Jul 2, 2025 at 2:54 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> > > > > > >> >
> > > > > > >> > > At a high level I’m good with allowing this to be fully
> > > > > > >> > > configurable, as long as we document the possible warts
> > > > > > >> > > (“Doctor, it hurts when I do this” “well don’t do that then!”
> > > > > > >> > > etc), though as Amogh mentioned it is slightly complicated by
> > > > > > >> > > the distinction between API Server/Scheduler and the execution
> > > > > > >> > > time on the worker.
> > > > > > >> > >
> > > > > > >> > > (I haven’t looked at the specific implementation yet)
> > > > > > >> > >
> > > > > > >> > > -ash
> > > > > > >> > >
> > > > > > >> > > > On 2 Jul 2025, at 11:56, Amogh Desai <amoghdesai....@gmail.com> wrote:
> > > > > > >> > > >
> > > > > > >> > > > Hello Anton,
> > > > > > >> > > >
> > > > > > >> > > > Thanks for kicking off this discussion. I’d love to
> > > understand
> > > > > > your
> > > > > > >> > > > motivations a bit more on this front.
> > > > > > >> > > > From your PR, I am seeing that you are not just allowing the
> > > > > > >> > > > addition of multiple custom backends but also changing the
> > > > > > >> > > > *default_backend* order. I am a bit torn on that part.
> > > > > > >> > > >
> > > > > > >> > > > The current design intentionally places the metadata DB
> > > > > > >> > > > backend at the lowest precedence in the order, since it’s
> > > > > > >> > > > meant to serve as the ultimate fallback source of truth. Any
> > > > > > >> > > > additional configured backends are prioritized higher than it
> > > > > > >> > > > by design.
> > > > > > >> > > >
> > > > > > >> > > > With your changes, we now allow configurations like:
> > > > > > >> > > >
> > > > > > >> > > >     @conf_vars({("secrets", "backends_order"): "metastore,environment_variable,unsupported"})
> > > > > > >> > > >     def test_backends_order_unsupported(self):
> > > > > > >> > > >         with pytest.raises(AirflowConfigException):
> > > > > > >> > > >             ensure_secrets_loaded()
> > > > > > >> > > >
> > > > > > >> > > > I don’t fully understand the motivation behind supporting this
> > > > > > >> > > > level of override, especially since it could allow unsupported
> > > > > > >> > > > or unintended configurations. Additionally, with Airflow 3.0+,
> > > > > > >> > > > we already support a multi-layered secret backend resolution
> > > > > > >> > > > capability with the introduction of secrets backend for
> > > > > > >> > > > workers.
> > > > > > >> > > > Order goes as:
> > > > > > >> > > >
> > > > > > >> > > > secrets backend on worker directly (optional) > env vars on
> > > > > > >> > > > worker > reach out to api server [secrets backend defined here
> > > > > > >> > > > (optional) > env vars on api server > metadata DB].
> > > > > > >> > > >
> > > > > > >> > > > You will have to consider this angle too.
> > > > > > >> > > >
> > > > > > >> > > > In my opinion, a more practical and realistic use case would
> > > > > > >> > > > be to have the ability to define multiple custom backends
> > > > > > >> > > > both on the worker and on the API server.
> > > > > > >> > > >
> > > > > > >> > > > Looking forward to hearing more from you.
> > > > > > >> > > >
> > > > > > >> > > > Thanks & Regards,
> > > > > > >> > > > Amogh Desai
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > On Wed, Jul 2, 2025 at 3:59 PM Anton Nitochkin <ant.nitoch...@gmail.com> wrote:
> > > > > > >> > > >
> > > > > > >> > > >> Hello,
> > > > > > >> > > >>
> > > > > > >> > > >> I'd like to discuss a new option that can be added via this PR:
> > > > > > >> > > >> https://github.com/apache/airflow/pull/45931.
> > > > > > >> > > >>
> > > > > > >> > > >> Recently, I asked developers in Slack for their thoughts on the
> > > > > > >> > > >> new variable [secrets]backend_order. Long story short: this
> > > > > > >> > > >> option will introduce the ability to configure the backend
> > > > > > >> > > >> order and control it using this variable. The default value
> > > > > > >> > > >> will remain the same as in the current version, so for users
> > > > > > >> > > >> who don't need it, things will stay as they are now.
> > > > > > >> > > >>
> > > > > > >> > > >> Jarek Potiuk advised starting a conversation and discussing the
> > > > > > >> > > >> PR to reach a consensus with the community.
> > > > > > >> > > >>
> > > > > > >> > > >> Can you please share your thoughts on the option and its
> > > > > > >> > > >> implementation?
> > > > > > >> > > >>
> > > > > > >> > > >> Anton Nitochkin
> > > > > > >> > > >>
