I love this proposal; it would indeed simplify many things and make them
cleaner. +1

On 2025/12/25 17:28:56 Jens Scheffler wrote:
> Hi Jason,
> 
> thanks for raising the discussion. +1 also from me.
> 
> Also, we always had the idea of moving the widget definitions from hook 
> forms into some descriptive structure or storing them in the DB ... which 
> would replace the rather ugly mocks that currently prevent loading all the 
> legacy form UI widgets. But irrespective of this, a CLI 
> performance improvement would be beneficial.
> 
> Jens
> 
> On 12/25/25 17:05, Jarek Potiuk wrote:
> > I am all for it.
> >
> > There were earlier concerns about the performance of provider discovery, but
> > I think that provider discovery alone is "fast enough". Initially, when
> > ProvidersManager was introduced, it discovered everything, including
> > importing Hooks and finding out the connection definitions - with Widgets
> > and everything related. Also, the circular import complexity of
> > settings/configuration (things were partially imported while airflow help
> > was being loaded and commands were initialized and imported) made it very
> > brittle - one innocent import added here or there caused cyclic imports.
> > However, a lot has changed since then:
> >
> > * ProvidersManager discovery is heavily optimized and lazy-loads
> > whatever is needed, only when it is needed (so connections are no longer
> > imported when ProvidersManager is initialized)
> > * For Airflow 3 we've introduced mocking of the Widget classes, so we don't
> > even import the whole flask module hierarchy when connections from
> > providers are not used
> > * We are close to finishing the separation of "shared" libraries as part of
> > the task isolation work that we are doing now (we pay a lot of
> > attention to it with Amogh and others). This includes adding prek hooks
> > that will guard proper imports (WIP -
> > https://github.com/apache/airflow/pull/58825).
> >
> > So I hope that eventually the gains in those benchmark results will be even
> > more impressive, but even now your results show that it's better to do
> > discovery like we do now.
> >
> > Also, it has been raised quite a few times that not seeing the "celery" and
> > "kubernetes" commands when you have no executor configured is misleading:
> > people expect the commands to be visible when you "install" the
> > provider, not only when you "configure the executor". This has been a big
> > limitation so far, and we have had at least a few related issues about it.
> > It's not intuitive at all.
> >
> > Plus, it solves one more problem: currently some Kubernetes CLI commands
> > (generate-dag-yaml, cleanup-pods) are useful even without the Kubernetes*
> > family of executors.
> >
> > So ... Big +1 from me.
> >
> > J.
> >
> >
> >
> > On Thu, Dec 25, 2025 at 3:50 PM Zhe-You(Jason) Liu <[email protected]>
> > wrote:
> >
> >> Hi all,
> >>
> >> First of all, I’d like to wish everyone a Merry Christmas and a happy
> >> holiday season 🎄!
> >>
> >> I’d like to start a discussion about introducing a new `cli` section in
> >> provider metadata, with the goals of:
> >>
> >>     1. **Improving Airflow CLI startup and response time**
> >>        According to the PoC benchmark, this change provides a noticeable
> >>        performance improvement.
> >>     2. **Unlocking the ability for all providers to expose commands in the
> >>        Airflow CLI**
> >>        Currently, only AuthManager and Executor can expose commands.
> >>
> >> This change is probably not large enough to justify a full AIP, so I
> >> believe a discussion followed by lazy consensus should be sufficient.
> >> ------------------------------
> >> Why
> >>
> >> Before the recent refactor, regardless of which `airflow` command is
> >> executed, `cli_parser` imports the actual AuthManager and Executor in use
> >> in order to call `get_cli_commands` and collect optional CLI commands [1].
> >>
> >> This means that **every** CLI invocation, including something as simple as
> >> `airflow --help`, will import heavy modules such as `kubernetes`,
> >> `flask_appbuilder`, etc., depending on the values of
> >> `AIRFLOW__CORE__AUTH_MANAGER` and `AIRFLOW__CORE__EXECUTOR`.
> >>
> >> In the worst case (e.g. `FabAuthManager` + `CeleryKubernetesExecutor`), it
> >> takes **~5 seconds** just to display `airflow --help` based on the
> >> benchmark results.
> >> ------------------------------
> >> How
> >>
> >> The refactor includes:
> >>
> >>     1. Adding a `cli` section to provider metadata (`provider.yaml` /
> >>        `def get_provider_info`) that points to `get_cli_commands`
> >>     2. Moving `get_cli_commands` into a **clean** module that does not import
> >>        any heavy dependencies
> >>        - It should only import from `airflow.cli.cli_config`
> >>        - It should rely on `lazy_load_command`
> >>
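
Inline, to make the lazy-loading idea concrete: a minimal, self-contained sketch of the pattern, not Airflow's actual code. `lazy_load`, `Command`, and `demo` are hypothetical names made up for illustration; `lazy_load` is only a simplified stand-in for the idea behind `lazy_load_command`, and the stdlib module `colorsys` plays the role of a heavy dependency such as `kubernetes`.

```python
import importlib
import sys
from typing import Callable, NamedTuple


def lazy_load(import_path: str) -> Callable:
    """Return a callable that imports ``import_path`` only when it is invoked.

    Hypothetical, simplified stand-in for the idea behind ``lazy_load_command``.
    """
    module_path, _, attr = import_path.rpartition(".")

    def command(*args, **kwargs):
        # The (potentially heavy) import happens here, at call time.
        func = getattr(importlib.import_module(module_path), attr)
        return func(*args, **kwargs)

    command.__name__ = attr
    return command


class Command(NamedTuple):
    """Hypothetical stand-in for a CLI command definition."""

    name: str
    help: str
    func: Callable


def get_cli_commands() -> list[Command]:
    # Building the list stores only a dotted path; nothing heavy is imported.
    return [
        Command(
            name="rgb-to-hsv",
            help="Convert an RGB triple to HSV",
            func=lazy_load("colorsys.rgb_to_hsv"),
        )
    ]


def demo() -> tuple:
    sys.modules.pop("colorsys", None)  # make sure the stand-in starts unloaded
    commands = get_cli_commands()
    loaded_before = "colorsys" in sys.modules  # still not imported
    result = commands[0].func(1.0, 0.0, 0.0)   # import happens on this call
    loaded_after = "colorsys" in sys.modules
    return loaded_before, loaded_after, result
```

With this shape, something like `airflow --help` would only need the command names and help strings; the module behind a command is imported the first time that command actually runs.
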
> >> ------------------------------
> >> What
> >>
> >> The main behavioral change is that, after this refactor, **any installed
> >> provider that exposes CLI commands will have those commands available in
> >> the Airflow CLI**, even if it is not configured as the active AuthManager
> >> or Executor.
> >>
> >> For example:
> >>
> >>     - If both the Celery and Kubernetes providers are installed
> >>     - And `AIRFLOW__CORE__EXECUTOR=LocalExecutor`
> >>
> >> The Celery and Kubernetes command groups will still appear in
> >> `airflow --help`.
> >>
> >> If there are no strong drawbacks to introducing the cli section in provider
> >> metadata, I can either:
> >>
> >>     - Break the change down provider by provider, or
> >>     - Submit one larger atomic change covering all providers
> >>
> >> ------------------------------
> >> PoC & Summary
> >>
> >> I’ve completed a PoC [2] along with a benchmark script [3] and result [4].
> >> Below is a summary of the CLI response time improvements:
> >>
> >>     - *Overall average*: 3.117s, down from 4.048s (*~23.0% improvement*)
> >>     - *Fastest run*: 3.092s, down from 3.566s (*~13.3% improvement*)
> >>     - *Slowest run*: 3.155s, down from 5.006s (*~37.0% improvement*)
> >>
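
Inline: the percentages in the summary are consistent with the usual (before - after) / before formula. A quick check, using only the timings quoted in the summary; the helper name `improvement_pct` is made up here.

```python
def improvement_pct(before_s: float, after_s: float) -> float:
    """Relative reduction in response time, as a percentage of the old time."""
    return (before_s - after_s) / before_s * 100


# (before, after) timings in seconds, copied from the summary.
results = {
    "Overall average": (4.048, 3.117),
    "Fastest run": (3.566, 3.092),
    "Slowest run": (5.006, 3.155),
}

for label, (before, after) in results.items():
    print(f"{label}: {improvement_pct(before, after):.1f}%")
# Overall average: 23.0%
# Fastest run: 13.3%
# Slowest run: 37.0%
```
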
> >> ------------------------------
> >>
> >> *References*
> >>
> >> [1] https://github.com/apache/airflow/blob/ac085a425652d16b5fff17f8e937938c7d47b868/airflow-core/src/airflow/cli/cli_parser.py#L62-L86
> >> [2] https://github.com/apache/airflow/pull/59805
> >> [3] https://github.com/apache/airflow/pull/59805/changes#diff-7ef81cd1589183b63d2452f64809adcba5f6b14e5ee337be02524d83ad6698e4
> >> [4] https://github.com/apache/airflow/pull/59805#benchmark-result
> >>
> >> I’d really appreciate any feedback, suggestions, or concerns about this
> >> approach. Thanks!
> >>
> >> Best regards,
> >> Jason
> >>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 