Hi Bugra,

Thanks for raising the Kustomize discussion. I haven't gone through the doc
thoroughly yet, but here is some context I have on the Kustomize approach.
It may help us settle on a final structure that covers all the use cases we
need to support and stays maintainable in the long term.

For example:
- Add optional OTel service to the Airflow Helm Chart #64902 [1]
- Helm chart support for periodic API server rollout restarts on Kubernetes
#61432 [2]

Additionally, there is a Slack thread discussing the Kustomize direction
[3].

[1] https://github.com/apache/airflow/pull/64902#issuecomment-4206639363
[2] https://github.com/apache/airflow/pull/61636#issuecomment-3881992323
[3] https://apache-airflow.slack.com/archives/C027H098M1C/p1770794021001679

Best,
Jason

On Mon, Apr 27, 2026 at 8:16 AM Buğra Öztürk <[email protected]>
wrote:

> Hi all,
>
> I have started working on the PoC for the Kustomize direction as mentioned
> in the thread for KEDA.
>
> Here is the approach I have in mind to keep this stable and make further
> iterations faster: align on the fundamentals before building further.
> Smaller increments should make reviews easier and allow for quicker course
> correction. Once the foundation is in place, the remaining work should move
> faster.
>
> * Share the directory structure in this first PoC example (not fully tested
> yet), with CI/pre-commit checks focusing only on validating the agreed
> structure
>
> * Collect feedback, review, and merge the shared PR
>
> * Propose and build a smoke test on top of the KEDA overlay in a separate
> new PR
>
> * Collect feedback, review, and merge the smoke test PR
>
> * Test locally to check if smoke tests match
>
> * Move KEDA overlay to testing in a new PR with the introduction of a
> deprecation warning
>
> PR: https://github.com/apache/airflow/pull/65897
>
> Thoughts and early feedback very welcome.
>
> Are we going to go through these steps for every overlay addition?
> Short answer: no.
> Long answer: this is early-stage friction. Going step by step now should
> make later overlay additions much less of a hassle. I hope that an agreed,
> tested, and documented approach will let the next additions land in one go,
> in a single PR :)
>
>
> Kind regards,
> Bugra Ozturk
>
> On Sat, Apr 25, 2026 at 5:37 PM Buğra Öztürk <[email protected]>
> wrote:
>
> > Sorry for the formatting of the directory structure! In the mail app, it
> > looked fine. You can find it in the Google Doc as well:
> >
> > https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?tab=t.cv476feyrxmf
> >
> > On Sat, Apr 25, 2026 at 5:31 PM Buğra Öztürk <[email protected]>
> > wrote:
> >
> >> Hi all,
> >>
> >> We have been working through a Helm chart refurbishment effort over the
> >> past few months. The goal is to keep 1.2x stable for existing users while
> >> preparing a cleaner next major release. I would like to share where we
> >> have landed and open it up for feedback before going further.
> >>
> >> *Branching strategy*
> >>
> >> We created chart/v1-2x-test, mirroring how v3-1-test works for Airflow
> >> itself.
> >>
> >>    - chart/v1-2x-test is the maintenance line. Bug fixes and stability
> >>      work for 1.2x releases land here.
> >>    - main is for cleanup, deprecations, and preparation toward 2.0.
> >>
> >> The split was deliberate. We wanted to give existing 1.x users a smooth
> >> transition path without holding back the 2.0 work, and vice versa. 2.0 is
> >> intended as a real refurbishment rather than an incremental bump. It will
> >> carry a fair number of breaking changes, but the upside is that it gives
> >> users a clean starting point with a chart fully designed around Airflow 3
> >> and what comes after, instead of one carrying years of accumulated
> >> assumptions from the 1.x line. Existing users on 1.2x are not forced into
> >> the move, since the maintenance branch keeps shipping for them, but anyone
> >> starting fresh or willing to migrate gets a much simpler chart to work
> >> with.
> >>
> >> We have already cut and released 1.21.0 from chart/v1-2x-test, so the
> >> model is in place rather than hypothetical. The release went through
> >> cleanly and gave us the separation we were after, which is part of the
> >> reason the proposal feels concrete enough to bring here.
> >>
> >> *Kustomize direction*
> >>
> >> A recurring theme in our discussions has been that the chart carries a
> >> fair number of components that are not Airflow-native. Kerberos,
> >> Elasticsearch logging, gitSync, and PostgreSQL are good examples. They
> >> make the chart heavier than it needs to be and pull us toward maintaining
> >> things that already have external owners.
> >>
> >> The proposal is to express these as Kustomize overlays that sit alongside
> >> the chart as a guide for users, not as released chart artifacts.
> >>
> >> *Confirmed for Kustomize*
> >>
> >>    - Kerberos: Authentication variant, environment-specific, sidecar
> >>      injection
> >>    - gitSync: DAG delivery mechanism, orthogonal to Airflow runtime
> >>    - Elasticsearch: External logging backend, not Airflow-native
> >>    - PostgreSQL: Can be expressed as plain Kubernetes resources
> >>
> >> PgBouncer and StatsD are also candidates, but we want to investigate them
> >> further before committing. They will not be in the first round of
> >> overlays.
> >>
> >> *Structure*
> >>
> >> Overlays live in the repository but are not part of the chart release
> >> artifact. Each overlay has a kustomization.yaml, the resources it
> >> produces, and a STATUS file marking whether it is verified in CI or a
> >> starting point that users can extend.
> >>
> >> A rough sketch of how it would look in the repo:
> >>
> >>
> >>
> >> ```
> >> chart/
> >>   kustomize-overlays/
> >>     README.rst
> >>     CONTRIBUTING.rst
> >>     keda/
> >>       kustomization.yaml
> >>       scaledobject.yaml
> >>       STATUS
> >>     kerberos/
> >>       kustomization.yaml
> >>       scheduler-sidecar-patch.yaml
> >>       STATUS
> >> ```
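
The layout sketched above lends itself to a mechanical check in CI or pre-commit. What follows is only a minimal Python sketch under stated assumptions, not the check used in the PoC PR: the function name `missing_overlay_files` is hypothetical, and the rule that every overlay directory needs exactly a kustomization.yaml and a STATUS file is inferred from the description above.

```python
from pathlib import Path

# Assumption: these two files are what "each overlay has" in the proposal.
REQUIRED_FILES = ("kustomization.yaml", "STATUS")


def missing_overlay_files(overlays_root: Path) -> dict[str, list[str]]:
    """Map each overlay directory name to the required files it lacks.

    Overlays that contain every required file are omitted from the result,
    so an empty dict means the structure check passed.
    """
    problems: dict[str, list[str]] = {}
    # Only subdirectories count as overlays; README.rst etc. are skipped.
    for overlay in sorted(p for p in overlays_root.iterdir() if p.is_dir()):
        missing = [name for name in REQUIRED_FILES if not (overlay / name).is_file()]
        if missing:
            problems[overlay.name] = missing
    return problems
```

A pre-commit hook could call `missing_overlay_files(Path("chart/kustomize-overlays"))` and fail when the returned dict is non-empty.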
> >>
> >>
> >>
> >> We will start with a PoC before agreeing on the broader rollout. HPA or
> >> KEDA covers the standalone addition pattern and would go first or second.
> >> Kerberos covers the post-render patch pattern and becomes the template for
> >> any future sidecar injection use case. We are putting together a first PoC
> >> now and will share it in this thread once it is in a shape worth looking
> >> at, so the discussion has something concrete to sit alongside the criteria
> >> below.
> >>
> >> *Lifecycle*
> >>
> >> The lifecycle mirrors how providers work, just on a smaller scale.
> >>
> >>    - A new overlay is proposed via a PR and lands with STATUS: not-tested.
> >>    - The contributor follows up with a test at
> >>      chart/tests/kustomize/test_.py and flips STATUS to tested, either in
> >>      the same PR or in a focused follow-up. There can also be a smoke test
> >>      in CI that exercises the Kustomize overlay flow; that is a technical
> >>      detail of the process.
> >>    - An overlay is deprecated by setting deprecated: true in STATUS along
> >>      with a short message pointing to the replacement.
> >>    - Deprecated overlays stay around for one major chart version before
> >>      they are removed, so users always have a window to migrate.
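
To make the lifecycle rules above concrete, here is a minimal Python sketch of a STATUS parser. The file format is an assumption: the proposal does not define it, so this sketch supposes plain `key: value` lines with the hypothetical keys `status`, `deprecated`, and `message`. The names `OverlayStatus` and `parse_status` are likewise illustrative only.

```python
from dataclasses import dataclass

# The two states named in the lifecycle description.
ALLOWED_STATUSES = {"not-tested", "tested"}


@dataclass
class OverlayStatus:
    status: str
    deprecated: bool = False
    message: str = ""


def parse_status(text: str) -> OverlayStatus:
    """Parse an assumed 'key: value' STATUS file and enforce the lifecycle rules."""
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so messages may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        fields[key.strip().lower()] = value.strip()

    status = fields.get("status", "")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}, got {status!r}")

    deprecated = fields.get("deprecated", "false").lower() == "true"
    message = fields.get("message", "")
    # A deprecated overlay must point users at its replacement.
    if deprecated and not message:
        raise ValueError("a deprecated overlay needs a message pointing to its replacement")
    return OverlayStatus(status=status, deprecated=deprecated, message=message)
```

Under these assumptions, a freshly merged overlay would carry `status: not-tested`, and a deprecation would add `deprecated: true` plus a `message:` line.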
> >>
> >> CONTRIBUTING.rst in the overlays directory is the authoritative reference
> >> for all of this: the criteria, the exception process, status conventions,
> >> and the migration guide pattern all live there together.
> >>
> >> *Criteria for chart vs Kustomize*
> >>
> >> The criteria will live at chart/kustomize-overlays/CONTRIBUTING.rst.
> >>
> >> Belongs in the chart (all must be true):
> >>
> >>    - Required to run Airflow (scheduler, API server, dag-processor,
> >>      triggerer, workers)
> >>    - Removing it requires changes to Airflow's own configuration
> >>    - No external owner
> >>
> >> Belongs in Kustomize (any may be true):
> >>
> >>    - Can be expressed as a standalone Kubernetes resource without
> >>      modifying chart-rendered resources
> >>    - Environment-specific (authentication schemes, logging backends,
> >>      autoscaling controllers)
> >>    - Has an external owner (KEDA, Elasticsearch, any PostgreSQL
> >>      distribution)
> >>    - Requires CRDs that the chart does not install
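
The two checklists above amount to an "all of" rule and an "any of" rule. A minimal Python sketch of that logic follows; the `Component` class, its field names, and both function names are hypothetical, introduced only to illustrate the criteria, and the example attribute values are taken from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    required_to_run_airflow: bool
    removal_changes_airflow_config: bool
    has_external_owner: bool
    standalone_resource: bool = False
    environment_specific: bool = False
    needs_uninstalled_crds: bool = False


def belongs_in_chart(c: Component) -> bool:
    """Chart rule: every condition must hold."""
    return (
        c.required_to_run_airflow
        and c.removal_changes_airflow_config
        and not c.has_external_owner
    )


def belongs_in_kustomize(c: Component) -> bool:
    """Kustomize rule: a single matching condition is enough."""
    return any(
        (
            c.standalone_resource,
            c.environment_specific,
            c.has_external_owner,
            c.needs_uninstalled_crds,
        )
    )


scheduler = Component("scheduler", True, True, False)
keda = Component(
    "keda", False, False, True,
    standalone_resource=True, environment_specific=True, needs_uninstalled_crds=True,
)
```

Because the two rules are evaluated independently, a component could in principle satisfy both or neither; such cases would presumably go through the exception process mentioned earlier.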
> >>
> >> One invariant we want to keep is that the chart never removes a component
> >> without a working overlay already in place. Users should always have a
> >> migration path before anything disappears.
> >>
> >> *Thoughts welcome*
> >>
> >> The branching split is in place because we wanted the transition to 2.0
> >> to be smooth for users, with 1.2x continuing to ship in parallel. Sharing
> >> it here so the rest of the proposal sits in the right context.
> >>
> >> What I would love to hear thoughts on:
> >>
> >>    - Do the chart vs Kustomize criteria hold up against the deployments
> >>      you have run? Is there anything that feels off, missing, or too
> >>      strict?
> >>    - Is there anything in the confirmed component list you would push back
> >>      on, or anything you think should be added?
> >>
> >> If you would rather leave longer notes on the Confluence page or the
> >> Google Doc we have been working from, those are equally welcome. Links
> >> below.
> >>
> >> *References*
> >>
> >>    - Confluence:
> >>      https://cwiki.apache.org/confluence/display/AIRFLOW/Helm+Refurbish
> >>    - Discussion notes (Google Doc):
> >>      https://docs.google.com/document/d/1bZsyrG5kjsYd2rJRiN3kR613lO6JPEBd4ItsySneOMw/edit?usp=sharing
> >>    - Umbrella issue: https://github.com/apache/airflow/issues/64037
> >>
> >> Thanks,
> >>
> >> Bugra Ozturk
> >>
> >> Kind regards,
> >>
> >
>
