Hi all,
Thanks for the questions and feedback. Below is a consolidated summary of
the questions raised so far, along with clarifications.
------------------------------
1. How do long-running features fit into this faster cadence?
*Short answer:*
Long-running or incomplete features should start behind a feature flag or
be marked experimental/hidden until they are ready. Minor releases should
not accidentally expose unfinished work.
*Details:*
- Spark already uses feature flags and experimental annotations; this SPIP
formalizes using them more consistently.
- Experimental/hidden APIs may evolve across minor releases, although even
that is discouraged.
- Public APIs still follow the existing compatibility expectations; this
SPIP does not change Spark’s API guarantees.
- A long-running feature can land incrementally on master and the latest .x
branch as long as it is fully off by default and not user-visible.
- Once the feature is ready, it can be enabled or made public in a minor or
major release, depending on impact.
This ensures we don’t block development while also keeping minor releases
safe.
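To make the flag-gating point concrete, here is a minimal sketch of the
default-off pattern. The flag name is hypothetical and a plain dict stands
in for Spark's actual SQLConf machinery; this is an illustration of the
principle, not project code:

```python
# Minimal sketch of a default-off feature flag. The flag name is
# hypothetical; in Spark this would be a SQLConf entry, not a dict.
DEFAULTS = {"spark.sql.hypotheticalFeature.enabled": "false"}

def feature_enabled(conf: dict, key: str) -> bool:
    """A feature stays invisible until its flag is explicitly set to true."""
    return conf.get(key, DEFAULTS.get(key, "false")).lower() == "true"

# Unfinished work can ship in a minor release but stays dark by default:
assert not feature_enabled({}, "spark.sql.hypotheticalFeature.enabled")
# Only an explicit opt-in (or a later release flipping the default) enables it:
assert feature_enabled(
    {"spark.sql.hypotheticalFeature.enabled": "true"},
    "spark.sql.hypotheticalFeature.enabled",
)
```

Flipping the default from "false" to "true" is then the release-time
decision, made in a minor or major release depending on impact.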
------------------------------
2. What exactly are the new .x branches? Do master and branch-4.x contain
the same code?
*Short answer:*
.x branches (e.g., branch-4.x, branch-5.x) are long-lived stable
major-version branches used to produce all minor releases of that major
line.
The master branch remains the development branch for the next major version.
*Branching model under this SPIP:*
- *master:* development for the *next major release* (e.g., Spark 5.0)
- *branch-4.x:* produces 4.2, 4.3, 4.4, 4.5 (LTS), etc.
- *branch-5.x:* future line producing 5.1, 5.2, 5.3 (LTS), etc.
*Code flow:*
- Most non-behavior changes merge to *both master and the latest .x branch*.
- .x branches intentionally lag behind master: master accumulates
major-only changes (dependency bumps, refactoring, Java/Scala upgrades,
etc.).
- Minor releases are cut from the .x branches; major releases from master.
Yes, this requires some CI and branch adjustments, but I expect them to be
manageable.
------------------------------
3. Are dependency upgrades prohibited on long-lived branches?
Yes - that is a core part of making minor releases predictable and safe.
- Dependency upgrades happen only on master (and therefore appear only in
major releases).
- Minor releases from branch-4.x or branch-5.x freeze dependencies for the
entire major line.
- Exceptions: *critical CVEs* or unavoidable security issues, handled case
by case via dev-list discussion.
This is consistent with what several large projects I investigated do to
stabilize their minor releases.
------------------------------
4. Should Java/Python versions also be frozen for a major line?
*Policy-wise:*
Yes, the runtime compatibility of each major line should stay fixed unless
a breaking change is unavoidable.
But:
- The tooling used in CI does not need to freeze at an exact version number.
- We can continue testing against the latest patch versions of Java
17/Python 3.11 as long as:
  - Spark maintains compatibility with the baseline version stated in the
release docs, and
  - we do not require a new Java/Python minor version within the same major
line.
So:
- Runtime baseline: fixed for the major line
- CI patch versions: free to update
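For illustration, a CI matrix in this spirit might look like the fragment
below (a hypothetical sketch, not Spark's actual workflow): the Java/Python
baselines are pinned for the major line, while the setup actions resolve
the newest patch release at run time.

```yaml
# Hypothetical CI matrix sketch: baselines pinned, patch versions floating.
jobs:
  build:
    strategy:
      matrix:
        java: ['17']      # baseline fixed for the 4.x line
        python: ['3.11']  # setup-python resolves the newest 3.11.x patch
    steps:
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: ${{ matrix.java }}
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
```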
------------------------------
5. What about the impact on current releases (4.1.0, 4.2.0, existing LTS
commitments)?
Absolutely agreed - this SPIP does not retroactively modify any existing
commitments.
- Spark 3.5.x and 4.1.x lifecycles remain unchanged under the current
policy.
- Spark 4.2.x must also follow the current policy because development has
already progressed under that assumption.
- The first release this SPIP could apply to is probably 4.3.0, as Dongjoon
suggested.
The SPIP only affects future release lines.
------------------------------
6. How do we avoid merge conflicts if most PRs go to both master and .x?
This is already an issue today, but the proposal formalizes the following:
- Behavioral changes generally go to master only, or merge into .x with the
feature flag off.
- Non-behavior changes (docs, tests, small improvements) go to both master
and .x.
- Dependency upgrades and refactoring touch only master, which reduces the
conflict surface.
This should reduce the number of conflicts relative to today, not increase
them.
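As a sketch of how mechanical this routing is (the change-type labels and
branch names here are assumptions for illustration, not project tooling),
the policy could be expressed as:

```python
# Illustrative routing rule for the merge policy above. Change-type
# labels and branch names are assumptions made for this example.
def merge_targets(change_type: str, latest_line: str = "branch-4.x") -> list:
    """Return the branches a PR should land on under the proposed policy."""
    if change_type in {"docs", "tests", "improvement", "bugfix"}:
        # Non-behavior changes land on master and the latest .x branch.
        return ["master", latest_line]
    if change_type == "behavior-change":
        # Behavior changes go to master only (or to .x with the flag off).
        return ["master"]
    if change_type in {"dependency-upgrade", "refactoring"}:
        # Major-only changes stay on master to shrink the conflict surface.
        return ["master"]
    raise ValueError(f"unknown change type: {change_type}")
```

Because dependency upgrades and refactoring never target .x, the diffs
between master and .x stay concentrated in areas that rarely conflict with
the routine fixes flowing to both.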
------------------------------
7. Could we avoid .x branches entirely and keep cutting from master?
That approach has two practical issues:
1. Master needs a long window to land breaking changes (e.g., Java/Scala
upgrades). A limited 3-month “break window” is often not enough for major
ecosystem work.
2. It would be harder to keep the promises made by the policies this SPIP
defines.
The .x branches solve both by cleanly separating:
- fast, safe minor releases
- large breaking changes toward the next major
So I think the current branching proposal remains preferable.
------------------------------
On Tue, 9 Dec 2025 at 13:19, Dongjoon Hyun <[email protected]> wrote:
> This is a good discussion. Thank you, Hyukjin.
>
> To prevent any negative confusion regarding the existing and ongoing
> releases (including the current votes on RC3, RC4, and future RCs of Spark
> 4.1.0), I want to clarify three simple principles that we must uphold:
>
> 1. This discussion must not impact any existing community release
> commitments. For example, version 3.5.0 was released on September 13th,
> 2023, and will be maintained for 31 months until April 12th, 2026.
>
> 2. Similarly, this discussion should not affect the lifecycle of Apache
> Spark 4.1.0, which has already reached RC2 status. This means Apache Spark
> 4.1.0 must be released strictly under the existing versioning policy. In
> short, Apache Spark 4.1.0 will be maintained for 18 months.
>
> 3. For Apache Spark 4.2.0, any merged commit on the main branch should not
> be reverted due to this SPIP. For instance, Apache Spark 4.2.0 already
> includes Scala 2.13.18.
> [SPARK-54645][BUILD] Upgrade Scala to 2.13.18
>
> I believe 4.3.0 could be the first candidate version for this SPIP.
>
> Thank you,
> Dongjoon Hyun as the release manager of Apache Spark 4.2.0.
>
>
> On Mon, Dec 8, 2025 at 7:27 PM Yang Jie <[email protected]> wrote:
>
>> On the other hand, regarding long-term maintenance branches, such as
>> branch-4.x, under the new version release policy, is it necessary to fix
>> the versions of Java/Python? For instance, Java `17.0.16` or Python 3.11.9.
>> Previously, even for branch-4.1, GitHub Actions tested the latest versions
>> of Java 17 and Python 3.11 rather than being fixed to a specific version.
>>
>> On 2025/12/09 03:13:58 Yang Jie wrote:
>> > So, under the new version release policy, in principle, long-term
>> maintenance branches, such as branch-4.x, are not allowed to have
>> dependency changes. For example, if the Scala version 2.13.17 is used when
>> branch-4.x is created, then throughout the entire lifecycle of branch-4.x,
>> it should not, in principle, be updated to a newer Scala version. Is my
>> understanding correct?
>> >
>> > On 2025/12/09 01:30:43 Wenchen Fan wrote:
>> > > I like this idea, but we should call out the impact to Spark
>> development.
>> > > IIUC, we need the following changes:
>> > >
>> > > - If a feature is known to take a long time to complete, we should
>> add a
>> > > feature flag at the beginning, to avoid releasing a half-baked
>> feature with
>> > > this faster relase cadence.
>> > > - We have defined behavior change in
>> > > https://spark.apache.org/contributing.html. According to it, new
>> APIs,
>> > > new features, bug fixes, etc. are all behavior changes because
>> they are
>> > > user-visible. We should be more clear about what can go to minor
>> releases.
>> > > I think we can follow the same standard for writing migration
>> guide: if the
>> > > behavior change needs user action.
>> > > - We will merge most PRs to both master and a new .x branch, and we
>> > > should avoid merge conflicts as possible as we can. For example,
>> behavior
>> > > changes should also be merged into the .x branch, with flag off.
>> > >
>> > > An alternative idea is to still cut next release from the master
>> branch,
>> > > but we only allow a 3-month window per year (right before the next
>> major
>> > > release) to make breaking changes in the master branch, such as
>> dependency
>> > > upgrade, Java/Scala version upgrade, etc. Of cource we can still have
>> > > exceptions with voting. The drawback is that sometimes 3 months is not
>> > > sufficient to make major upgrades, and then the .x branch will be
>> useful.
>> > >
>> > > On Tue, Dec 9, 2025 at 9:02 AM L. C. Hsieh <[email protected]> wrote:
>> > >
>> > > > I see.
>> > > >
>> > > > In the "Code Merging Principle" section,
>> > > > > For non-behavior changes, always merge to master and the latest .x
>> > > > branch. This change will be released with the next minor release
>> > > >
>> > > > Is this .x branch meaning a branch of a major branch like
>> branch-4.x?
>> > > >
>> > > > Also, looks like master and the latest .x branch basically have the
>> > > > same codebase?
>> > > >
>> > > > On Mon, Dec 8, 2025 at 4:37 PM Hyukjin Kwon <[email protected]>
>> wrote:
>> > > > >
>> > > > > I actually intentionally disabled the commenter access so the
>> discussion
>> > > > can happen here :-). Otherwise, we would end up with multiple
>> places to
>> > > > discuss this.
>> > > > >
>> > > > > On Tue, 9 Dec 2025 at 09:33, L. C. Hsieh <[email protected]>
>> wrote:
>> > > > >>
>> > > > >> Can you open comment access to the google doc?
>> > > > >> So it will be easier to ask questions directly on the SPIP doc.
>> > > > >>
>> > > > >> On Sun, Dec 7, 2025 at 1:53 PM Hyukjin Kwon <
>> [email protected]>
>> > > > wrote:
>> > > > >> >
>> > > > >> > Hi all,
>> > > > >> >
>> > > > >> > I would like to start a discussion on accelerating the Apache
>> Spark
>> > > > release cadence. Over the past four months, we have been running
>> preview
>> > > > releases, and the process has been smooth and effective. As
>> mentioned in
>> > > > the preview release discussion thread, I’d now like to extend this
>> approach
>> > > > to official releases.
>> > > > >> >
>> > > > >> > During this period, I also looked into how other large
>> projects, such
>> > > > as Kubernetes and Python, manage their release timelines. Based on
>> that
>> > > > research and our own recent experience, I’ve drafted a proposal for
>> an
>> > > > updated Apache Spark release plan.
>> > > > >> >
>> > > > >> > TL;DR:
>> > > > >> >
>> > > > >> > Introduce a predictable release schedule: annual major
>> releases and
>> > > > quarterly minor releases, so users can benefit from new features
>> earlier.
>> > > > >> > With a faster cadence for minor releases, we should take a more
>> > > > conservative approach toward behavior changes in minor versions,
>> while
>> > > > still allowing new features and improvements.
>> > > > >> >
>> > > > >> > I’d love to hear your thoughts and feedback.
>> > > > >> >
>> > > > >> > More details can be found in SPIP: Accelerating Apache Spark
>> Release
>> > > > Cadence
>> > > > >> >
>> > > > >> > Thanks!
>> > > > >>
>> > > > >>
>> ---------------------------------------------------------------------
>> > > > >> To unsubscribe e-mail: [email protected]
>> > > > >>
>> > > >
>> > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>>
>>
>>