Just a quick heads-up: I don't intend to rush this. I'll leave this discussion open until late next week, so please let me know if you guys have any concerns or feedback.
On Wed, Dec 10, 2025 at 6:51 AM Hyukjin Kwon <[email protected]> wrote:
> Hi all,
>
> Thanks for the questions and feedback. Below is a consolidated summary of the questions raised so far, along with clarifications.
>
> ------------------------------
> 1. How do long-running features fit into this faster cadence?
>
> *Short answer:*
> Long-running or incomplete features should start behind a feature flag or be marked experimental/hidden until they are ready. Minor releases should not accidentally expose unfinished work.
>
> *Details:*
>
> - Spark already uses feature flags and experimental annotations; this SPIP formalizes that we should use them more consistently.
> - Experimental/hidden APIs may evolve across minor releases although that is still discouraged.
>   Public APIs still follow the existing compatibility expectations - this SPIP does not change Spark’s API guarantees.
> - A long-running feature can land incrementally on master and the latest .x branch as long as it is fully off by default and not user-visible.
> - Once the feature is ready, it can be enabled or made public in a minor or major release depending on impact.
>
> This ensures we don’t block development while also keeping minor releases safe.
> ------------------------------
> 2. What exactly are the new .x branches? Do master and branch-4.x contain the same code?
>
> *Short answer:*
> .x branches (e.g., branch-4.x, branch-5.x) are long-lived stable major-version branches used to produce all minor releases of that major line.
> master branch remains the development branch for the next major version.
>
> *Branching model under this SPIP:*
>
> - *master:* development for the *next major release* (e.g., Spark 5.0)
> - *branch-4.x:* produces 4.2, 4.3, 4.4, 4.5 (LTS), etc.
> - *branch-5.x:* future line producing 5.1, 5.2, 5.3 (LTS), etc.
>
> *Code flow:*
>
> - Most non-behavior changes can merge to *both master and the latest .x branch*.
> - .x branches intentionally lag behind master - master will accumulate major-only changes (dependency bumps, refactoring, Java/Scala upgrades, etc.).
> - Minor releases are cut from .x, major releases from master.
>
> Yes, this requires small CI and branch adjustments, but I expect that it won't be so difficult.
> ------------------------------
> 3. Are dependency upgrades prohibited on long-lived branches?
>
> Yes - that is a core part of making minor releases predictable and safe.
>
> - Dependency upgrades only happen on master (and therefore only in major releases).
> - Minor releases from branch-4.x or branch-5.x should freeze dependencies for the entire major line.
> - Exceptions: *critical CVEs* or unavoidable security issues, handled case by case via dev discussion.
>
> This is consistent with what several large projects I investigated do to stabilize minor releases.
> ------------------------------
> 4. Should Java/Python versions also be frozen for a major line?
>
> *Policy-wise:*
> Yes, the runtime compatibility of each major line should stay fixed unless a breaking change is unavoidable.
>
> But:
>
> - The tooling used in CI does not need to freeze at an exact version number.
> - We can continue testing against the latest patch versions of Java 17/Python 3.11 as long as:
>   - Spark maintains compatibility with the baseline version stated in the release docs.
>   - We do not require a new Java/Python minor version within the same major line.
>
> So:
>
> - Runtime baseline: fixed for the major line
> - CI patch versions: free to update
>
> ------------------------------
> 5. What about the impact on current releases (4.1.0, 4.2.0, existing LTS commitments)?
>
> Absolutely agreed - this SPIP does not retroactively modify any existing commitments.
>
> - Spark 3.5.x and 4.1.x lifecycles remain unchanged under the current policy.
> - Spark 4.2.x must also follow the current policy because development has already progressed under that assumption.
> - The release this SPIP could apply to is probably 4.3.0, as Dongjoon suggested.
>
> The SPIP only affects future release lines.
> ------------------------------
> 6. How do we avoid merge conflicts if most PRs go to both master and .x?
>
> This is already an issue today, but the proposal formalizes the following:
>
> - Behavioral changes should generally go to master only, or be merged into .x with the feature flag off.
> - Non-behavior changes (docs, tests, small improvements) go to both master and .x.
> - Dependency upgrades and refactoring touch only master, which reduces the conflict surface.
>
> This should reduce the number of conflicts relative to today, not increase them.
> ------------------------------
> 7. Could we avoid .x branches entirely and keep cutting from master?
>
> That approach has two practical issues:
>
> 1. Master needs a long window to land breaking changes (e.g., Java/Scala upgrades).
>    A limited 3-month “break window” is often not enough for major ecosystem work.
> 2. It would be harder to keep the promises of the policies this SPIP defines.
>
> The .x branches solve both by cleanly separating:
>
> - fast, safe minors
> - large breaking changes toward next major
>
> So I think the current branching proposal remains preferable.
> ------------------------------
>
> On Tue, 9 Dec 2025 at 13:19, Dongjoon Hyun <[email protected]> wrote:
>
>> This is a good discussion. Thank you, Hyukjin.
>>
>> To prevent any negative confusion regarding the existing and ongoing releases (including the current votes on RC3, RC4, and future RCs of Spark 4.1.0), I want to clarify three simple principles that we must uphold:
>>
>> 1. This discussion must not impact any existing community release commitments.
>> For example, version 3.5.0 was released on September 13th, 2023, and will be maintained for 31 months until April 12th, 2026.
>>
>> 2. Similarly, this discussion should not affect the lifecycle of Apache Spark 4.1.0, which has already reached RC2 status. This means Apache Spark 4.1.0 must be released strictly under the existing versioning policy. In short, Apache Spark 4.1.0 will be maintained for 18 months.
>>
>> 3. For Apache Spark 4.2.0, any merged commit on the main branch should not be reverted due to this SPIP. For instance, Apache Spark 4.2.0 already includes Scala 2.13.18.
>> [SPARK-54645][BUILD] Upgrade Scala to 2.13.18
>>
>> I believe 4.3.0 could be the first candidate version for this SPIP.
>>
>> Thank you,
>> Dongjoon Hyun as the release manager of Apache Spark 4.2.0.
>>
>>
>> On Mon, Dec 8, 2025 at 7:27 PM Yang Jie <[email protected]> wrote:
>>
>>> On the other hand, regarding long-term maintenance branches, such as branch-4.x, under the new version release policy, is it necessary to fix the versions of Java/Python? For instance, Java `17.0.16` or Python 3.11.9. Previously, even for branch-4.1, GitHub Actions tested the latest versions of Java 17 and Python 3.11 rather than being fixed to a specific version.
>>>
>>> On 2025/12/09 03:13:58 Yang Jie wrote:
>>> > So, under the new version release policy, in principle, long-term maintenance branches, such as branch-4.x, are not allowed to have dependency changes. For example, if the Scala version 2.13.17 is used when branch-4.x is created, then throughout the entire lifecycle of branch-4.x, it should not, in principle, be updated to a newer Scala version. Is my understanding correct?
>>> >
>>> > On 2025/12/09 01:30:43 Wenchen Fan wrote:
>>> > > I like this idea, but we should call out the impact to Spark development.
>>> > > IIUC, we need the following changes:
>>> > >
>>> > > - If a feature is known to take a long time to complete, we should add a feature flag at the beginning, to avoid releasing a half-baked feature with this faster release cadence.
>>> > > - We have defined behavior change in https://spark.apache.org/contributing.html. According to it, new APIs, new features, bug fixes, etc. are all behavior changes because they are user-visible. We should be clearer about what can go to minor releases. I think we can follow the same standard as for writing the migration guide: whether the behavior change needs user action.
>>> > > - We will merge most PRs to both master and a new .x branch, and we should avoid merge conflicts as much as possible. For example, behavior changes should also be merged into the .x branch, with the flag off.
>>> > >
>>> > > An alternative idea is to still cut the next release from the master branch, but only allow a 3-month window per year (right before the next major release) to make breaking changes in the master branch, such as dependency upgrades, Java/Scala version upgrades, etc. Of course we can still have exceptions with voting. The drawback is that sometimes 3 months is not sufficient to make major upgrades, and then the .x branch will be useful.
>>> > >
>>> > > On Tue, Dec 9, 2025 at 9:02 AM L. C. Hsieh <[email protected]> wrote:
>>> > >
>>> > > > I see.
>>> > > >
>>> > > > In the "Code Merging Principle" section,
>>> > > > > For non-behavior changes, always merge to master and the latest .x branch. This change will be released with the next minor release
>>> > > >
>>> > > > Does this .x branch mean a major-version branch like branch-4.x?
>>> > > >
>>> > > > Also, it looks like master and the latest .x branch basically have the same codebase?
>>> > > >
>>> > > > On Mon, Dec 8, 2025 at 4:37 PM Hyukjin Kwon <[email protected]> wrote:
>>> > > > >
>>> > > > > I actually intentionally disabled the commenter access so the discussion can happen here :-). Otherwise, we would end up with multiple places to discuss this.
>>> > > > >
>>> > > > > On Tue, 9 Dec 2025 at 09:33, L. C. Hsieh <[email protected]> wrote:
>>> > > > >>
>>> > > > >> Can you open comment access to the google doc?
>>> > > > >> So it will be easier to ask questions directly on the SPIP doc.
>>> > > > >>
>>> > > > >> On Sun, Dec 7, 2025 at 1:53 PM Hyukjin Kwon <[email protected]> wrote:
>>> > > > >> >
>>> > > > >> > Hi all,
>>> > > > >> >
>>> > > > >> > I would like to start a discussion on accelerating the Apache Spark release cadence. Over the past four months, we have been running preview releases, and the process has been smooth and effective. As mentioned in the preview release discussion thread, I’d now like to extend this approach to official releases.
>>> > > > >> >
>>> > > > >> > During this period, I also looked into how other large projects, such as Kubernetes and Python, manage their release timelines. Based on that research and our own recent experience, I’ve drafted a proposal for an updated Apache Spark release plan.
>>> > > > >> >
>>> > > > >> > TL;DR:
>>> > > > >> >
>>> > > > >> > Introduce a predictable release schedule: annual major releases and quarterly minor releases, so users can benefit from new features earlier.
>>> > > > >> > With a faster cadence for minor releases, we should take a more conservative approach toward behavior changes in minor versions, while still allowing new features and improvements.
>>> > > > >> >
>>> > > > >> > I’d love to hear your thoughts and feedback.
>>> > > > >> >
>>> > > > >> > More details can be found in SPIP: Accelerating Apache Spark Release Cadence
>>> > > > >> >
>>> > > > >> > Thanks!
>>> > > > >>
>>> > > > >> ---------------------------------------------------------------------
>>> > > > >> To unsubscribe e-mail: [email protected]
