Thanks for the proposal. The direction is awesome - we have such a long interval between minor releases and this would help us address some of the issues from the long release cadence.
I'd like to understand a couple of things.

1. Since we "release" the preview, we will go through the VOTE process. What is the expected overhead of doing this monthly? (Maybe this is coupled with the questions below.)

2. What quality do we foresee for the preview release? Do we expect to have -1 votes and address them for each monthly release if there is an unresolved correctness issue or regression? Or will we have very loose verification criteria and leave the preview to early adopters, with risks? Is the stability of the preview release based on the fact that existing unit tests are passing, and we are mostly verifying the signatures of artifacts?

3. Do we allow breaking changes across previews, say, as long as we don't introduce breaking changes in the new minor version it should be fine?

4. Does this drive changes to our development in other directions, e.g. a dev branch or so? For example, transformWithState had been marked as private during development since we knew it would take a lot of time. Or will we allow preview releases to contain incomplete features? How do we prevent users from accessing an incomplete feature, or do we intend for users to try it out while it's still in development?

On Wed, Jul 2, 2025 at 11:25 AM Hyukjin Kwon <gurwls...@apache.org> wrote:

> Let me start the vote tmr if we're all good with this :-).
>
> On Wed, 2 Jul 2025 at 10:12, Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>
>> Having monthly preview releases for Spark is going to be huge for projects like Iceberg and Delta.
>>
>> - Anton
>>
>> On Tue, Jul 1, 2025 at 5:43 PM Dongjoon Hyun <dongj...@apache.org> wrote:
>>
>>> Thank you for the clarification, Hyukjin. Also, thank you for sharing your direction, DB.
>>>
>>> I agree with you folks that the AS-IS scope of SPIP is a good start.
>>>
>>> +1 for the SPIP because `4.1.0-previewX` itself is actually very helpful already during developing Spark subprojects like "Spark Connect for Swift" and "Spark K8s Operator".
>>> :-)
>>>
>>> Thank you again.
>>>
>>> Dongjoon.
>>>
>>> On 2025/07/02 00:31:10 Hyukjin Kwon wrote:
>>> > Hi Dongjoon,
>>> >
>>> > Thanks a lot for your detailed feedback and great questions!
>>> > Let me clarify my current proposal and thoughts:
>>> >
>>> > 1. Regarding Spark 5.0 schedule
>>> > At the moment, I don’t have a concrete Spark 5.0 schedule in mind.
>>> > I included the stable major releases in the Final Success criteria mainly to set a practical milestone to complete the automation work and fully transition to automated official releases.
>>> > I don't intend to set the next major release timeline in this SPIP.
>>> >
>>> > 2. Lowering the bar for preview releases
>>> > In short, yeah. I expect the bar for preview releases to be lower compared to official releases, given that these previews are primarily for early testing and feedback.
>>> > That said, if the community raises concerns during the vote and we end up with multiple RCs, that’s totally fine. In such cases, we could even skip the next month's preview if needed.
>>> > My intention is not to strictly enforce monthly previews but to provide regular opportunities for testing, while keeping the process low-pressure for the community.
>>> >
>>> > 3. Scope of monthly previews (first minor versions only)
>>> > Yup. This proposal is only for previews of the next minor version from the master branch. For example: 4.1.0-preview1, 4.1.0-preview2, ..., until we cut the real 4.1.0 release.
>>> > Once 4.1.0 is out, previews would move to 4.2.0-preview1, and so on.
>>> > There will be no 4.0.1-preview1 style releases under this proposal.
>>> >
>>> > 4. Official releases (e.g., 4.0.1, 3.5.7)
>>> > For now, this SPIP does not target automating or introducing monthly maintenance releases like 4.0.1.
>>> > But yeah, that's my final goal actually.
>>> > The automated maintenance releases are where I want to go next, after proving the automation works reliably via previews.
>>> > There is actually some more work to be done to get to no manual steps at all.
>>> >
>>> >
>>> > *TL;DR*: this is a step before automating the official releases (it's not tied to the official releases yet, to be conservative) + providing users with early access to the latest dev Spark build.
>>> >
>>> > On Wed, 2 Jul 2025 at 09:27, DB Tsai <dbt...@dbtsai.com> wrote:
>>> >
>>> > > Thank you, Hyukjin, for driving the SPIP and for your work on the release automation infrastructure - it’s a huge step forward.
>>> > >
>>> > > I’ve been thinking about this topic quite a bit since the Spark 4.0 release. While Spark continues to deliver meaningful improvements in every release and enjoys active community contributions, there’s a lingering perception that the project is mature but not evolving quickly. I feel this perception is largely due to the long gap between major versions - it’s been five years between Spark 3.0 and 4.0 - which has understandably caused some frustration among both contributors and users.
>>> > >
>>> > > Now, with release automation and monthly preview builds proposed in this SPIP, we have a real opportunity to change that. As Dongjoon suggested, setting up a regular maintenance release cadence - perhaps bi-monthly instead of monthly - could strike the right balance and make these builds more viable for production environments.
>>> > >
>>> > > If this model proves successful, we could move toward an even faster major release cadence and designate one LTS version annually, with extended backport support.
>>> > >
>>> > > Benefits for the OSS Community:
>>> > >
>>> > > - Faster time-to-production for new features
>>> > > - Stronger contributor engagement
>>> > > - Quicker community feedback cycles
>>> > > - Easier debugging and testing through smaller, incremental changes
>>> > >
>>> > >
>>> > > Thanks,
>>> > >
>>> > > DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1
>>> > >
>>> > > On Jul 1, 2025, at 2:17 PM, Dongjoon Hyun <dongj...@apache.org> wrote:
>>> > >
>>> > > Thank you so much for the suggestion and achieving the automated infra, Hyukjin.
>>> > >
>>> > > I have a few questions.
>>> > >
>>> > > 1. Since the SPIP suggests Apache Spark 5.0 ("Stable major releases") as the "Q8. Final Success" criteria, I'm wondering if you have some schedule in mind for Spark 5.0 in the next 2 years?
>>> > >
>>> > > 2. Are we going to lower the bar for the monthly preview releases? Specifically, I'm wondering if the Preview-RC1 is supposed to always pass because it's a preview release? As we know, that has not been the case until now. For example, we had three RCs for `4.0.0-preview1` like "[VOTE] SPARK 4.0.0-preview1 (RC3)".
>>> > >
>>> > > 3. Is the SPIP proposing monthly previews for only the FIRST MINOR versions like Spark 4.1.0? For example, 4.1.0-preview1 and 4.2.0-preview1? There is no `4.0.1-preview1`?
>>> > >
>>> > > 4. Although there was an automated test email for 3.5.7, the SPIP is not aiming at maintenance version releases like 4.0.1 and 3.5.7?
>>> > >
>>> > > Initially, I thought you were going to propose (4) "Automated Monthly Maintenance Release".
>>> > >
>>> > > For me, (4) is more beneficial than this SPIP because
>>> > > - We spend the same community effort during voting (including at least 3 PMC votes) for (4) and the SPIP.
>>> > > - (4) has the real benefit because users can use it in production, while the SPIP doesn't offer that.
>>> > >
>>> > > Technically, it could be a little weird if Apache Spark community releases only "4.1.0-preview1", ..., "4.1.0-previewX" without delivering the actual maintenance versions like `4.0.1`, ..., `4.0.2`.
>>> > >
>>> > > In short, "Automated Monthly Maintenance Release" might be the prerequisite for "Monthly Preview Release". What do you think about that? Can we extend your SPIP in this direction?
>>> > >
>>> > > Thanks,
>>> > > Dongjoon.
>>> > >
>>> > > On 2025/06/30 23:34:54 Hyukjin Kwon wrote:
>>> > >
>>> > > Hi all,
>>> > >
>>> > > I would like to propose a monthly preview for our dev branch, e.g., Spark 4.1.0 preview1 ... previewN.
>>> > >
>>> > > Per https://issues.apache.org/jira/browse/SPARK-52176, we have minimized the manual work so I think it's realistic to propose this.
>>> > >
>>> > > Couple of notes:
>>> > > - The manual steps it requires would be to run GitHub Actions twice for RC and publishing, and summarizing the vote result. There IS a way to even automate this but it needs more work to comply with ASF policy. I would like to stick to this minimal manual work for now.
>>> > > - For now, I would like to volunteer to be responsible for the preview releases and incrementally improve our release policy guidelines (https://spark.apache.org/release-process.html) as well once this SPIP passes.
>>> > > - The individual release would be, I suspect, about the first week in each month but I would like to avoid setting the explicit date in the SPIP so it makes us less pressured.
>>> > >
>>> > > JIRA: https://issues.apache.org/jira/browse/SPARK-52625
>>> > > SPIP: https://docs.google.com/document/d/1ysJ16z_NUfIdsYqq1Qq7k8htmMWFpo8kXqX-8lGzCGc/edit?tab=t.0#heading=h.89yty49abp67
>>> > >
>>> > > ---------------------------------------------------------------------
>>> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org