Dongjoon, it is your responsibility to clarify your vote position, since the vote is stalled while some people still claim your vote is a veto. If you really agree that I gained consensus in the proper way, and your vote is just for the historical record, let's not waste more time: please explicitly cast -0.99.
If you say I misunderstood your reply and you are still casting a veto, I'm happy to hear the evidence based on the history. We only talked on GitHub PRs and the mailing list, so no discussion happened outside that infrastructure. I do not count any discussion happening in private@, as private@ is not meant to be used for discussion which could have been done in public.

On Sat, Mar 15, 2025 at 11:21 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> Small correction, the link was missing:
>
> 4. I claimed I wanted to proceed with migration logic for branch-4.0 PR, and hadn't got any feedback except being told to wait for Spark 3.5.5 (link <https://github.com/apache/spark/pull/49983#pullrequestreview-2621947671>). If you weren't open to my proposal, you should have just said "we were already decided" and you had to give the evidence. I haven't heard any, so I had to initiate DISCUSS.
>
> On Sat, Mar 15, 2025 at 11:18 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
>> > according to the ASF process, the Apache Spark community made the conclusion to unblock the Apache Spark 4.0.0 release with the AS-IS code with the improved Spark 4.0 migration guide because I provided a technical justification for my vote via the concrete alternative based on the existing Spark 3.5.5, AS-IS code base, and the suggested better migration guide way in order to eliminate the affected streaming queries.
>>
>> I can always be corrected if you give the evidence. Let's stop "just" talking. I believe we are seeing quite different things and our memory is quite opposite. "History will tell us."
>>
>> I am trying to understand where the miscommunication came from. Some clarification:
>>
>> 1. I believe I have said I do not agree with just removing the config in master/4.0 and I expected a follow-up, which is the migration logic.
>> (link <https://github.com/apache/spark/pull/49897#issuecomment-2652486115>) I admit this is a bit unclear, but I made my voice clear multiple times; otherwise I would never have had a migration logic PR for master/4.0.
>> 2. I believe I have said my intention is to land the migration logic to 4.0.x and arguably longer (link <https://lists.apache.org/thread/q24vonqhvqh11ghd488rctsm89zvmpqd>). I think there were people who wanted to remove the vendor name in any way, but arguably it just ended with an open question, never reaching consensus. People expressed concerns, but nothing was concluded except that we agreed to proceed with Spark 3.5.5. We never reached a consensus on how to deal with it in Spark 4.0.0+ in that discussion thread, especially about migration logic.
>> 3. The VOTE for removal of the config clearly stated that it is only for 3.5. (link <https://lists.apache.org/thread/6nn76olr65b8zfgzdcbtr9f6o98451o5>)
>> 4. I claimed I wanted to proceed with migration logic for branch-4.0 PR, and hadn't got any feedback except being told to wait for Spark 3.5.5 (link). If you weren't open to my proposal, you should have just said "we were already decided" and you had to give the evidence. I haven't heard any, so I had to initiate DISCUSS.
>> 5. We all know about DISCUSS and VOTE so I won't repeat.
>>
>> I have strong evidence that you were aware of the fact we never agreed on the behavior for Spark 4.0.0, and you said my proposal is "technically correct", so we had never debated about "technical objection", but debated about "behavior".
>> https://github.com/apache/spark/pull/49983#issuecomment-2676531485
>>
>> Can you please explain why you said my proposal is "technically correct" and here you did a vote which required "technical objection"? Have you changed your mind?
>>
>> Overall, when you say "the Apache Spark community made the conclusion to unblock the Apache Spark 4.0.0 release with the AS-IS code", I don't get who is "the Apache Spark community". Where can I see the DISCUSS and VOTE thread? Is it really that I am excluded from the list of the Apache Spark community, while arguably I am the only active maintainer of the module? Could you please enumerate who the Apache Spark community was at that time?
>>
>> Let's not talk based on memory. If we agreed about that, we should have a history. I am open to apologizing if I missed a critical discussion and vote. Your (and my) memory should never be used as evidence. Please, give the evidence.
>>
>> I'm also happy to hear about the other thread I have made. Thanks.
>>
>> On Sat, Mar 15, 2025 at 9:23 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>
>>> Apache Spark PMC always strongly recommends all 3.5 users to upgrade to the latest stable release via the official website. The main question seems quite different from the Apache Spark website. May I ask what is not safe about guiding Spark 3.5.4 users to 3.5.5, Jungtaek?
>>>
>>> > The main question was, "where is the evidence it's safe to force users to upgrade to Spark 3.5.5...
>>>
>>> For the following part, when the Apache Spark community made a mistake at Spark 2.4.2 release, we guided the users to upgrade to 2.4.3 immediately after recovering the default Scala version to 2.11.
>>>
>>> > to upgrade to Spark 3.5.5 before upgrading to Spark 4.0.0".
>>>
>>> 2019-04-23 https://spark.apache.org/releases/spark-release-2-4-2.html
>>> 2019-05-08 https://spark.apache.org/releases/spark-release-2-4-3.html
>>>
>>> In the same way, Apache Spark 3.5.5 was released and is ready to handle a mistake at Spark 3.5.4.
>>>
>>> 2025-02-27 https://spark.apache.org/releases/spark-release-3-5-5.html
>>>
>>> For the vote, the vote is a time-limited procedure to make a swift decision.
>>> That's the reason why you proposed the vote procedure and we agreed. There is no way to `block` the votes. The vote itself is already completed (including my -1).
>>>
>>> > you weren’t intended to “block” the vote
>>>
>>> I've been considering this as a part of the whole `spark.databricks.*` incident handling. In my interpretation, according to the ASF process, the Apache Spark community made the conclusion to unblock the Apache Spark 4.0.0 release with the AS-IS code with the improved Spark 4.0 migration guide because I provided a technical justification for my vote via the concrete alternative based on the existing Spark 3.5.5, AS-IS code base, and the suggested better migration guide way in order to eliminate the affected streaming queries.
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>> On Fri, Mar 14, 2025 at 3:23 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> That said, if I understand correctly, you weren’t intended to “block” the vote, right? You say you expected the vote to be finished.
>>>>
>>>> Could you please cast the vote to -0.x since some people view this as a code change vote, or clarify explicitly that you think this is not a code change vote? This will help resolve the concerns from some PMC members about how we should interpret the vote result clearly.
>>>>
>>>> Thanks!
>>>>
>>>> On Fri, Mar 14, 2025 at 5:33 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>
>>>>> Thank you all.
>>>>>
>>>>> The vote is finished in an intended way with the expected result. We had enough time to discuss and I have been sticking to my original technical justification from the beginning (including this).
>>>>>
>>>>> 1. Helping rename the conf via SPARK-51172 (by approving it)
>>>>> 2. Banning `spark.databricks.*` via SPARK-51173 (by adding `configName` Scalastyle rule)
>>>>> 3. Leading the discussion thread and reaching the agreement to release Spark 3.5.5 early
>>>>> 4. Releasing 3.5.5 as a release manager to provide a candidate migration path
>>>>> 5. Proposing to use the migration path
>>>>>
>>>>> This vote was Step 5. My technical point has always been aiming to recover the Apache Spark 4 codebase to the status before our mistake by containing the issue only in `branch-3.5` and providing the proposed narrow migration path. And, as mentioned already, that's the situation where we were during the vote at Apache Spark AS-IS branches. What all of us agree on is that the previous code base is okay. I didn't reply to Jungtaek's Apple comment intentionally because it's not a public Spark-vendor like Databricks. And, it's a product name of the popular consumer electronic devices like Intel/AMD/Graviton. In addition, I don't think we are going to add back `spark.databricks.*` because of the reason the customers ask for it. In the same way, this vote is one of the political decision making processes of Apache Spark PMC. We started this vote because we couldn't make a consensus.
>>>>>
>>>>> I believe I've been providing all my best to the Apache Spark community by actions and with valid technical clarification (without any modification during the process).
>>>>>
>>>>> Sincerely,
>>>>> Dongjoon
>>>>>
>>>>> On Thu, Mar 13, 2025 at 11:41 PM Mridul Muralidharan <mri...@gmail.com> wrote:
>>>>>
>>>>>> FWIW, I am +1 on the proposal (though I missed the vote on this!)
>>>>>>
>>>>>> Regards,
>>>>>> Mridul
>>>>>>
>>>>>> On Fri, Mar 14, 2025 at 1:31 AM Mridul Muralidharan <mri...@gmail.com> wrote:
>>>>>>
>>>>>>> I agree with Mark, imo this is a qualified veto.
>>>>>>> We should give Dongjoon the opportunity to give his clarification, if any.
>>>>>>>
>>>>>>> I do realize this delays the RC process, but this deserves to be looked into carefully.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mridul
>>>>>>>
>>>>>>> On Thu, Mar 13, 2025 at 9:35 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Absolutely not!
>>>>>>>>
>>>>>>>> This is clearly a vote on a code change, not on a procedural issue or a package release. The code change has been vetoed by a -1 vote by a qualified voter.
>>>>>>>>
>>>>>>>> On Thu, Mar 13, 2025 at 6:58 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > As I said, I'm concluding the VOTE since we meet the criteria (3 +1 binding, 1 -1 binding, and also +1s from non-binding).
>>>>>>>> >
>>>>>>>> > I don't consider -1 as a veto as I explained, as we should have multiple -1s if we go for VOTE with the current codebase. (+1 in this proposal is effectively -1 in another proposal.)
>>>>>>>> >
>>>>>>>> > The vote followed the Apache Voting Process with the type of "package release" (which we tend to use in dev@ for VOTE). I guess it could have also been done with "procedural issues" which is less strict, but then this fulfills both types of votes which should be OK.
>>>>>>>> >
>>>>>>>> > The current codebase is "accidentally" representing another proposal and it was never intended. I don't find a way to -1 the current codebase and make a different change that is bound to neither proposal, to be fair.
>>>>>>>> >
>>>>>>>> > I don't want to block the release because of the above. So, let's change the current codebase the way we discussed and voted here. Reverting this decision should require another VOTE.
>>>>>>>> >
>>>>>>>> > Thanks to everyone who voted!
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2025 at 4:54 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> Thanks to everyone who participated and voted!
>>>>>>>> >>
>>>>>>>> >> Now I can technically conclude the VOTE, but I'm willing to wait till US daytime tomorrow, to give some time for Dongjoon to revisit this.
>>>>>>>> >>
>>>>>>>> >> I'll conclude the vote around 6PM PST tomorrow regardless of his vote. It's ideal to see us have no -1, but having one -1 doesn't block this vote and we can move forward.
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 13, 2025 at 4:42 PM Yang Jie <yangji...@apache.org> wrote:
>>>>>>>> >>>
>>>>>>>> >>> I forgot to mention in my last reply: my stance is +1.
>>>>>>>> >>>
>>>>>>>> >>> Jie Yang
>>>>>>>> >>>
>>>>>>>> >>> On 2025/03/13 07:08:12 Russell Jurney wrote:
>>>>>>>> >>> > Sure, +1 non-binding.
>>>>>>>> >>> >
>>>>>>>> >>> > On Wed, Mar 12, 2025 at 11:18 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >>> >
>>>>>>>> >>> > > Russell,
>>>>>>>> >>> > >
>>>>>>>> >>> > > Of course, we also hear the voices of people who don't have binding votes. Personally I think it's more important than committers/PMC members' VOTE this time since we can be biased and be far from user experience.
>>>>>>>> >>> > >
>>>>>>>> >>> > > Could you please explicitly cast your vote, like +1 (non-binding)? You seem to agree with the proposal. Thanks!
>>>>>>>> >>> > >
>>>>>>>> >>> > > On Thu, Mar 13, 2025 at 3:15 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>> >>> > >
>>>>>>>> >>> > >> I'm just a lurker and aspiring contributor, but as a Spark user, upgrading twice is very confusing and would cause many or most users to fail to upgrade successfully to Spark 4 on a first go. That seems like a very bad user experience. I thought it was worthwhile stating this out loud.
>>>>>>>> >>> > >>
>>>>>>>> >>> > >> Russell
>>>>>>>> >>> > >>
>>>>>>>> >>> > >> On Wed, Mar 12, 2025 at 11:05 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>> >>> > >>
>>>>>>>> >>> > >>>> this vote is to allow streaming queries which had been ever run in Spark 3.5.4 to be upgraded with Spark 4.0.x, "without having to be upgraded with Spark 3.5.5+ in prior".
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> In the history of Apache Spark, have we ever required users to upgrade to the next maintenance release before moving to a new feature or major release?
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> Xiao
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> On Tue, Mar 11, 2025 at 9:08 AM Adam Binford <adam...@gmail.com> wrote:
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>>> +1 (non-binding)
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> It's a pretty in-the-weeds issue with how Structured Streaming works under the hood that's kinda hard to understand if you're not familiar with it. The migration logic doesn't mean users can still use the old config, it's purely behind the scenes to fix checkpoint metadata in streams created in 3.5.4. The 5 lines of code it takes to address a weird edge case for certain users that's already gone from master shouldn't be a huge deal.
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> On Tue, Mar 11, 2025 at 1:43 AM Yang Jie <yangji...@apache.org> wrote:
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>>> To Sean, you're right, I'm very sorry.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> From the perspective of compatibility and migratability, I think we should migrate this logic to 4.0.0 and keep it in the codebase for a longer time (or permanently), because we can't predict which version users of 3.5.4 will choose next.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> I don't want to discuss the so-called vendor issue.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> I withdraw my previous -1.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> Jie Yang.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> On 2025/03/11 04:42:25 Wenchen Fan wrote:
>>>>>>>> >>> > >>>>> > Guys, let’s be honest about what we’re discussing here.
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If this is a migration issue, why would we even need a vote? We’ve been consistently adding configurations to restore legacy behavior instead of removing them because we understand the challenges of upgrading Spark versions. Our goal has always been to make upgrades easier, even if it means carrying some technical debt. I don’t think we want to change that culture now.
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If the concern is about vendor names appearing in the codebase, then why is it a big deal this time when vendor names are already present elsewhere? If we’ve failed to follow a policy, let’s correct it, but can someone point to the specific policy we’re violating?
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If the vote is about adding migration logic to ease the upgrade from 3.5.4 to 4.0.0, then +1, why not?
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > Thanks,
>>>>>>>> >>> > >>>>> > Wenchen
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > On Mon, Mar 10, 2025 at 8:49 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > > Well said, Sean. Sorry I made you keep around here since it might not be clearly stated. My bad.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > Yang, how could we ever tolerate the fact there are "other" occurrences of vendor names in the codebase? Please go and search "databricks" in the codebase and be surprised.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > If we believe that having vendor names in the codebase will increase the occurrence of making mistakes, why didn't we have a discussion thread earlier to remove all occurrences altogether? This is super tricky because I can even start to argue we have "Apple" as a vendor name in Apache Spark codebase. I'm not saying we use "apple" in the test data. See `isMacOnAppleSilicon` in Utils. Is it unavoidable? No, `isMacOnMSeries` or `isMacOnSilicon` is enough.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > We really need to draw a line where we disallow vendor names on it - if it's the entire codebase, I don't really think it is realistic.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > This was really a mistake, and it was definitely not from referring to the existing codebase. Not having a vendor name does not change anything on the chance of encountering this issue again. If we really care, we should think about style checking, which is the only viable way to catch the mistake. Again, I'd argue we have to have a bunch of vendor names in that style check, not just the problematic vendor name.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > On Tue, Mar 11, 2025 at 12:17 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > >> Doesn't the migration code 'clear' the debt?
>>>>>>>> >>> > >>>>> > >> The proposal is not to continue to support the config.
>>>>>>>> >>> > >>>>> > >> I feel like people are not quite understanding the change, and objecting to something that doesn't exist.
>>>>>>>> >>> > >>>>> > >> It's a shame, as this seems like something not even worth discussing. I don't know why this triggered this much discussion. We have kept deprecated methods without blinking, which is in comparison much bigger.
>>>>>>>> >>> > >>>>> > >> Can we maybe ask you to review the actual change in question?
>>>>>>>> >>> > >>>>> > >>
>>>>>>>> >>> > >>>>> > >> On Mon, Mar 10, 2025, 10:02 PM Yang Jie <yangji...@apache.org> wrote:
>>>>>>>> >>> > >>>>> > >>
>>>>>>>> >>> > >>>>> > >>> -1
>>>>>>>> >>> > >>>>> > >>> Remove migration logic of incorrect `spark.databricks.*` configuration in Spark 4.0.0 because I think this configuration was initially introduced accidentally in Spark 3.5.4, lacking a clear design intent. Although the immediate maintenance cost of retaining this configuration currently seems limited, as subsequent versions iterate and user habits form, it may lead to the continuous accumulation of technical debt. When users come to view this configuration as one that can be relied on long-term, future removal may face greater resistance from users and could potentially become an entrenched and redundant configuration in the codebase. Therefore, promptly correcting this historically accidental configuration not only maintains the normativity of the Spark configuration system but also prevents unintended configurations from becoming de facto standards, thereby reducing long-term maintenance risks.
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>> Jie Yang
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>> On 2025/03/10 14:52:52 Dongjoon Hyun wrote:
>>>>>>>> >>> > >>>>> > >>> > -1 because there exists a feasible migration path for Apache Spark 3.5.4 via Apache Spark 3.5.5.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > It's obvious that this Databricks' mistake already causes a huge communication cost in the Apache Spark community and is suggesting a burden to enforce us to handle at least two more PRs at 4.0.0 and 4.1.0.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > Given that, I don't think
>>>>>>>> >>> > >>>>> > >>> > - This is an inevitable or
>>>>>>>> >>> > >>>>> > >>> > - This is 0 cost
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > Dongjoon.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > On 2025/03/10 12:46:16 Jungtaek Lim wrote:
>>>>>>>> >>> > >>>>> > >>> > > Starting from my +1 (non-binding).
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > In addition, I propose to retain migration logic till Spark 4.1.x and remove it in Spark 4.2.0.
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > On Mon, Mar 10, 2025 at 9:44 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > > Hi dev,
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Please vote to retain migration logic of incorrect `spark.databricks.*` configuration in Spark 4.0.x.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > - DISCUSSION: https://lists.apache.org/thread/xzk9729lsmo397crdtk14f74g8cyv4sr ([DISCUSS] Handling spark.databricks.* config being exposed in 3.5.4 in Spark 4.0.0+)
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Specifically, please review this post https://lists.apache.org/thread/xtq1kjhsl4ohfon78z3wld2hmfm78t9k which explains pros and cons about the proposal - proposal is about "Option 1".
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Simply speaking, this vote is to allow streaming queries which had been ever run in Spark 3.5.4 to be upgraded with Spark 4.0.x, "without having to be upgraded with Spark 3.5.5+ in prior". If the vote passes, we will help users to have a smooth upgrade from Spark 3.5.4 to Spark 4.0.x, which would be almost 1 year.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > The (only) cons in this option is having to retain the incorrect configuration name as "string" in the codebase a bit longer.
>>>>>>>> >>> > >>>>> > >>> > > > The code complexity of migration logic is arguably trivial. (link <https://github.com/apache/spark/blob/4231d58245251a34ae80a38ea4bbf7d720caa439/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala#L174-L183>)
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > This VOTE is for Spark 4.0.x, but if someone supports including migration logic to be longer than Spark 4.0.x, please cast +1 here and leave the desired last minor version of Spark to retain this migration logic.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > The vote is open for the next 72 hours and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > [ ] +1 Retain migration logic of incorrect `spark.databricks.*` configuration in Spark 4.0.x
>>>>>>>> >>> > >>>>> > >>> > > > [ ] -1 Remove migration logic of incorrect `spark.databricks.*` configuration in Spark 4.0.0 because...
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Thanks!
>>>>>>>> >>> > >>>>> > >>> > > > Jungtaek Lim (HeartSaVioR)
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > ---------------------------------------------------------------------
>>>>>>>> >>> > >>>>> > >>> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> --
>>>>>>>> >>> > >>>> Adam Binford