Dongjoon, it is your responsibility to clarify your vote position,
since the vote is stalled while some people still claim your vote is a veto. If
you really agree that I gained consensus in the proper way, and
your vote is really just for the historical record, let's not waste more time:
please explicitly cast -0.99.

If you say I misunderstood your reply and you are still casting a veto, I'm
happy to hear the evidence based on the history. We have only talked on GitHub
PRs and the mailing list, so no discussion happened outside that
infrastructure. I do not count any discussion happening in private@, as
private@ is not meant to be used for discussion which could have been done
in public.

On Sat, Mar 15, 2025 at 11:21 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> A link was missing in item 4:
>
> 4. I claimed I wanted to proceed with migration logic for branch-4.0 PR,
> and hadn't got any feedback except being told to wait for Spark 3.5.5 (
> link
> <https://github.com/apache/spark/pull/49983#pullrequestreview-2621947671>).
> If you weren't open to my proposal, you should have just said "we had
> already decided" and given the evidence. I haven't heard any, so
> I had to initiate DISCUSS.
>
>
>
> On Sat, Mar 15, 2025 at 11:18 PM Jungtaek Lim <
> kabhwan.opensou...@gmail.com> wrote:
>
>> > according to the ASF process, the Apache Spark community made the
>> conclusion to unblock the Apache Spark 4.0.0 release with the AS-IS code
>> with the improved Spark 4.0 migration guide because I provided a technical
>> justification for my vote via the concrete alternative based on the
>> existing Spark 3.5.5, AS-IS code base, and the suggested better migration
>> guide way in order to eliminate the affected streaming queries.
>>
>> I can always be corrected if you give the evidence. Let's stop "just"
>> talking. I believe we are seeing quite different things and our memory is
>> quite opposite. "History will tell us."
>>
>> I am trying to understand where the miscommunication came from. Some
>> clarification:
>>
>> 1. I believe I have said I do not agree with just removing the config in
>> master/4.0 and that I expected a follow-up, which is the migration logic. (link
>> <https://github.com/apache/spark/pull/49897#issuecomment-2652486115>) I
>> admit this is a bit unclear, but I made my position clear multiple times;
>> otherwise I would never have opened a migration logic PR for
>> master/4.0.
>> 2. I believe I have said my intention is to land the migration logic to
>> 4.0.x and arguably longer (link
>> <https://lists.apache.org/thread/q24vonqhvqh11ghd488rctsm89zvmpqd>).
>> I think there were people who wanted to remove the vendor name in
>> any way, but arguably it just ended as an open question, and consensus was
>> never reached. People expressed concerns, but nothing was concluded
>> except that we agreed to proceed with Spark 3.5.5. We never reached a consensus
>> on how to deal with it in Spark 4.0.0+ in that discussion thread,
>> especially about migration logic.
>> 3. The VOTE for removal of the config clearly stated that it was only for 3.5. (
>> link <https://lists.apache.org/thread/6nn76olr65b8zfgzdcbtr9f6o98451o5>)
>> 4. I claimed I wanted to proceed with migration logic for branch-4.0 PR,
>> and hadn't got any feedback except being told to wait for Spark 3.5.5
>> (link). If you weren't open to my proposal, you should have just said "we
>> had already decided" and given the evidence. I haven't heard
>> any, so I had to initiate DISCUSS.
>> 5. We all know about DISCUSS and VOTE so I wouldn't repeat.
>>
>> I have strong evidence that you were aware of the fact that we never agreed
>> on the behavior for Spark 4.0.0, and you said my proposal is "technically
>> correct", so we never debated a "technical objection"; we debated
>> the "behavior".
>> https://github.com/apache/spark/pull/49983#issuecomment-2676531485
>>
>> Can you please explain why you said my proposal is "technically correct"
>> and here you did a vote which required "technical objection"? Have you
>> changed your mind?
>>
>> Overall, when you say "the Apache Spark community made the conclusion to
>> unblock the Apache Spark 4.0.0 release with the AS-IS code", I don't get
>> who is "the Apache Spark community". Where can I see the DISCUSS and VOTE
>> thread? Is it really that I am excluded from the list of the Apache Spark
>> community, while arguably I am the only active maintainer of the module?
>> Could you please enumerate who the Apache Spark community was at that time?
>>
>> Let's not talk based on memory. If we agree about that, we should have a
>> history. I am open to apologizing if I missed a critical discussion and vote.
>> Your (and my) memory should never be used as evidence. Please, give the
>> evidence.
>>
>> I'm also happy to hear about the other thread I have made. Thanks.
>>
>> On Sat, Mar 15, 2025 at 9:23 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>> wrote:
>>
>>> Apache Spark PMC always strongly recommends all 3.5 users to upgrade to
>>> the latest stable release via the official website. The main question seems
>>> quite different from the Apache Spark website. May I ask what is not safe
>>> to guide Spark 3.5.4 users to 3.5.5, Jungtaek?
>>>
>>> > The main question was, "where is the evidence it's safe to force users
>>> to upgrade to Spark 3.5.5...
>>>
>>> For the following part, when the Apache Spark community made a mistake
>>> at Spark 2.4.2 release, we guided the users to upgrade to 2.4.3 immediately
>>> after recovering the default Scala version to 2.11.
>>>
>>> > to upgrade to Spark 3.5.5 before upgrading to Spark 4.0.0".
>>>
>>> 2019-04-23 https://spark.apache.org/releases/spark-release-2-4-2.html
>>> 2019-05-08 https://spark.apache.org/releases/spark-release-2-4-3.html
>>>
>>> In the same way, Apache Spark 3.5.5 was released and is ready to handle
>>> a mistake at Spark 3.5.4.
>>>
>>> 2025-02-27 https://spark.apache.org/releases/spark-release-3-5-5.html
>>>
>>> For the vote, the vote is a time-limited procedure to make a swift
>>> decision. That's the reason why you proposed the vote procedure and we
>>> agreed. There is no way to `block` the votes. The vote itself is already
>>> completed (including my -1).
>>>
>>> > you weren’t intended to “block” the vote
>>>
>>> I've been considering this as a part of the whole `spark.databricks.*`
>>> incident handling. In my interpretation, according to the ASF process, the
>>> Apache Spark community made the conclusion to unblock the Apache Spark
>>> 4.0.0 release with the AS-IS code with the improved Spark 4.0 migration
>>> guide because I provided a technical justification for my vote via the
>>> concrete alternative based on the existing Spark 3.5.5, AS-IS code base,
>>> and the suggested better migration guide way in order to eliminate the
>>> affected streaming queries.
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>>
>>>
>>> On Fri, Mar 14, 2025 at 3:23 AM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> That said, if I understand correctly, you weren’t intended to “block”
>>>> the vote, right? You say you expected the vote to be finished.
>>>>
>>>> Could you please cast the vote as -0.x, since some people view this as a
>>>> code change vote, or clarify explicitly that you think this is not a code
>>>> change vote? This will help resolve the concerns from some PMC members
>>>> about how we should interpret the vote result clearly.
>>>>
>>>> Thanks!
>>>>
>>>> On Fri, Mar 14, 2025 at 5:33 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>
>>>>> Thank you all.
>>>>>
>>>>> The vote is finished in an intended way with the expected result. We
>>>>> have enough time to discuss and I have been sticking to my original
>>>>> technical justification from the beginning (including this).
>>>>>
>>>>> 1. Helping renaming the conf via SPARK-51172 (by approving it)
>>>>> 2. Banning `spark.databricks.*` via SPARK-51173 (by adding
>>>>> `configName` Scalastyle rule)
>>>>> 3. Led the discussion thread and reached the agreement to release
>>>>> Spark 3.5.5 early.
>>>>> 4. Releasing 3.5.5 as a release manager to provide a candidate
>>>>> migration path
>>>>> 5. Proposing to use the migration path
>>>>>
>>>>> This vote was Step 5. My technical point has always been aiming to
>>>>> recover the Apache Spark 4 codebase to the status before our mistake by
>>>>> containing the issue only in `branch-3.5` and providing the proposed narrow
>>>>> migration path. And, as mentioned already, that's the situation where we
>>>>> were during the vote at Apache Spark AS-IS branches. What all of us agree
>>>>> on is that the previous code base is okay. I didn't reply to
>>>>> Jungtaek's Apple comment intentionally because it's not a public
>>>>> Spark-vendor like Databricks. And, it's a product name of the popular
>>>>> consumer electronic devices like Intel/AMD/Graviton. In addition, I don't
>>>>> think we are going to add back `spark.databricks.*` because of the reason
>>>>> the customers ask for it. In the same way, this vote is one of the
>>>>> political decision making processes of Apache Spark PMC. We started this
>>>>> vote because we couldn't make a consensus.
>>>>>
>>>>> I believe I've been giving my best to the Apache Spark
>>>>> community by actions and with valid technical clarification (without any
>>>>> modification during the process).
>>>>>
>>>>> Sincerely,
>>>>> Dongjoon
>>>>>
>>>>>
>>>>> On Thu, Mar 13, 2025 at 11:41 PM Mridul Muralidharan <mri...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> FWIW, I am +1 on the proposal (though I missed the vote on this !)
>>>>>>
>>>>>> Regards,
>>>>>> Mridul
>>>>>>
>>>>>> On Fri, Mar 14, 2025 at 1:31 AM Mridul Muralidharan <mri...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>   I agree with Mark, imo this is a qualified veto.
>>>>>>> We should give Dongjoon the opportunity to give his clarification,
>>>>>>> if any.
>>>>>>>
>>>>>>> I do realize this delays the RC process, but this deserves to be
>>>>>>> looked into carefully.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mridul
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 13, 2025 at 9:35 PM Mark Hamstra <markhams...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Absolutely not!
>>>>>>>>
>>>>>>>> This is clearly a vote on a code change, not on a procedural issue or
>>>>>>>> a package release. The code change has been vetoed by a -1 vote by a
>>>>>>>> qualified voter.
>>>>>>>>
>>>>>>>> On Thu, Mar 13, 2025 at 6:58 PM Jungtaek Lim
>>>>>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > As I said, I'm concluding the VOTE since we meet the
>>>>>>>> criteria (3 +1 binding, 1 -1 binding, and also +1s from non-binding).
>>>>>>>> >
>>>>>>>> > I don't consider -1 as a veto as I explained, as we should have
>>>>>>>> multiple -1s if we go for VOTE with the current codebase. (+1 in this
>>>>>>>> proposal is effectively -1 in another proposal.)
>>>>>>>> >
>>>>>>>> > The vote followed the Apache Voting Process with the type of
>>>>>>>> "package release" (which we tend to use in dev@ for VOTE). I guess
>>>>>>>> it could also have been done as "procedural issues", which is less strict,
>>>>>>>> but then this fulfills both types of votes, which should be OK.
>>>>>>>> >
>>>>>>>> > The current codebase is "accidentally" representing another
>>>>>>>> proposal and it is never intended. I don't find the way I can -1 to the
>>>>>>>> current codebase, and make a different change neither bound to any proposal
>>>>>>>> to be fair.
>>>>>>>> >
>>>>>>>> > I don't want to block the release because of the above. So, let's
>>>>>>>> change the current codebase the way we discussed and voted here.
>>>>>>>> Reverting this decision should require another VOTE.
>>>>>>>> >
>>>>>>>> > Thanks to everyone who voted!
>>>>>>>> >
>>>>>>>> > On Thu, Mar 13, 2025 at 4:54 PM Jungtaek Lim <
>>>>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> Thanks to everyone who participated and voted!
>>>>>>>> >>
>>>>>>>> >> Now I can technically conclude the VOTE, but I'm willing to wait
>>>>>>>> till US daytime tomorrow, to give some time for Dongjoon to revisit this.
>>>>>>>> >>
>>>>>>>> >> I'll conclude the vote around 6PM PST tomorrow regardless of his
>>>>>>>> vote. It's ideal to see us have no -1, but having one -1 doesn't block
>>>>>>>> this vote and we can move forward.
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 13, 2025 at 4:42 PM Yang Jie <yangji...@apache.org>
>>>>>>>> wrote:
>>>>>>>> >>>
>>>>>>>> >>> forgot to mention in my last reply, my stance is +1
>>>>>>>> >>>
>>>>>>>> >>> Jie Yang
>>>>>>>> >>>
>>>>>>>> >>> On 2025/03/13 07:08:12 Russell Jurney wrote:
>>>>>>>> >>> > Sure, +1 non-binding.
>>>>>>>> >>> >
>>>>>>>> >>> > On Wed, Mar 12, 2025 at 11:18 PM Jungtaek Lim <
>>>>>>>> kabhwan.opensou...@gmail.com>
>>>>>>>> >>> > wrote:
>>>>>>>> >>> >
>>>>>>>> >>> > > Russell,
>>>>>>>> >>> > >
>>>>>>>> >>> > > Of course, we hear the voices of people who don't have binding
>>>>>>>> votes as well.
>>>>>>>> >>> > > Personally I think it's more important than committers/PMC
>>>>>>>> members'  VOTE
>>>>>>>> >>> > > this time since we can be biased and be far from user
>>>>>>>> experience.
>>>>>>>> >>> > >
>>>>>>>> >>> > > Could you please explicitly cast your vote, like +1
>>>>>>>> (non-binding)? You
>>>>>>>> >>> > > seem to agree with the proposal. Thanks!
>>>>>>>> >>> > >
>>>>>>>> >>> > > On Thu, Mar 13, 2025 at 3:15 PM Russell Jurney <
>>>>>>>> russell.jur...@gmail.com>
>>>>>>>> >>> > > wrote:
>>>>>>>> >>> > >
>>>>>>>> >>> > >> I'm just a lurker and aspiring contributor, but as a Spark
>>>>>>>> user upgrading
>>>>>>>> >>> > >> twice is very confusing and would cause many or most users
>>>>>>>> to fail to
>>>>>>>> >>> > >> upgrade successfully to Spark 4 on a first go. That seems
>>>>>>>> like a very bad
>>>>>>>> >>> > >> user experience. I thought it was worthwhile stating this
>>>>>>>> out loud.
>>>>>>>> >>> > >>
>>>>>>>> >>> > >> Russell
>>>>>>>> >>> > >>
>>>>>>>> >>> > >> On Wed, Mar 12, 2025 at 11:05 PM Xiao Li <
>>>>>>>> gatorsm...@gmail.com> wrote:
>>>>>>>> >>> > >>
>>>>>>>> >>> > >>> this vote is to allow streaming queries which had been
>>>>>>>> ever run in Spark
>>>>>>>> >>> > >>>> 3.5.4 to be upgraded with Spark 4.0.x, "without having
>>>>>>>> to be upgraded with
>>>>>>>> >>> > >>>> Spark 3.5.5+ in prior".
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> In the history of Apache Spark, have we ever required
>>>>>>>> users to upgrade
>>>>>>>> >>> > >>> to the next maintenance release before moving to a new
>>>>>>>> feature or major
>>>>>>>> >>> > >>> release?
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> Xiao
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>> On Tue, Mar 11, 2025 at 9:08 AM Adam Binford <adam...@gmail.com> wrote:
>>>>>>>> >>> > >>>
>>>>>>>> >>> > >>>> +1 (non-binding)
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> It's a pretty in the weeds issue with how Structured
>>>>>>>> Streaming works
>>>>>>>> >>> > >>>> under the hood that's kinda hard to understand if you're
>>>>>>>> not familiar with
>>>>>>>> >>> > >>>> it. The migration logic doesn't mean users can still use
>>>>>>>> the old config,
>>>>>>>> >>> > >>>> it's purely behind the scenes to fix checkpoint metadata
>>>>>>>> in streams created
>>>>>>>> >>> > >>>> in 3.5.4. The 5 lines of code it takes to address a
>>>>>>>> weird edge case for
>>>>>>>> >>> > >>>> certain users that's already gone from master shouldn't
>>>>>>>> be a huge deal.
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> On Tue, Mar 11, 2025 at 1:43 AM Yang Jie <
>>>>>>>> yangji...@apache.org> wrote:
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> To Sean, you're right, I'm very sorry.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> From the perspective of compatibility and
>>>>>>>> migratability, I think we
>>>>>>>> >>> > >>>>> should migrate this logic to 4.0.0 and keep it in the
>>>>>>>> codebase for a longer
>>>>>>>> >>> > >>>>> time (or permanently), because we can't predict which
>>>>>>>> version users of
>>>>>>>> >>> > >>>>> 3.5.4 will choose next.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> I don't want to discuss the so-called vendor issue.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> I withdraw my previous -1.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> Jie Yang.
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>> On 2025/03/11 04:42:25 Wenchen Fan wrote:
>>>>>>>> >>> > >>>>> > Guys, let’s be honest about what we’re discussing
>>>>>>>> here.
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If this is a migration issue, why would we even need
>>>>>>>> a vote? We’ve
>>>>>>>> >>> > >>>>> been
>>>>>>>> >>> > >>>>> > consistently adding configurations to restore legacy
>>>>>>>> behavior
>>>>>>>> >>> > >>>>> instead of
>>>>>>>> >>> > >>>>> > removing them because we understand the challenges of
>>>>>>>> upgrading Spark
>>>>>>>> >>> > >>>>> > versions. Our goal has always been to make upgrades
>>>>>>>> easier, even if
>>>>>>>> >>> > >>>>> it
>>>>>>>> >>> > >>>>> > means carrying some technical debt. I don’t think we
>>>>>>>> want to change
>>>>>>>> >>> > >>>>> that
>>>>>>>> >>> > >>>>> > culture now.
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If the concern is about vendor names appearing in the
>>>>>>>> codebase, then
>>>>>>>> >>> > >>>>> why is
>>>>>>>> >>> > >>>>> > it a big deal this time when vendor names are already
>>>>>>>> present
>>>>>>>> >>> > >>>>> elsewhere? If
>>>>>>>> >>> > >>>>> > we’ve failed to follow a policy, let’s correct it,
>>>>>>>> but can someone
>>>>>>>> >>> > >>>>> point to
>>>>>>>> >>> > >>>>> > the specific policy we’re violating?
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > If the vote is about adding migration logic to ease
>>>>>>>> the upgrade from
>>>>>>>> >>> > >>>>> 3.5.4
>>>>>>>> >>> > >>>>> > to 4.0.0, then +1, why not?
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > Thanks,
>>>>>>>> >>> > >>>>> > Wenchen
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > On Mon, Mar 10, 2025 at 8:49 PM Jungtaek Lim <
>>>>>>>> >>> > >>>>> kabhwan.opensou...@gmail.com>
>>>>>>>> >>> > >>>>> > wrote:
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>> > > Well said, Sean. Sorry I made you keep around here
>>>>>>>> since it might
>>>>>>>> >>> > >>>>> not be
>>>>>>>> >>> > >>>>> > > clearly stated. My bad.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > Yang, how could we ever tolerate the fact there are
>>>>>>>> "other"
>>>>>>>> >>> > >>>>> occurrences of
>>>>>>>> >>> > >>>>> > > vendor names in the codebase? Please go and search
>>>>>>>> "databricks" in
>>>>>>>> >>> > >>>>> the
>>>>>>>> >>> > >>>>> > > codebase and be surprised.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > If we believe that having vendor names in the
>>>>>>>> codebase will
>>>>>>>> >>> > >>>>> increase
>>>>>>>> >>> > >>>>> > > the occurrence of making mistakes, why didn't we
>>>>>>>> have a discussion
>>>>>>>> >>> > >>>>> thread
>>>>>>>> >>> > >>>>> > > earlier to remove all occurrences altogether? This
>>>>>>>> is super tricky
>>>>>>>> >>> > >>>>> because
>>>>>>>> >>> > >>>>> > > I can even start to argue we have "Apple" as a
>>>>>>>> vendor name in
>>>>>>>> >>> > >>>>> Apache Spark
>>>>>>>> >>> > >>>>> > > codebase. I'm not saying we use "apple" in the test
>>>>>>>> data. See
>>>>>>>> >>> > >>>>> > > `isMacOnAppleSilicon` in Utils. Is it unavoidable?
>>>>>>>> No,
>>>>>>>> >>> > >>>>> `isMacOnMSeries` or
>>>>>>>> >>> > >>>>> > > `isMacOnSilicon` is enough.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > We really need to draw a line where we disallow
>>>>>>>> vendor names on it
>>>>>>>> >>> > >>>>> - if
>>>>>>>> >>> > >>>>> > > it's the entire codebase, I don't really think it
>>>>>>>> is realistic.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > This was really a mistake, and it was definitely
>>>>>>>> not from
>>>>>>>> >>> > >>>>> referring to the
>>>>>>>> >>> > >>>>> > > existing codebase. Not having a vendor name does
>>>>>>>> not change
>>>>>>>> >>> > >>>>> anything on the
>>>>>>>> >>> > >>>>> > > chance of encountering this issue again. If we
>>>>>>>> really care, we
>>>>>>>> >>> > >>>>> should think
>>>>>>>> >>> > >>>>> > > about style checking, which is the only viable way
>>>>>>>> to catch the
>>>>>>>> >>> > >>>>> mistake.
>>>>>>>> >>> > >>>>> > > Again, I'd argue we have to have a bunch of vendor
>>>>>>>> names in that
>>>>>>>> >>> > >>>>> style
>>>>>>>> >>> > >>>>> > > check, not just the problematic vendor name.
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > > On Tue, Mar 11, 2025 at 12:17 PM Sean Owen <
>>>>>>>> sro...@gmail.com>
>>>>>>>> >>> > >>>>> wrote:
>>>>>>>> >>> > >>>>> > >
>>>>>>>> >>> > >>>>> > >> Doesn't the migration code 'clear' the debt?
>>>>>>>> >>> > >>>>> > >> The proposal is not to continue to support the
>>>>>>>> config.
>>>>>>>> >>> > >>>>> > >> I feel like people are not quite understanding the
>>>>>>>> change, and
>>>>>>>> >>> > >>>>> objecting
>>>>>>>> >>> > >>>>> > >> to something that doesn't exist.
>>>>>>>> >>> > >>>>> > >> It's a shame, as this seems like something not
>>>>>>>> even worth
>>>>>>>> >>> > >>>>> discussing. I
>>>>>>>> >>> > >>>>> > >> don't know why this triggered this much
>>>>>>>> discussion. We have kept
>>>>>>>> >>> > >>>>> deprecated
>>>>>>>> >>> > >>>>> > >> methods without blinking, which is in comparison
>>>>>>>> much bigger.
>>>>>>>> >>> > >>>>> > >> Can we maybe ask you review the actual change in
>>>>>>>> question?
>>>>>>>> >>> > >>>>> > >>
>>>>>>>> >>> > >>>>> > >> On Mon, Mar 10, 2025, 10:02 PM Yang Jie <
>>>>>>>> yangji...@apache.org>
>>>>>>>> >>> > >>>>> wrote:
>>>>>>>> >>> > >>>>> > >>
>>>>>>>> >>> > >>>>> > >>> -1
>>>>>>>> >>> > >>>>> > >>> Remove migration logic of incorrect
>>>>>>>> `spark.databricks.*`
>>>>>>>> >>> > >>>>> configuration
>>>>>>>> >>> > >>>>> > >>> in Spark 4.0.0 because I think this configuration
>>>>>>>> was initially
>>>>>>>> >>> > >>>>> introduced
>>>>>>>> >>> > >>>>> > >>> accidentally in Spark 3.5.4, lacking a clear
>>>>>>>> design intent.
>>>>>>>> >>> > >>>>> Although the
>>>>>>>> >>> > >>>>> > >>> immediate maintenance cost of retaining this
>>>>>>>> configuration
>>>>>>>> >>> > >>>>> currently seems
>>>>>>>> >>> > >>>>> > >>> limited, as subsequent versions iterate and user
>>>>>>>> habits form, it
>>>>>>>> >>> > >>>>> may lead
>>>>>>>> >>> > >>>>> > >>> to the continuous accumulation of technical debt.
>>>>>>>> When users
>>>>>>>> >>> > >>>>> come to view
>>>>>>>> >>> > >>>>> > >>> this configuration as one that can be relied on
>>>>>>>> long-term,
>>>>>>>> >>> > >>>>> future removal
>>>>>>>> >>> > >>>>> > >>> may face greater resistance from users and could
>>>>>>>> potentially
>>>>>>>> >>> > >>>>> become an
>>>>>>>> >>> > >>>>> > >>> entrenched and redundant configuration in the
>>>>>>>> codebase.
>>>>>>>> >>> > >>>>> Therefore, promptly
>>>>>>>> >>> > >>>>> > >>> correcting this historically accidental
>>>>>>>> configuration not only
>>>>>>>> >>> > >>>>> maintains
>>>>>>>> >>> > >>>>> > >>> the normativity of the Spark configuration system
>>>>>>>> but also
>>>>>>>> >>> > >>>>> prevents
>>>>>>>> >>> > >>>>> > >>> unintended configurations from becoming de facto
>>>>>>>> standards,
>>>>>>>> >>> > >>>>> thereby
>>>>>>>> >>> > >>>>> > >>> reducing long-term maintenance risks.
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>> Jie Yang
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>> On 2025/03/10 14:52:52 Dongjoon Hyun wrote:
>>>>>>>> >>> > >>>>> > >>> > -1 because there exists a feasible migration
>>>>>>>> path for Apache
>>>>>>>> >>> > >>>>> Spark
>>>>>>>> >>> > >>>>> > >>> 3.5.4 via Apache Spark 3.5.5.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > It's obvious that this Databricks' mistake
>>>>>>>> already causes a
>>>>>>>> >>> > >>>>> huge
>>>>>>>> >>> > >>>>> > >>> communication cost in the Apache Spark community
>>>>>>>> and is
>>>>>>>> >>> > >>>>> suggesting a burden
>>>>>>>> >>> > >>>>> > >>> to enforce us to handle at least two more PRs at
>>>>>>>> 4.0.0 and 4.1.0.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > Given that, I don't think
>>>>>>>> >>> > >>>>> > >>> > - This is an inevitable or
>>>>>>>> >>> > >>>>> > >>> > - This is 0 cost
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > Dongjoon.
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > On 2025/03/10 12:46:16 Jungtaek Lim wrote:
>>>>>>>> >>> > >>>>> > >>> > > Starting from my +1 (non-binding).
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > In addition, I propose to retain migration
>>>>>>>> logic till Spark
>>>>>>>> >>> > >>>>> 4.1.x and
>>>>>>>> >>> > >>>>> > >>> > > remove it in Spark 4.2.0.
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > On Mon, Mar 10, 2025 at 9:44 PM Jungtaek Lim <
>>>>>>>> >>> > >>>>> > >>> kabhwan.opensou...@gmail.com>
>>>>>>>> >>> > >>>>> > >>> > > wrote:
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> > > > Hi dev,
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Please vote to retain migration logic of
>>>>>>>> incorrect
>>>>>>>> >>> > >>>>> > >>> `spark.databricks.*`
>>>>>>>> >>> > >>>>> > >>> > > > configuration in Spark 4.0.x.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > - DISCUSSION:
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>>
>>>>>>>> https://lists.apache.org/thread/xzk9729lsmo397crdtk14f74g8cyv4sr
>>>>>>>> >>> > >>>>> > >>> > > > ([DISCUSS] Handling spark.databricks.*
>>>>>>>> config being
>>>>>>>> >>> > >>>>> exposed in
>>>>>>>> >>> > >>>>> > >>> 3.5.4 in
>>>>>>>> >>> > >>>>> > >>> > > > Spark 4.0.0+)
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Specifically, please review this post
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>>
>>>>>>>> https://lists.apache.org/thread/xtq1kjhsl4ohfon78z3wld2hmfm78t9k
>>>>>>>> >>> > >>>>> > >>> which
>>>>>>>> >>> > >>>>> > >>> > > > explains pros and cons about the proposal -
>>>>>>>> proposal is
>>>>>>>> >>> > >>>>> about
>>>>>>>> >>> > >>>>> > >>> "Option 1".
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Simply speaking, this vote is to allow
>>>>>>>> streaming queries
>>>>>>>> >>> > >>>>> which had
>>>>>>>> >>> > >>>>> > >>> been
>>>>>>>> >>> > >>>>> > >>> > > > ever run in Spark 3.5.4 to be upgraded with
>>>>>>>> Spark 4.0.x,
>>>>>>>> >>> > >>>>> "without
>>>>>>>> >>> > >>>>> > >>> having to
>>>>>>>> >>> > >>>>> > >>> > > > be upgraded with Spark 3.5.5+ in prior". If
>>>>>>>> the vote
>>>>>>>> >>> > >>>>> passes, we
>>>>>>>> >>> > >>>>> > >>> will help
>>>>>>>> >>> > >>>>> > >>> > > > users to have a smooth upgrade from Spark
>>>>>>>> 3.5.4 to Spark
>>>>>>>> >>> > >>>>> 4.0.x,
>>>>>>>> >>> > >>>>> > >>> which would
>>>>>>>> >>> > >>>>> > >>> > > > be almost 1 year.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > The (only) con of this option is having to
>>>>>>>> retain the
>>>>>>>> >>> > >>>>> incorrect
>>>>>>>> >>> > >>>>> > >>> > > > configuration name as "string" in the
>>>>>>>> codebase a bit
>>>>>>>> >>> > >>>>> longer. The
>>>>>>>> >>> > >>>>> > >>> code
>>>>>>>> >>> > >>>>> > >>> > > > complexity of migration logic is arguably
>>>>>>>> trivial. (link
>>>>>>>> >>> > >>>>> > >>> > > > <
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>>
>>>>>>>> https://github.com/apache/spark/blob/4231d58245251a34ae80a38ea4bbf7d720caa439/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala#L174-L183
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> > > > )
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > This VOTE is for Spark 4.0.x, but if
>>>>>>>> someone supports
>>>>>>>> >>> > >>>>> including
>>>>>>>> >>> > >>>>> > >>> migration
>>>>>>>> >>> > >>>>> > >>> > > > logic to be longer than Spark 4.0.x, please
>>>>>>>> cast +1 here
>>>>>>>> >>> > >>>>> and leave
>>>>>>>> >>> > >>>>> > >>> the
>>>>>>>> >>> > >>>>> > >>> > > > desired last minor version of Spark to
>>>>>>>> retain this
>>>>>>>> >>> > >>>>> migration logic.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > The vote is open for the next 72 hours and
>>>>>>>> passes if a
>>>>>>>> >>> > >>>>> majority +1
>>>>>>>> >>> > >>>>> > >>> PMC
>>>>>>>> >>> > >>>>> > >>> > > > votes are cast, with a minimum of 3 +1
>>>>>>>> votes.
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > [ ] +1 Retain migration logic of incorrect
>>>>>>>> >>> > >>>>> `spark.databricks.*`
>>>>>>>> >>> > >>>>> > >>> > > > configuration in Spark 4.0.x
>>>>>>>> >>> > >>>>> > >>> > > > [ ] -1 Remove migration logic of incorrect
>>>>>>>> >>> > >>>>> `spark.databricks.*`
>>>>>>>> >>> > >>>>> > >>> > > > configuration in Spark 4.0.0 because...
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > > > Thanks!
>>>>>>>> >>> > >>>>> > >>> > > > Jungtaek Lim (HeartSaVioR)
>>>>>>>> >>> > >>>>> > >>> > > >
>>>>>>>> >>> > >>>>> > >>> > >
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >>> > >>>>> > >>> > To unsubscribe e-mail:
>>>>>>>> dev-unsubscr...@spark.apache.org
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>> >
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> > >>>
>>>>>>>> >>> > >>>>> >
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>>
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>> --
>>>>>>>> >>> > >>>> Adam Binford
>>>>>>>> >>> > >>>>
>>>>>>>> >>> > >>>
>>>>>>>> >>> >
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
