I'm delighted to see folks talking about a compromise. However, instead of
just asking Dongjoon to withdraw the VETO, perhaps folks can suggest
alternatives that might meet some of both parties' goals?

On Sun, Mar 16, 2025 at 7:41 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> I agree with Holden that withdrawing a veto is always better than
> overriding it: it's healthier for the community. Dongjoon, would you be
> willing to reconsider your veto given the current as-is state of the 4.0.0
> release (the breaking change will be reverted)?
>
> On Mon, Mar 17, 2025 at 10:36 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> I've created the revert PR for branch-4.0:
>> https://github.com/apache/spark/pull/50291 . We can merge PRs with lazy
>> consensus but it's clear that this breaking change PR has failed to achieve
>> consensus.
>>
>> I hope we now have a clear foundation for discussing solutions. As it
>> stands, the misnamed configuration will be released in 4.0.0. I like
>> Jungtaek’s proposal to deprecate it, but the decision is up to the
>> community.
>>
>> On Mon, Mar 17, 2025 at 10:19 AM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>>> OK, let's be super honest.
>>>
>>> Again, I think you agree that *"both" proposals are "technically"
>>> correct (or neither side has strong theoretical evidence to counter the
>>> other side)*. So this is naturally fated to be decided by which side has
>>> more supporters. It would be very easy for me to VETO his proposal
>>> (although I don't have a binding vote, I think I have people who agree
>>> with me) if we definitely want to expand the interpretation of the VETO
>>> criteria in the Apache Voting Process.
>>>
>>> You said it is up to the PMC member exercising the veto to use their
>>> judgement, but it definitely must not be used to force the community to
>>> follow his proposal. The major argument here is that he could otherwise
>>> just VETO any proposal to keep the codebase the way he prefers, which I
>>> don't believe is a correct usage of VETO.
>>>
>>> If we just revert the change that removed the config, this is "really"
>>> neutral: neither my proposal nor his proposal. Do we really want to do so?
>>>
>>> On Mon, Mar 17, 2025 at 10:55 AM Holden Karau <holden.ka...@gmail.com>
>>> wrote:
>>>
>>>> First let me start with my key hope:
>>>>
>>>> We find a way to compromise and have the veto withdrawn rather than
>>>> overridden.
>>>>
>>>> From what I understand of the change in question:
>>>>
>>>> My understanding, and I may be oversimplifying here, is that there are
>>>> (at least) three technical paths forward (migration guide, legacy config
>>>> with vendor string in it, non-vendor-specific legacy config). A PMC
>>>> member vetoed one of them (the vendor-named legacy config) because he
>>>> thought a different approach was better (migration guide), as he was
>>>> worried that carrying that legacy config forward would encourage bad
>>>> coding standards (e.g. we would add more vendor-named config flags). To
>>>> me that seems like a valid concern.
>>>>
>>>> My reasoning:
>>>>
>>>> Thinking back to other VETOs that I’ve been involved with in this
>>>> project (DSV2, graceful decom, etc.), this seems to meet the same bar.
>>>> Hell, we’ve had plenty of vetoes that didn’t offer an alternative.
>>>>
>>>> My personal understanding is that the bar for the “a technical
>>>> justification showing why the change is bad” concern is pretty much “any
>>>> not factually incorrect reasoning”. The text doesn’t have any particular
>>>> “bar” for the level of “badness” and I think it’s up to the PMC member
>>>> exercising the veto to use their judgement.
>>>>
>>>> In closing, I feel like the path we’re going down (overriding a veto)
>>>> is not healthy for the project.
>>>>
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> Pronouns: she/her
>>>>
>>>>
>>>> On Sun, Mar 16, 2025 at 6:28 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Holden, I believe you should already know "both" approaches are
>>>>> "technically" correct. It's not about which one you prefer; this VOTE
>>>>> is not intended to extend the debate.
>>>>>
>>>>> Again, what you are encouraged to do here is not to express your
>>>>> preference between the two approaches, but to state your "technically
>>>>> valid" concern with my approach, backed by Dongjoon's veto (most likely
>>>>> you want to quote Dongjoon's post). This is very simple and I'm not sure
>>>>> you are doing exactly what the VOTE requires.
>>>>>
>>>>> On Mon, Mar 17, 2025 at 6:32 AM Holden Karau <holden.ka...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> -1 (binding). To me it doesn’t matter that the cost is low: if the
>>>>>> objection is technical then I think we need to respect the veto. There
>>>>>> is a fundamental disagreement as to what the correct technical way to
>>>>>> address this problem is (removal + documentation vs legacy config) and a
>>>>>> PMC member has vetoed the legacy config option.
>>>>>>
>>>>>> I disagree with Mark's assertion that the veto needs a “substantial
>>>>>> technical concern”; I think it needs only a valid concern. In addition
>>>>>> to the veto, they’ve also gone above and beyond by providing
>>>>>> alternative ways to accomplish this.
>>>>>>
>>>>>> On a personal level:
>>>>>>
>>>>>> I am optimistic we can unblock the release but I think it’s important
>>>>>> to err on the side of respecting the veto here in the interest of
>>>>>> perceived fairness *especially* because of vendor aspects.
>>>>>>
>>>>>> To be clear, I’ve worked at most of these companies (and with many of
>>>>>> the people) and I’m not ascribing malice to anyone in this. I think
>>>>>> mistakes happen (god knows I’ve had a fair share). I think we’re all
>>>>>> doing our best here and would ask that we show everyone understanding
>>>>>> regardless of the outcome.
>>>>>>
>>>>>> Sending hugs and good vibes to y’all.
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 15, 2025 at 5:07 PM Holden Karau <holden.ka...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Given it’s the weekend maybe let’s give folks at least one full work
>>>>>>> day.
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Mar 15, 2025 at 4:44 PM Mark Hamstra <markhams...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Quick administrative note: I don't see any reason why this vote should
>>>>>>>> take a long time, so I expect to close the process and tally the votes
>>>>>>>> in not much more than 48 hours.
>>>>>>>>
>>>>>>>> On Sat, Mar 15, 2025 at 4:35 PM Mark Hamstra <markhams...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> > There has been enough discussion on this topic already, so I think
>>>>>>>> > that an immediate vote on the validity of Dongjoon's technical
>>>>>>>> > justification for his veto of the "Retain migration logic ... in
>>>>>>>> > Spark 4.0.x" proposal is in order. That technical justification has
>>>>>>>> > been called into question, and the guidance at
>>>>>>>> > https://www.apache.org/foundation/glossary.html#Veto leaves it to the
>>>>>>>> > PMC to determine whether the technical justification is valid: "In
>>>>>>>> > case of doubt, deciding whether a technical justification is valid is
>>>>>>>> > up to the PMC." As such, only PMC votes will decide the outcome of
>>>>>>>> > this vote. This is neither a vote on a code change itself nor a vote
>>>>>>>> > on whether a package is ready for release, so it is a procedural vote
>>>>>>>> > on whether the technical justification is valid. As such, the vote
>>>>>>>> > will be decided by a simple majority where +1 votes hold that the
>>>>>>>> > technical justification is not valid and -1 votes hold that the
>>>>>>>> > technical justification is valid.
>>>>>>>> >
>>>>>>>> > I would request that at least PMC members post more than just a naked
>>>>>>>> > vote, but instead endeavor to give some reason why they have assessed
>>>>>>>> > the technical justification as they have. I'll start:
>>>>>>>> >
>>>>>>>> > Despite all of the discussion related to Dongjoon's -1 vote, I must
>>>>>>>> > confess to still not being entirely clear on what his technical
>>>>>>>> > justification for that veto is. I see claims that including an
>>>>>>>> > admonition in the Spark 4.0.x release notes that a prior upgrade to
>>>>>>>> > 3.5.5 is required to maintain the integrity of already existing data
>>>>>>>> > streams would suffice, and I see assertions about the maintenance
>>>>>>>> > burden that including the migration logic would impose on future
>>>>>>>> > Spark versions, but I don't think that I see any other technical
>>>>>>>> > objections. I do not believe that the claimed technical
>>>>>>>> > justification is valid.
>>>>>>>> >
>>>>>>>> > In requiring that a veto of a code change be accompanied by a
>>>>>>>> > technical justification for the veto, the Apache Voting Process
>>>>>>>> > states that: "To prevent vetoes from being used capriciously, the
>>>>>>>> > voter must provide with the veto a technical justification showing
>>>>>>>> > why the change is bad (opens a security exposure, negatively affects
>>>>>>>> > performance, etc.). A veto without a justification is invalid and
>>>>>>>> > has no weight." This strongly implies that there must be something
>>>>>>>> > objectively wrong with the proposed code change in that it causes
>>>>>>>> > significant harm in the way of opening a security exposure,
>>>>>>>> > negatively affecting performance, or presumably other significant
>>>>>>>> > user harms or perhaps even developer burdens.
>>>>>>>> >
>>>>>>>> > The proposed addition of the migration logic to Spark 4.0.x does not
>>>>>>>> > cause any harm to Spark's users. For many users, those not using
>>>>>>>> > streaming data, the change will have no effect. For streaming users
>>>>>>>> > the change will be beneficial, not harmful.
>>>>>>>> >
>>>>>>>> > Neither do I find the claim of excessive, ongoing developer burden to
>>>>>>>> > be persuasive. The changes are tiny and easily maintained -- in fact,
>>>>>>>> > it wouldn't surprise me if no further changes to this migration logic
>>>>>>>> > would be needed for a very long time.
>>>>>>>> >
>>>>>>>> > Some of what we are left with is just an expression of preference for
>>>>>>>> > a technical alternative to the migration logic -- i.e. including in
>>>>>>>> > the release notes an admonition to first upgrade to 3.5.5. But the
>>>>>>>> > Apache Voting Process does not say that in the face of code
>>>>>>>> > alternatives A and B, a qualified voter is justified in vetoing A if
>>>>>>>> > they prefer B. Instead, the Voting Process strongly implies that
>>>>>>>> > something more is needed to justify a veto, as I've already covered.
>>>>>>>> > Thus I don't find Dongjoon's preference for the release notes option
>>>>>>>> > to be adequate justification for the veto.
>>>>>>>> >
>>>>>>>> > The only remaining question I see is whether including "databricks" in
>>>>>>>> > the Apache code is ever allowed or if any such instance must be
>>>>>>>> > expunged as soon as possible. I am not aware of any ASF policy that
>>>>>>>> > strictly forbids the mention of a vendor in Apache code for any
>>>>>>>> > reason, even if that vendor has a product based on Apache code, even
>>>>>>>> > if that vendor enjoys a uniquely influential position vis a vis some
>>>>>>>> > Apache code or project. Certainly the PMC has a duty to see to it that
>>>>>>>> > neither Databricks nor any other vendor exercises influence or control
>>>>>>>> > over Apache Spark outside of the established Apache process, but the
>>>>>>>> > proposed migration code changes do not advantage Databricks -- if
>>>>>>>> > anything they remove a minor avenue of influence, and simply need to
>>>>>>>> > mention "databricks" once in order to match and transform a
>>>>>>>> > configuration into a vendor-neutral equivalent. While not optimal, I
>>>>>>>> > can't find such a one-time inclusion of "databricks" to be truly
>>>>>>>> > offensive to any non-technical policy concern -- certainly not
>>>>>>>> > offensive to the point that it outweighs the user advantage of
>>>>>>>> > including the migration logic in Spark 4.0.x.
>>>>>>>> >
>>>>>>>> > In summary, I do not find Dongjoon's given technical justification to
>>>>>>>> > be valid relative to the Apache requirements for a veto of a code
>>>>>>>> > change, so I must vote...
>>>>>>>> >
>>>>>>>> > +1
>>>>>>>>

-- 
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
<https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her
