Hi Holden,

> I think I disagree with Mark's assertion that the veto needs to rest on a
> “substantial technical concern” rather than a valid concern. I think that,
> in addition to the veto, they’ve also gone above and beyond in providing
> alternative ways to accomplish this.

I think this is not what the Apache policy says in
https://www.apache.org/foundation/glossary.html#Veto : "*All vetoes must be
accompanied by a valid technical justification; a veto without such a
justification is invalid.*"

I think it's better for the Apache Spark community to follow the general
Apache policy.

On Mon, Mar 17, 2025 at 8:49 AM Wenchen Fan <cloud0...@gmail.com> wrote:

> Before I cast my vote here, I'd like to highlight one thing: As the
> release manager of Apache Spark 4.0.0, I was not notified about the
> breaking change of renaming an already-released configuration:
> https://github.com/apache/spark/pull/49897 . Note that the previous VOTE
> from Dongjoon was about Apache Spark 3.5.5, which means there is no
> consensus about what we should do for 4.0.0 yet. I think it's fair for me
> to ask to revert the breaking change and unblock 4.0.0, which is a common
> practice of how we handle breaking changes in Apache Spark and I don't
> think I need a VOTE for it.
>
> Of course, none of us want to keep the misnamed configuration in 4.0.0,
> and it’s clear to me that applying the “configuration deprecation” approach
> from 3.5.5 to 4.0.0 is the best path forward. I don’t believe Dongjoon’s
> veto has valid technical justification, so I’m +1 on this vote.
>
> Thanks,
> Wenchen
>
>
> On Mon, Mar 17, 2025 at 7:27 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> I was trying hard to stay away from this VOTE, but I should remind
>> everyone "what" we are VOTING on.
>>
>> Dongjoon cast a VETO against a code change VOTE. That VETO is described
>> on the ASF Voting Process page:
>>
>> https://www.apache.org/foundation/voting.html#Veto
>>
>>> A -1 vote by a qualified voter stops a code-modification proposal in its
>>> tracks. This constitutes a veto, and it cannot be overruled nor overridden
>>> by anyone. Vetoes stand until and unless the individual withdraws their
>>> veto.
>>>
>>> To prevent vetoes from being used capriciously, the voter must provide
>>> with the veto a technical justification showing why the change is bad
>>> (opens a security exposure, negatively affects performance, etc. ). A veto
>>> without a justification is invalid and has no weight.
>>
>>
>> For sure, the technical justification must be "objective"; otherwise, as
>> a PMC member I could veto anything simply because I don't like it.
>>
>> The main argument here concerns "vendor name in the codebase", but we have
>> never seen an ASF policy that disallows this. If there were evidence of
>> such a policy, it would immediately kill the two VOTEs, as that would be a
>> sufficiently objective argument. But no one has been able to produce it.
>> Please remember, the claim "vendor name in the codebase is bad for any
>> reason" has been shown NOT to be an "objective" claim; otherwise, how
>> could the DISCUSS and VOTE have been almost passing with support from PMC
>> members?
>>
>> I strongly suggest that everyone who casts a vote in this VOTE thread
>> base it on an "objective" rationale. For example, we tend to consider < 10
>> lines of code to be very trivial to maintain, so the argument of
>> maintenance burden does not apply here. That is the kind of rationale I
>> mean.
>>
>>
>> On Sun, Mar 16, 2025 at 8:38 AM Mark Hamstra <markhams...@gmail.com>
>> wrote:
>>
>>> There has been enough discussion on this topic already, so I think
>>> that an immediate vote on the validity of Dongjoon's technical
>>> justification for his veto of the "Retain migration logic ... in Spark
>>> 4.0.x" proposal is in order. That technical justification has been
>>> called into question, and the guidance at
>>> https://www.apache.org/foundation/glossary.html#Veto leaves it to the
>>> PMC to determine whether the technical justification is valid: "In
>>> case of doubt, deciding whether a technical justification is valid is
>>> up to the PMC." As such, only PMC votes will decide the outcome of
>>> this vote. This is neither a vote on a code change itself nor a vote
>>> on whether a package is ready for release, so it is a procedural vote on
>>> whether the technical justification is valid. As such, the vote will
>>> be decided by a simple majority where +1 votes hold that the technical
>>> justification is not valid and -1 votes hold that the technical
>>> justification is valid.
>>>
>>> I would request that at least PMC members post more than just a naked
>>> vote, but instead endeavor to give some reason why they have assessed
>>> the technical justification as they have. I'll start:
>>>
>>> Despite all of the discussion related to Dongjoon's -1 vote, I must
>>> confess to still not being entirely clear on what his technical
>>> justification for that veto is. I see claims that it is sufficient to
>>> include an admonition in the Spark 4.0.x release notes that a prior
>>> upgrade to 3.5.5 is required to maintain the integrity of already
>>> existing data streams, and I see assertions about the maintenance
>>> burden that including the migration logic would impose on future Spark
>>> versions, but I don't think that I see any other technical objections.
>>> I do not believe that the claimed technical justification is valid.
>>>
>>> In requiring that a veto of a code change be accompanied by a
>>> technical justification for the veto, the Apache Voting Process states
>>> that: "To prevent vetoes from being used capriciously, the voter must
>>> provide with the veto a technical justification showing why the change
>>> is bad (opens a security exposure, negatively affects performance,
>>> etc. ). A veto without a justification is invalid and has no weight."
>>> This strongly implies that there must be something objectively wrong
>>> with the proposed code change in that it causes significant harm in
>>> the way of opening a security exposure, negatively affecting
>>> performance, or presumably other significant user harms or perhaps
>>> even developer burdens.
>>>
>>> The proposed addition of the migration logic to Spark 4.0.x does not
>>> cause any harm to Spark's users. For many users, those not using
>>> streaming data, the change will have no effect. For streaming users
>>> the change will be beneficial, not harmful.
>>>
>>> Neither do I find the claim of excessive, ongoing developer burden to
>>> be persuasive. The changes are tiny and easily maintained -- in fact,
>>> it wouldn't surprise me if no further changes to this migration logic
>>> would be needed for a very long time.
>>>
>>> Some of what we are left with is just an expression of preference for
>>> a technical alternative to the migration logic -- i.e. including in
>>> the release notes an admonition to first upgrade to 3.5.5. But the
>>> Apache Voting Process does not say that in the face of code
>>> alternatives A and B, a qualified voter is justified in vetoing A if
>>> they prefer B. Instead, the Voting Process strongly implies that
>>> something more is needed to justify a veto, as I've already covered.
>>> Thus I don't find Dongjoon's preference for the release notes option
>>> to be adequate justification for the veto.
>>>
>>> The only remaining question I see is whether including "databricks" in
>>> the Apache Code is ever allowed or if any such instance must be
>>> expunged as soon as possible. I am not aware of any ASF policy that
>>> strictly forbids the mention of a vendor in Apache code for any
>>> reason, even if that vendor has a product based on Apache code, even
>>> if that vendor enjoys a uniquely influential position vis a vis some
>>> Apache code or project. Certainly the PMC has a duty to see to it that
>>> neither Databricks nor any other vendor exercises influence or control
>>> over Apache Spark outside of the established Apache process, but the
>>> proposed migration code changes do not advantage Databricks -- if
>>> anything they remove a minor avenue of influence, and simply need to
>>> mention "databricks" once in order to match and transform a configuration
>>> into a vendor-neutral equivalent. While not optimal, I can't find such
>>> a one-time inclusion of "databricks" to be truly offensive to any
>>> non-technical policy concern -- certainly not offensive to the point
>>> that it outweighs the user advantage of including the migration logic
>>> in Spark 4.0.x.
>>>
>>> In summary, I do not find Dongjoon's given technical justification to
>>> be valid relative to the Apache requirements for a veto of a code
>>> change, so I must vote...
>>>
>>> +1
>>>