I'm delighted to see folks talking about a compromise. However, instead of just asking Dongjoon to withdraw the VETO, perhaps folks can suggest alternatives that might meet some of both parties' goals?
On Sun, Mar 16, 2025 at 7:41 PM Wenchen Fan <cloud0...@gmail.com> wrote:
> I agree with Holden that withdrawing a veto is always better than overriding it: it's healthier for the community. Dongjoon, would you be willing to reconsider your veto given the current as-is state of the 4.0.0 release (the breaking change will be reverted)?
>
> On Mon, Mar 17, 2025 at 10:36 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> I've created the revert PR for branch-4.0: https://github.com/apache/spark/pull/50291 . We can merge PRs with lazy consensus but it's clear that this breaking change PR has failed to achieve consensus.
>>
>> I hope we now have a clear foundation for discussing solutions. As it stands, the misnamed configuration will be released in 4.0.0. I like Jungtaek’s proposal to deprecate it, but the decision is up to the community.
>>
>> On Mon, Mar 17, 2025 at 10:19 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>
>>> OK, let's be super honest.
>>>
>>> Again, I think you agree that *"both" proposals are "technically" correct (or one side can't have a strong theoretical evidence to counter the other side)*. So this naturally has a fate to have more supporters to get to the end. It's very easy for me to VETO to his proposal (although I don't have a binding vote, I think I have people who agree with me) if we think we want to definitely expand the interpretation of VETO criteria in the Apache Voting Process.
>>>
>>> You said it is up to the PMC member exercising the veto to use their judgement, but definitely, it must not be used to force the community to follow his proposal. The major argument here is, he can just VETO to any proposal to retain the codebase as the way he prefers to, which I don't believe is a correct usage of VETO.
>>>
>>> If we just revert the change of removal of config, this is "really" neutral neither my proposal nor his proposal. Do we really want to do so?
>>>
>>> On Mon, Mar 17, 2025 at 10:55 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>>
>>>> First let me start with my key hope:
>>>>
>>>> We find a way to compromise and have the veto withdrawn rather than overridden.
>>>>
>>>> From what I understand of the change in question:
>>>>
>>>> So my understanding, and I may be over simplifying here but there are (at least) three technical paths forward (migration guide, legacy config with vendor string in it, non-vendor specific string legacy config), a PMC member vetoed one of them (named vendor legacy config) because he thought a different approach was better (migration guide) as they were worried that carrying that legacy config forward would encourage bad coding standards (eg we would add more vendor named config flags). To me that seems like a valid concern.
>>>>
>>>> My reasoning:
>>>>
>>>> Thinking back at other VETOs that I’ve been involved with in this project (DSV2, graceful decom, etc) this seems to meet the same bar. Hell we’ve had plenty of vetos that didn’t offer an alternative.
>>>>
>>>> My personal understanding of where the bar for “a technical justification showing why the change is bad” concern is pretty much “any not factually incorrect reasoning”, the text doesn’t have any particular “bar” for the level of “badness” and I think it’s up to the PMC member exercising the veto to use their judgement.
>>>>
>>>> In closing, I feel like the path we’re going down (overriding a veto) is not healthy for the project.
>>>>
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> Pronouns: she/her
>>>>
>>>>
>>>> On Sun, Mar 16, 2025 at 6:28 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Holden, I believe you should already know "both" approaches are "technically" correct. It's not about which one you have a preference for, no, this VOTE is not intended to extend the debate.
>>>>>
>>>>> Again, what you are encouraged to do here is, not exposing your preference of two approaches, but exposing your "technically valid" concern of my approach, backed by Dongjoon's veto (most likely you want to quote Dongjoon's post). This is very simple and I'm not sure you are doing exactly what the VOTE requires.
>>>>>
>>>>> On Mon, Mar 17, 2025 at 6:32 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>
>>>>>> -1 (binding). To me it doesn’t matter that the cost is low; if the objection is technical then I think we need to respect the veto. There is a fundamental disagreement as to what the correct technical way to address this problem is (removal + documentation vs legacy config) and a PMC member has vetoed the legacy config option.
>>>>>>
>>>>>> I think I disagree with Mark on the assertion that the veto needs to have “substantial technical concern,” but rather a valid concern. I think in addition to the veto they’ve also gone above and beyond providing alternative ways to accomplish this.
>>>>>>
>>>>>> On a personal level:
>>>>>>
>>>>>> I am optimistic we can unblock the release but I think it’s important to err on the side of respecting the veto here in the interest of perceived fairness *especially* because of vendor aspects.
>>>>>>
>>>>>> To be clear I’ve worked at most of these companies (and many of the people) and I’m not ascribing malice to anyone in this, I think mistakes happen (god knows I’ve had a fair share). I think we’re all doing our best here and would ask that we show everyone understanding regardless of the outcome.
>>>>>>
>>>>>> Sending hugs and good vibes to y’all.
>>>>>>
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>> Pronouns: she/her
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 15, 2025 at 5:07 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>
>>>>>>> Given it’s the weekend maybe let’s give folks at least one full work day.
>>>>>>>
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>> Pronouns: she/her
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Mar 15, 2025 at 4:44 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Quick administrative note: I don't see any reason why this vote should take a long time, so I expect to close the process and tally the votes in not much more than 48 hours.
>>>>>>>>
>>>>>>>> On Sat, Mar 15, 2025 at 4:35 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > There has been enough discussion on this topic already, so I think that an immediate vote on the validity of Dongjoon's technical justification for his veto of the "Retain migration logic ... in Spark 4.0.x" proposal is in order. That technical justification has been called into question, and the guidance at https://www.apache.org/foundation/glossary.html#Veto leaves it to the PMC to determine whether the technical justification is valid: "In case of doubt, deciding whether a technical justification is valid is up to the PMC." As such, only PMC votes will decide the outcome of this vote. This is neither a vote on a code change itself nor a vote on whether a package is ready for release, so it is a procedural vote on whether the technical justification is valid. As such, the vote will be decided by a simple majority where +1 votes hold that the technical justification is not valid and -1 votes hold that the technical justification is valid.
>>>>>>>> >
>>>>>>>> > I would request that at least PMC members post more than just a naked vote, but instead endeavor to give some reason why they have assessed the technical justification as they have. I'll start:
>>>>>>>> >
>>>>>>>> > Despite all of the discussion related to Dongjoon's -1 vote, I must confess to still not being entirely clear on what is his technical justification for that veto.
>>>>>>>> > I see claims that including an admonition in the Spark 4.0.x release notes that a prior upgrade to 3.5.5 is required to maintain the integrity of already existing data streams, and I see assertions about the maintenance burden that including the migration logic would impose on future Spark versions, but I don't think that I see any other technical objections. I do not believe that the claimed technical justification is valid.
>>>>>>>> >
>>>>>>>> > In requiring that a veto of a code change be accompanied by a technical justification for the veto, the Apache Voting Process states that: "To prevent vetoes from being used capriciously, the voter must provide with the veto a technical justification showing why the change is bad (opens a security exposure, negatively affects performance, etc.). A veto without a justification is invalid and has no weight." This strongly implies that there must be something objectively wrong with the proposed code change in that it causes significant harm in the way of opening a security exposure, negatively affecting performance, or presumably other significant user harms or perhaps even developer burdens.
>>>>>>>> >
>>>>>>>> > The proposed addition of the migration logic to Spark 4.0.x does not cause any harm to Spark's users. For many users, those not using streaming data, the change will have no effect. For streaming users the change will be beneficial, not harmful.
>>>>>>>> >
>>>>>>>> > Neither do I find the claim of excessive, ongoing developer burden to be persuasive.
>>>>>>>> > The changes are tiny and easily maintained -- in fact, it wouldn't surprise me if no further changes to this migration logic would be needed for a very long time.
>>>>>>>> >
>>>>>>>> > Some of what we are left with is just an expression of preference for a technical alternative to the migration logic -- i.e. including in the release notes an admonition to first upgrade to 3.5.5. But the Apache Voting Process does not say that in the face of code alternatives A and B, a qualified voter is justified in vetoing A if they prefer B. Instead, the Voting Process strongly implies that something more is needed to justify a veto, as I've already covered. Thus I don't find Dongjoon's preference for the release notes option to be adequate justification for the veto.
>>>>>>>> >
>>>>>>>> > The only remaining question I see is whether including "databricks" in the Apache Code is ever allowed or if any such instance must be expunged as soon as possible. I am not aware of any ASF policy that strictly forbids the mention of a vendor in Apache code for any reason, even if that vendor has a product based on Apache code, even if that vendor enjoys a uniquely influential position vis a vis some Apache code or project.
>>>>>>>> > Certainly the PMC has a duty to see to it that neither Databricks nor any other vendor exercises influence or control over Apache Spark outside of the established Apache process, but the proposed migration code changes do not advantage Databricks -- if anything they remove a minor avenue of influence, and simply need to mention "databricks" once in order to match and transform a configuration into a vendor neutral equivalent. While not optimal, I can't find such a one-time inclusion of "databricks" to be truly offensive to any non-technical policy concern -- certainly not offensive to the point that it outweighs the user advantage of including the migration logic in Spark 4.0.x.
>>>>>>>> >
>>>>>>>> > In summary, I do not find Dongjoon's given technical justification to be valid relative to the Apache requirements for a veto of a code change, so I must vote...
>>>>>>>> >
>>>>>>>> > +1
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>>
>>>>>>>>
--
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her