I agree with Holden that withdrawing a veto is always better than overriding it: it's healthier for the community. Dongjoon, would you be willing to reconsider your veto given the current state of the 4.0.0 release (the breaking change will be reverted)?
On Mon, Mar 17, 2025 at 10:36 AM Wenchen Fan <cloud0...@gmail.com> wrote: > I've created the revert PR for branch-4.0: > https://github.com/apache/spark/pull/50291 . We can merge PRs with lazy > consensus but it's clear that this breaking change PR has failed to achieve > consensus. > > I hope we now have a clear foundation for discussing solutions. As it > stands, the misnamed configuration will be released in 4.0.0. I like > Jungtaek’s proposal to deprecate it, but the decision is up to the > community. > > On Mon, Mar 17, 2025 at 10:19 AM Jungtaek Lim < > kabhwan.opensou...@gmail.com> wrote: > >> OK, let's be super honest. >> >> Again, I think you agree that *"both" proposals are "technically" >> correct (or one side can't have a strong theoretical evidence to counter >> the other side)*. So this naturally has a fate to have more supporters >> to get to the end. It's very easy for me to VETO to his proposal (although >> I don't have a binding vote, I think I have people who agree with me) if we >> think we want to definitely expand the interpretation of VETO criteria in >> the Apache Voting Process. >> >> You said it is up to the PMC member exercising the veto to use their >> judgement, but definitely, it must not be used to force the community to >> follow his proposal. The major argument here is, he can just VETO to any >> proposal to retain the codebase as the way he prefers to, which I don't >> believe is a correct usage of VETO. >> >> If we just revert the change of removal of config, this is "really" >> neutral neither my proposal nor his proposal. Do we really want to do so? >> >> On Mon, Mar 17, 2025 at 10:55 AM Holden Karau <holden.ka...@gmail.com> >> wrote: >> >>> First let me start with my key hope: >>> >>> We find a way to compromise and have the veto withdrawn rather than >>> overridden. 
>>> >>> From what I understand of the change in question: >>> >>> So my understanding, and I may be oversimplifying here, but there are >>> (at least) three technical paths forward (migration guide, legacy config >>> with vendor string in it, non-vendor specific string legacy config), a PMC >>> member vetoed one of them (named vendor legacy config) because he thought a >>> different approach was better (migration guide) as they were worried that >>> carrying that legacy config forward would encourage bad coding standards >>> (e.g., we would add more vendor named config flags). To me that seems like a >>> valid concern. >>> >>> My reasoning: >>> >>> Thinking back on other VETOs that I’ve been involved with in this >>> project (DSV2, graceful decom, etc) this seems to meet the same bar. Hell >>> we’ve had plenty of vetoes that didn’t offer an alternative. >>> >>> My personal understanding of where the bar for >>> “a technical justification showing why the change is bad” concern is >>> pretty much “any not factually incorrect reasoning”; the text doesn’t have >>> any particular “bar” for the level of “badness” and I think it’s up to the >>> PMC member exercising the veto to use their judgement. >>> >>> In closing, I feel like the path we’re going down (overriding a veto) is >>> not healthy for the project. >>> >>> Twitter: https://twitter.com/holdenkarau >>> Fight Health Insurance: https://www.fighthealthinsurance.com/ >>> <https://www.fighthealthinsurance.com/?q=hk_email> >>> Books (Learning Spark, High Performance Spark, etc.): >>> https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9> >>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau >>> Pronouns: she/her >>> >>> >>> On Sun, Mar 16, 2025 at 6:28 PM Jungtaek Lim < >>> kabhwan.opensou...@gmail.com> wrote: >>> >>>> Holden, I believe you should already know "both" approaches are >>>> "technically" correct.
It's not about which one you have a preference for, >>>> no, this VOTE is not intended to extend the debate. >>>> >>>> Again, what you are encouraged to do here is not to express your >>>> preference between the two approaches, but to state your "technically valid" concern >>>> about my approach, backed by Dongjoon's veto (most likely you want to quote >>>> Dongjoon's post). This is very simple and I'm not sure you are doing >>>> exactly what the VOTE requires. >>>> >>>> On Mon, Mar 17, 2025 at 6:32 AM Holden Karau <holden.ka...@gmail.com> >>>> wrote: >>>> >>>>> -1 (binding). To me it doesn’t matter that the cost is low; if the >>>>> objection is technical then I think we need to respect the veto. There is a >>>>> fundamental disagreement as to what the correct technical way to address >>>>> this problem is (removal + documentation vs legacy config) and a PMC member >>>>> has vetoed the legacy config option. >>>>> >>>>> I think I disagree with Mark on the assertion that the veto needs to >>>>> have “substantial technical concern,” but rather a valid concern. I think >>>>> in addition to the veto they’ve also gone above and beyond providing >>>>> alternative ways to accomplish this. >>>>> >>>>> On a personal level: >>>>> >>>>> I am optimistic we can unblock the release but I think it’s important >>>>> to err on the side of respecting the veto here in the interest of perceived >>>>> fairness *especially* because of vendor aspects. >>>>> >>>>> To be clear I’ve worked at most of these companies (and with many of the >>>>> people) and I’m not ascribing malice to anyone in this, I think mistakes >>>>> happen (god knows I’ve had a fair share). I think we’re all doing our best >>>>> here and would ask that we show everyone understanding regardless of the >>>>> outcome. >>>>> >>>>> Sending hugs and good vibes to y’all.
>>>>> >>>>> On Sat, Mar 15, 2025 at 5:07 PM Holden Karau <holden.ka...@gmail.com> >>>>> wrote: >>>>> >>>>>> Given it’s the weekend maybe let’s give folks at least one full work >>>>>> day. >>>>>> >>>>>> On Sat, Mar 15, 2025 at 4:44 PM Mark Hamstra <markhams...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Quick administrative note: I don't see any reason why this vote >>>>>>> should >>>>>>> take a long time, so I expect to close the process and tally the >>>>>>> votes >>>>>>> in not much more than 48 hours. >>>>>>> >>>>>>> On Sat, Mar 15, 2025 at 4:35 PM Mark Hamstra <markhams...@gmail.com> >>>>>>> wrote: >>>>>>> > >>>>>>> > There has been enough discussion on this topic already, so I think >>>>>>> > that an immediate vote on the validity of Dongjoon's technical >>>>>>> > justification for his veto of the "Retain migration logic ... in >>>>>>> Spark >>>>>>> > 4.0.x" proposal is in order.
That technical justification has been >>>>>>> > called into question, and the guidance at >>>>>>> > https://www.apache.org/foundation/glossary.html#Veto leaves it to >>>>>>> the >>>>>>> > PMC to determine whether the technical justification is valid: "In >>>>>>> > case of doubt, deciding whether a technical justification is valid >>>>>>> is >>>>>>> > up to the PMC." As such, only PMC votes will decide the outcome of >>>>>>> > this vote. This is neither a vote on a code change itself nor a >>>>>>> vote >>>>>>> > on whether a package is ready for release, so it is a procedural vote >>>>>>> on >>>>>>> > whether the technical justification is valid. As such, the vote >>>>>>> will >>>>>>> > be decided by a simple majority where +1 votes hold that the >>>>>>> technical >>>>>>> > justification is not valid and -1 votes hold that the technical >>>>>>> > justification is valid. >>>>>>> > >>>>>>> > I would request that at least PMC members post more than just a >>>>>>> naked >>>>>>> > vote, but instead endeavor to give some reason why they have >>>>>>> assessed >>>>>>> > the technical justification as they have. I'll start: >>>>>>> > >>>>>>> > Despite all of the discussion related to Dongjoon's -1 vote, I must >>>>>>> > confess to still not being entirely clear on what his technical >>>>>>> > justification for that veto is. I see claims that including an >>>>>>> admonition >>>>>>> > in the Spark 4.0.x release notes that a prior upgrade to 3.5.5 is >>>>>>> > required to maintain the integrity of already existing data >>>>>>> streams would suffice, >>>>>>> > and I see assertions about the maintenance burden that including >>>>>>> the >>>>>>> > migration logic would impose on future Spark versions, but I don't >>>>>>> > think that I see any other technical objections. I do not believe >>>>>>> that >>>>>>> > the claimed technical justification is valid.
>>>>>>> > >>>>>>> > In requiring that a veto of a code change be accompanied by a >>>>>>> > technical justification for the veto, the Apache Voting Process >>>>>>> states >>>>>>> > that: "To prevent vetoes from being used capriciously, the voter >>>>>>> must >>>>>>> > provide with the veto a technical justification showing why the >>>>>>> change >>>>>>> > is bad (opens a security exposure, negatively affects performance, >>>>>>> > etc. ). A veto without a justification is invalid and has no >>>>>>> weight." >>>>>>> > This strongly implies that there must be something objectively >>>>>>> wrong >>>>>>> > with the proposed code change in that it causes significant harm in >>>>>>> > the way of opening a security exposure, negatively affecting >>>>>>> > performance, or presumably other significant user harms or perhaps >>>>>>> > even developer burdens. >>>>>>> > >>>>>>> > The proposed addition of the migration logic to Spark 4.0.x does >>>>>>> not >>>>>>> > cause any harm to Spark's users. For many users, those not using >>>>>>> > streaming data, the change will have no effect. For streaming users >>>>>>> > the change will be beneficial, not harmful. >>>>>>> > >>>>>>> > Neither do I find the claim of excessive, ongoing developer burden >>>>>>> to >>>>>>> > be persuasive. The changes are tiny and easily maintained -- in >>>>>>> fact, >>>>>>> > it wouldn't surprise me if no further changes to this migration >>>>>>> logic >>>>>>> > would be needed for a very long time. >>>>>>> > >>>>>>> > Some of what we are left with is just an expression of preference >>>>>>> for >>>>>>> > a technical alternative to the migration logic -- i.e. including in >>>>>>> > the release notes an admonition to first upgrade to 3.5.5. But the >>>>>>> > Apache Voting Process does not say that in the face of code >>>>>>> > alternatives A and B, a qualified voter is justified in vetoing A >>>>>>> if >>>>>>> > they prefer B. 
Instead, the Voting Process strongly implies that >>>>>>> > something more is needed to justify a veto, as I've already >>>>>>> covered. >>>>>>> > Thus I don't find Dongjoon's preference for the release notes >>>>>>> option >>>>>>> > to be adequate justification for the veto. >>>>>>> > >>>>>>> > The only remaining question I see is whether including >>>>>>> "databricks" in >>>>>>> > the Apache code is ever allowed or if any such instance must be >>>>>>> > expunged as soon as possible. I am not aware of any ASF policy that >>>>>>> > strictly forbids the mention of a vendor in Apache code for any >>>>>>> > reason, even if that vendor has a product based on Apache code, >>>>>>> even >>>>>>> > if that vendor enjoys a uniquely influential position vis a vis >>>>>>> some >>>>>>> > Apache code or project. Certainly the PMC has a duty to see to it >>>>>>> that >>>>>>> > neither Databricks nor any other vendor exercises influence or >>>>>>> control >>>>>>> > over Apache Spark outside of the established Apache process, but >>>>>>> the >>>>>>> > proposed migration code changes do not advantage Databricks -- if >>>>>>> > anything they remove a minor avenue of influence, and simply need >>>>>>> to >>>>>>> > mention "databricks" once in order to match and transform a >>>>>>> configuration >>>>>>> > into a vendor-neutral equivalent. While not optimal, I can't find >>>>>>> such >>>>>>> > a one-time inclusion of "databricks" to be truly offensive to any >>>>>>> > non-technical policy concern -- certainly not offensive to the >>>>>>> point >>>>>>> > that it outweighs the user advantage of including the migration >>>>>>> logic >>>>>>> > in Spark 4.0.x. >>>>>>> > >>>>>>> > In summary, I do not find Dongjoon's given technical justification >>>>>>> to >>>>>>> > be valid relative to the Apache requirements for a veto of a code >>>>>>> > change, so I must vote...
>>>>>>> > >>>>>>> > +1