If we really want the discussion going forward to be "correct", I believe the revert PR has to be merged. After that, either my proposal is not accepted, or he starts a DISCUSS thread and eventually gets a VOTE to pass, or we simply leave the config deprecated instead of removed.
We don't need to do this right now, because this work is unnecessary if this VOTE passes. But if this VOTE fails, I argue that the revert PR must be merged, because a failed VOTE only means that he can block my proposal. It does not mean that he has consensus on his own proposal. That VOTE must happen separately, and in the meantime I want the codebase to be "neutral".

On Mon, Mar 17, 2025 at 11:36 AM Wenchen Fan <cloud0...@gmail.com> wrote:

> I've created the revert PR for branch-4.0: https://github.com/apache/spark/pull/50291 . We can merge PRs with lazy consensus but it's clear that this breaking change PR has failed to achieve consensus.
>
> I hope we now have a clear foundation for discussing solutions. As it stands, the misnamed configuration will be released in 4.0.0. I like Jungtaek’s proposal to deprecate it, but the decision is up to the community.
>
> On Mon, Mar 17, 2025 at 10:19 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
>> OK, let's be super honest.
>>
>> Again, I think you agree that *"both" proposals are "technically" correct (or one side can't have a strong theoretical evidence to counter the other side)*. So this is naturally fated to be decided by which side has more supporters. It would be very easy for me to VETO his proposal (although I don't have a binding vote, I think I have people who agree with me) if we definitely want to expand the interpretation of the VETO criteria in the Apache Voting Process.
>>
>> You said it is up to the PMC member exercising the veto to use their judgement, but it definitely must not be used to force the community to follow his proposal. The major argument here is that he can just VETO any proposal to keep the codebase the way he prefers, which I don't believe is a correct use of VETO.
>>
>> If we just revert the removal of the config, the result is "really" neutral: neither my proposal nor his proposal. Do we really want to do so?
>>
>> On Mon, Mar 17, 2025 at 10:55 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>
>>> First let me start with my key hope:
>>>
>>> We find a way to compromise and have the veto withdrawn rather than overridden.
>>>
>>> From what I understand of the change in question:
>>>
>>> My understanding, and I may be oversimplifying here, is that there are (at least) three technical paths forward (migration guide, legacy config with vendor string in it, non-vendor specific string legacy config). A PMC member vetoed one of them (named vendor legacy config) because he thought a different approach was better (migration guide), as he was worried that carrying that legacy config forward would encourage bad coding standards (e.g. we would add more vendor named config flags). To me that seems like a valid concern.
>>>
>>> My reasoning:
>>>
>>> Thinking back on other VETOs that I’ve been involved with in this project (DSV2, graceful decom, etc) this seems to meet the same bar. Hell we’ve had plenty of vetoes that didn’t offer an alternative.
>>>
>>> My personal understanding of where the bar sits for the “a technical justification showing why the change is bad” concern is pretty much “any not factually incorrect reasoning”. The text doesn’t have any particular “bar” for the level of “badness” and I think it’s up to the PMC member exercising the veto to use their judgement.
>>>
>>> In closing, I feel like the path we’re going down (overriding a veto) is not healthy for the project.
>>>
>>> Twitter: https://twitter.com/holdenkarau
>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>> Pronouns: she/her
>>>
>>> On Sun, Mar 16, 2025 at 6:28 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> Holden, I believe you should already know "both" approaches are "technically" correct. It's not about which one you prefer; this VOTE is not intended to extend the debate.
>>>>
>>>> Again, what you are encouraged to do here is not to express your preference between the two approaches, but to state your "technically valid" concern about my approach, backed by Dongjoon's veto (most likely you want to quote Dongjoon's post). This is very simple and I'm not sure you are doing exactly what the VOTE requires.
>>>>
>>>> On Mon, Mar 17, 2025 at 6:32 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>
>>>>> -1 (binding): to me it doesn’t matter that the cost is low; if the objection is technical then I think we need to respect the veto. There is a fundamental disagreement as to what the correct technical way to address this problem is (removal + documentation vs legacy config) and a PMC member has vetoed the legacy config option.
>>>>>
>>>>> I think I disagree with Mark's assertion that the veto needs a “substantial technical concern”; rather, it needs a valid concern. I think in addition to the veto they’ve also gone above and beyond in providing alternative ways to accomplish this.
>>>>>
>>>>> On a personal level:
>>>>>
>>>>> I am optimistic we can unblock the release but I think it’s important to err on the side of respecting the veto here in the interest of perceived fairness *especially* because of vendor aspects.
>>>>>
>>>>> To be clear I’ve worked at most of these companies (and with many of the people) and I’m not ascribing malice to anyone in this, I think mistakes happen (god knows I’ve had a fair share). I think we’re all doing our best here and would ask that we show everyone understanding regardless of the outcome.
>>>>>
>>>>> Sending hugs and good vibes to y’all.
>>>>>
>>>>> On Sat, Mar 15, 2025 at 5:07 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>
>>>>>> Given it’s the weekend maybe let’s give folks at least one full work day.
>>>>>>
>>>>>> On Sat, Mar 15, 2025 at 4:44 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>
>>>>>>> Quick administrative note: I don't see any reason why this vote should take a long time, so I expect to close the process and tally the votes in not much more than 48 hours.
>>>>>>>
>>>>>>> On Sat, Mar 15, 2025 at 4:35 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > There has been enough discussion on this topic already, so I think that an immediate vote on the validity of Dongjoon's technical justification for his veto of the "Retain migration logic ... in Spark 4.0.x" proposal is in order. That technical justification has been called into question, and the guidance at https://www.apache.org/foundation/glossary.html#Veto leaves it to the PMC to determine whether the technical justification is valid: "In case of doubt, deciding whether a technical justification is valid is up to the PMC." As such, only PMC votes will decide the outcome of this vote. This is neither a vote on a code change itself nor a vote on whether a package is ready for release, so it is a procedural vote on whether the technical justification is valid. As such, the vote will be decided by a simple majority where +1 votes hold that the technical justification is not valid and -1 votes hold that the technical justification is valid.
>>>>>>> >
>>>>>>> > I would request that at least PMC members post more than just a naked vote, but instead endeavor to give some reason why they have assessed the technical justification as they have. I'll start:
>>>>>>> >
>>>>>>> > Despite all of the discussion related to Dongjoon's -1 vote, I must confess to still not being entirely clear on what is his technical justification for that veto.
>>>>>>> > I see claims that it is enough to include an admonition in the Spark 4.0.x release notes that a prior upgrade to 3.5.5 is required to maintain the integrity of already existing data streams, and I see assertions about the maintenance burden that including the migration logic would impose on future Spark versions, but I don't think that I see any other technical objections. I do not believe that the claimed technical justification is valid.
>>>>>>> >
>>>>>>> > In requiring that a veto of a code change be accompanied by a technical justification for the veto, the Apache Voting Process states that: "To prevent vetoes from being used capriciously, the voter must provide with the veto a technical justification showing why the change is bad (opens a security exposure, negatively affects performance, etc. ). A veto without a justification is invalid and has no weight." This strongly implies that there must be something objectively wrong with the proposed code change in that it causes significant harm in the way of opening a security exposure, negatively affecting performance, or presumably other significant user harms or perhaps even developer burdens.
>>>>>>> >
>>>>>>> > The proposed addition of the migration logic to Spark 4.0.x does not cause any harm to Spark's users. For many users, those not using streaming data, the change will have no effect. For streaming users the change will be beneficial, not harmful.
>>>>>>> >
>>>>>>> > Neither do I find the claim of excessive, ongoing developer burden to be persuasive.
>>>>>>> > The changes are tiny and easily maintained -- in fact, it wouldn't surprise me if no further changes to this migration logic would be needed for a very long time.
>>>>>>> >
>>>>>>> > Some of what we are left with is just an expression of preference for a technical alternative to the migration logic -- i.e. including in the release notes an admonition to first upgrade to 3.5.5. But the Apache Voting Process does not say that in the face of code alternatives A and B, a qualified voter is justified in vetoing A if they prefer B. Instead, the Voting Process strongly implies that something more is needed to justify a veto, as I've already covered. Thus I don't find Dongjoon's preference for the release notes option to be adequate justification for the veto.
>>>>>>> >
>>>>>>> > The only remaining question I see is whether including "databricks" in the Apache Code is ever allowed or if any such instance must be expunged as soon as possible. I am not aware of any ASF policy that strictly forbids the mention of a vendor in Apache code for any reason, even if that vendor has a product based on Apache code, even if that vendor enjoys a uniquely influential position vis a vis some Apache code or project. Certainly the PMC has a duty to see to it that neither Databricks nor any other vendor exercises influence or control over Apache Spark outside of the established Apache process, but the proposed migration code changes do not advantage Databricks -- if anything they remove a minor avenue of influence, and simply need to mention "databricks" once in order to match and transform a configuration into a vendor neutral equivalent.
>>>>>>> > While not optimal, I can't find such a one-time inclusion of "databricks" to be truly offensive to any non-technical policy concern -- certainly not offensive to the point that it outweighs the user advantage of including the migration logic in Spark 4.0.x.
>>>>>>> >
>>>>>>> > In summary, I do not find Dongjoon's given technical justification to be valid relative to the Apache requirements for a veto of a code change, so I must vote...
>>>>>>> >
>>>>>>> > +1
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org