At a slightly higher level, not restricted to just this issue/PR, there
is a distinct difference between "if there is no regression, then we
can release without fixing the issue" and "if there is no regression,
then we must release without fixing the issue". I don't believe that
the latter has ever been established as agreed-upon policy in the
Spark project. I also don't believe that it is a good policy: there
are issues worth taking the time to fix (or at least carefully
discuss) even if they are not regressions.

On Tue, Dec 16, 2025 at 5:54 AM Herman van Hovell via dev
<[email protected]> wrote:
>
> Dongjoon,
>
> I have a couple of problems with this course of action:
>
> - You seem to be favoring speed over quality here. Even if my vote were
>   erroneous, you should give me more than two hours to respond. This is a
>   global community; not everyone is awake at the same time. As far as I know,
>   we try to follow a consensus-driven decision-making process here; this
>   seems to be diametrically opposed to that.
> - The problem itself is serious, since it can cause driver crashes. In
>   general I believe that we should not be in the business of shipping
>   obviously broken things. The only thing you are doing now is increasing
>   toil by forcing us to release a patch version almost immediately.
> - The offending change was backported to a maintenance release. That is
>   something different from it being a previously known problem.
> - I am not sure I follow the PR argument. You merged my initial PR without
>   even checking in with me. That PR fixed the issue; it just needed proper
>   tests and some touch-ups (again, quality is important). I opened a
>   follow-up that contains proper testing, and yes, it fails because of a
>   change in error types; it happens, and I will fix it. The statement that
>   we don't have a fix is untrue, and the fact that you state otherwise makes
>   me seriously doubt your judgement here. You could have asked me or someone
>   else, or you could have leaned in and checked it yourself.
>
> I would like to understand why there is such a rush here.
>
> Kind regards,
> Herman
>
> On Tue, Dec 16, 2025 at 7:27 AM Dongjoon Hyun <[email protected]> wrote:
>>
>> After rechecking, this vote passed.
>>
>> I'll send a vote result email.
>>
>> Dongjoon.
>>
>> On 2025/12/16 11:03:39 Dongjoon Hyun wrote:
>> > Hi, All.
>> >
>> > I've been working with Herman's PRs so far.
>> >
>> > As a matter of fact-checking, I need to correct two things in the RC3 thread.
>> >
>> > First, Herman claimed that he found a regression in Apache Spark 4.1.0, but
>> > that is not accurate, because Apache Spark 4.0.1 has also had SPARK-53342
>> > since 2025-09-06.
>> >
>> > Second, although Herman shared a patch with us last Friday, he also made
>> > another PR containing the main code change 9 hours ago. In addition,
>> > unfortunately, it has not passed our CIs yet. This simply means that there
>> > is no complete patch in the community yet for either Apache Spark 4.1.0 or
>> > 4.0.2.
>> >
>> > https://github.com/apache/spark/pull/53480
>> > ([SPARK-54696][CONNECT] Clean-up Arrow Buffers - follow-up)
>> >
>> > In short, he seems to have blocked RC3 by mistake. I'm re-checking the
>> > situation around the RC3 vote and `branch-4.1`.
>> >
>> > Dongjoon.
>> >
>> > > > >
>> > > > > On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
>> > > > > > I pasted a non-existent link for the root cause. The actual link is here:
>> > > > > > https://issues.apache.org/jira/browse/SPARK-53342
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell <[email protected]> wrote:
>> > > > > >
>> > > > > > > Hey Dongjoon,
>> > > > > > >
>> > > > > > > Regarding your questions.
>> > > > > > >
>> > > > > > >    1. If you define a large-ish local relation (which makes us
>> > > > > > >    cache it on the server side) and keep using it, then we leak
>> > > > > > >    off-heap memory every time it is used. At some point the OS
>> > > > > > >    will OOM-kill the driver. While I have a repro, testing it
>> > > > > > >    like this in CI is not a good idea. As an alternative, I am
>> > > > > > >    working on a test that checks buffer clean-up (see the rough
>> > > > > > >    sketch after this list). For the record, I don't appreciate
>> > > > > > >    the term `claim` here; I am not blocking a release without
>> > > > > > >    genuine concern.
>> > > > > > >    2. The root cause is
>> > > > > > >    https://databricks.atlassian.net/browse/SPARK-53342 and not
>> > > > > > >    the large local relations work.
>> > > > > > >    3. A PR has been open since Friday:
>> > > > > > >    https://github.com/apache/spark/pull/53452. I hope that I can
>> > > > > > >    get it merged today.
>> > > > > > >    4. I don't see a reason why.
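>> > > > > > >
>> > > > > > > A rough sketch of the kind of repro I have in mind (the connect
>> > > > > > > endpoint, row count, and loop count below are placeholders, not
>> > > > > > > the actual test):
>> > > > > > >
>> > > > > > >     # Illustration only: build a local relation big enough to be
>> > > > > > >     # cached on the server side, then reuse it many times while
>> > > > > > >     # watching the driver process RSS.
>> > > > > > >     from pyspark.sql import SparkSession
>> > > > > > >
>> > > > > > >     spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
>> > > > > > >
>> > > > > > >     # A large-ish local relation (roughly 100 MB of payload here).
>> > > > > > >     rows = [(i, "x" * 100) for i in range(1_000_000)]
>> > > > > > >     df = spark.createDataFrame(rows, ["id", "payload"])
>> > > > > > >
>> > > > > > >     # Every reuse should release its off-heap buffers; with the bug,
>> > > > > > >     # driver RSS keeps growing until the OS OOM-kills the process.
>> > > > > > >     for _ in range(1_000):
>> > > > > > >         df.count()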
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Herman
>> > > > > > >
>> > > > > > > On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun <[email protected]> wrote:
>> > > > > > >
>> > > > > > >> How can we verify the regression, Herman?
>> > > > > > >>
>> > > > > > >> It's a little difficult for me to evaluate your claim so far due to the
>> > > > > > >> lack of shared information. Specifically, there has been no update for the
>> > > > > > >> last 3 days on "SPARK-54696 (Spark Connect LocalRelation support leak
>> > > > > > >> off-heap memory)" since you created it.
>> > > > > > >>
>> > > > > > >> Could you provide us with more technical information about your Spark
>> > > > > > >> Connect issue?
>> > > > > > >>
>> > > > > > >> 1. How can we reproduce your claim? Do you have a test case?
>> > > > > > >>
>> > > > > > >> 2. For the root cause, I'm wondering whether you literally mean
>> > > > > > >> SPARK-53917 (Support large local relations) or another JIRA issue. Which
>> > > > > > >> commit is the root cause?
>> > > > > > >>
>> > > > > > >> 3. Since SPARK-54696 has been assigned to you for the last 3 days, do you
>> > > > > > >> want to provide a PR soon?
>> > > > > > >>
>> > > > > > >> 4. If you need more time, shall we simply revert the root cause from
>> > > > > > >> Apache Spark 4.1.0?
>> > > > > > >>
>> > > > > > >> Thanks,
>> > > > > > >> Dongjoon
>> > > > > > >>
>> > > > > > >> On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
>> > > > > > >> > Yes. It is a regression in Spark 4.1. The root cause is a change where
>> > > > > > >> > we fail to clean up allocated (off-heap) buffers.
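>> > > > > > >> >
>> > > > > > >> > A clean-up check for this has roughly the following shape; the sketch
>> > > > > > >> > below uses pyarrow's allocator stats purely as an analogy, since the
>> > > > > > >> > actual buffers live in the server-side JVM Arrow allocator:
>> > > > > > >> >
>> > > > > > >> >     # Analogy only: allocate a buffer, release it, and assert that
>> > > > > > >> >     # allocator usage returns to its baseline.
>> > > > > > >> >     import pyarrow as pa
>> > > > > > >> >
>> > > > > > >> >     baseline = pa.total_allocated_bytes()
>> > > > > > >> >
>> > > > > > >> >     buf = pa.allocate_buffer(64 * 1024 * 1024)  # 64 MiB buffer
>> > > > > > >> >     assert pa.total_allocated_bytes() > baseline
>> > > > > > >> >
>> > > > > > >> >     del buf  # releasing the buffer should restore the baseline
>> > > > > > >> >     assert pa.total_allocated_bytes() == baseline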
>> > > > > > >> >
>> > > > > > >> > On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun <[email protected]> wrote:
>> > > > > > >> >
>> > > > > > >> > > Hi, Herman.
>> > > > > > >> > >
>> > > > > > >> > > Do you mean that it is a regression in Apache Spark 4.1.0?
>> > > > > > >> > >
>> > > > > > >> > > If so, do you know what the root cause was?
>> > > > > > >> > >
>> > > > > > >> > > Dongjoon.
>> > > > > > >> > >
>> > > > > > >> > > On 2025/12/13 23:09:02 Herman van Hovell via dev wrote:
>> > > > > > >> > > > -1. We need to get https://issues.apache.org/jira/browse/SPARK-54696 fixed.
>> > > > > > >> > > >
>> > > > > > >> > > > On Sat, Dec 13, 2025 at 11:07 AM Jules Damji <[email protected]> wrote:
>> > > > > > >> > > >
>> > > > > > >> > > > > +1 non-binding
>> > > > > > >> > > > > —
>> > > > > > >> > > > > Sent from my iPhone
>> > > > > > >> > > > > Pardon the dumb thumb typos :)
>> > > > > > >> > > > >
>> > > > > > >> > > > > > On Dec 11, 2025, at 8:34 AM, [email protected] 
>> > > > > > >> > > > > > wrote:
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > Please vote on releasing the following candidate as Apache Spark
>> > > > > > >> > > > > > version 4.1.0.
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The vote is open until Sun, 14 Dec 2025 09:34:31 PST and passes if a
>> > > > > > >> > > > > > majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > [ ] +1 Release this package as Apache Spark 4.1.0
>> > > > > > >> > > > > > [ ] -1 Do not release this package because ...
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > To learn more about Apache Spark, please see https://spark.apache.org/
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The tag to be voted on is v4.1.0-rc3 (commit e221b56be7b):
>> > > > > > >> > > > > > https://github.com/apache/spark/tree/v4.1.0-rc3
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The release files, including signatures, digests, etc. can be found at:
>> > > > > > >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > Signatures used for Spark RCs can be found in this file:
>> > > > > > >> > > > > > https://downloads.apache.org/spark/KEYS
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The staging repository for this release can be found at:
>> > > > > > >> > > > > > https://repository.apache.org/content/repositories/orgapachespark-1508/
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The documentation corresponding to this release can be found at:
>> > > > > > >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > The list of bug fixes going into 4.1.0 can be found at the following URL:
>> > > > > > >> > > > > > https://issues.apache.org/jira/projects/SPARK/versions/12355581
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > FAQ
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > =========================
>> > > > > > >> > > > > > How can I help test this release?
>> > > > > > >> > > > > > =========================
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > If you are a Spark user, you can help us test this release by taking
>> > > > > > >> > > > > > an existing Spark workload and running it on this release candidate,
>> > > > > > >> > > > > > then reporting any regressions.
>> > > > > > >> > > > > >
>> > > > > > >> > > > > > If you're working in PySpark you can set up a virtual env and install
>> > > > > > >> > > > > > the current RC via "pip install
>> > > > > > >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
>> > > > > > >> > > > > > and see if anything important breaks.
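>> > > > > > >> > > > > >
>> > > > > > >> > > > > > For example, a minimal PySpark smoke test in such a virtual env might
>> > > > > > >> > > > > > look like the sketch below (the workload itself is just a placeholder):
>> > > > > > >> > > > > >
>> > > > > > >> > > > > >     # Quick local sanity check against the installed RC.
>> > > > > > >> > > > > >     from pyspark.sql import SparkSession
>> > > > > > >> > > > > >
>> > > > > > >> > > > > >     spark = SparkSession.builder.master("local[2]").getOrCreate()
>> > > > > > >> > > > > >     df = spark.range(1000).selectExpr("id", "id % 7 AS bucket")
>> > > > > > >> > > > > >     assert df.groupBy("bucket").count().count() == 7
>> > > > > > >> > > > > >     spark.stop()
>> > > > > > >> > > > > >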
>> > > > > > >> > > > > > In Java/Scala, you can add the staging repository to your project's
>> > > > > > >> > > > > > resolvers and test with the RC (make sure to clean up the artifact
>> > > > > > >> > > > > > cache before/after so you don't end up building with an out-of-date
>> > > > > > >> > > > > > RC going forward).
>> > > > > > >> > > > > >
>> > > > > > >> > > > > >

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
