[VOTE] Technical Justification for the veto of the "Retain migration logic..." code change proposal is not valid

Mark Hamstra Sat, 15 Mar 2025 16:36:27 -0700

There has been enough discussion on this topic already, so I think
that an immediate vote on the validity of Dongjoon's technical
justification for his veto of the "Retain migration logic ... in Spark
4.0.x" proposal is in order. That technical justification has been
called into question, and the guidance at
https://www.apache.org/foundation/glossary.html#Veto leaves it to the
PMC to determine whether the technical justification is valid: "In
case of doubt, deciding whether a technical justification is valid is
up to the PMC." As such, only PMC votes will decide the outcome of
this vote. This is neither a vote on a code change itself nor a vote
on whether a package is ready for release, so it is a procedural vote
on whether the technical justification is valid. As such, the vote will
be decided by a simple majority where +1 votes hold that the technical
justification is not valid and -1 votes hold that the technical
justification is valid.


I would request that PMC members, at least, not post just a naked
vote, but instead endeavor to give some reason why they have assessed
the technical justification as they have. I'll start:

Despite all of the discussion related to Dongjoon's -1 vote, I must
confess to still not being entirely clear on what his technical
justification for that veto is. I see claims that it is sufficient to
include an admonition in the Spark 4.0.x release notes that a prior
upgrade to 3.5.5 is required to maintain the integrity of already
existing data streams, and I see assertions about the maintenance
burden that including the migration logic would impose on future Spark
versions, but I don't think that I see any other technical objections.
I do not believe that the claimed technical justification is valid.

In requiring that a veto of a code change be accompanied by a
technical justification for the veto, the Apache Voting Process states
that: "To prevent vetoes from being used capriciously, the voter must
provide with the veto a technical justification showing why the change
is bad (opens a security exposure, negatively affects performance,
etc. ). A veto without a justification is invalid and has no weight."
This strongly implies that there must be something objectively wrong
with the proposed code change in that it causes significant harm in
the way of opening a security exposure, negatively affecting
performance, or presumably other significant user harms or perhaps
even developer burdens.

The proposed addition of the migration logic to Spark 4.0.x does not
cause any harm to Spark's users. For many users, those not using
streaming data, the change will have no effect. For streaming users
the change will be beneficial, not harmful.

Neither do I find the claim of excessive, ongoing developer burden to
be persuasive. The changes are tiny and easily maintained -- in fact,
it wouldn't surprise me if no further changes to this migration logic
would be needed for a very long time.

Some of what we are left with is just an expression of preference for
a technical alternative to the migration logic -- i.e. including in
the release notes an admonition to first upgrade to 3.5.5. But the
Apache Voting Process does not say that in the face of code
alternatives A and B, a qualified voter is justified in vetoing A if
they prefer B. Instead, the Voting Process strongly implies that
something more is needed to justify a veto, as I've already covered.
Thus I don't find Dongjoon's preference for the release notes option
to be adequate justification for the veto.

The only remaining question I see is whether including "databricks" in
Apache code is ever allowed or if any such instance must be
expunged as soon as possible. I am not aware of any ASF policy that
strictly forbids the mention of a vendor in Apache code for any
reason, even if that vendor has a product based on Apache code, even
if that vendor enjoys a uniquely influential position vis a vis some
Apache code or project. Certainly the PMC has a duty to see to it that
neither Databricks nor any other vendor exercises influence or control
over Apache Spark outside of the established Apache process, but the
proposed migration code changes do not advantage Databricks -- if
anything they remove a minor avenue of influence, and simply need to
mention "databricks" once in order to match and transform a
configuration into a vendor-neutral equivalent. While not optimal, I
can't find such
a one-time inclusion of "databricks" to be truly offensive to any
non-technical policy concern -- certainly not offensive to the point
that it outweighs the user advantage of including the migration logic
in Spark 4.0.x.

In summary, I do not find Dongjoon's given technical justification to
be valid relative to the Apache requirements for a veto of a code
change, so I must vote...

+1

