Thanks for the summary,

So, suppose someone discloses a 0-day vulnerability in a dependency of
arrow/js, and its maintainers ship a backward-compatible fix, but publish it
as a major release (1.9.3 to 2.0.0). Since npm uses semver, we must bump
the range in our package.json (i.e. ^1.9.3 to ^2.0.0). This requires a major
release of the Arrow libraries. So, we are all now under pressure, during a
0-day incident, to release a new version of arrow.
We finally release it, and arrow-js is backward incompatible. Now everyone
depending on arrow-js also has to bump arrow in their own
`package.json` (e.g. ^1.0.0 -> ^2.0.0). Since our release is backward
incompatible, they will have to perform a code migration, and so now they
are the ones under pressure.
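
To make the range arithmetic concrete, here is a minimal sketch of how a
caret range such as ^1.9.3 is resolved. It assumes plain
MAJOR.MINOR.PATCH strings and ignores pre-release tags; `parse` and
`caret_allows` are hypothetical helpers written for illustration, not part
of npm.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_allows(range_spec, candidate):
    """Return True if `candidate` satisfies a caret range such as '^1.9.3'."""
    low = parse(range_spec.lstrip("^"))
    # The exclusive upper bound bumps the left-most non-zero component.
    if low[0] > 0:
        high = (low[0] + 1, 0, 0)
    elif low[1] > 0:
        high = (0, low[1] + 1, 0)
    else:
        high = (0, 0, low[2] + 1)
    return low <= parse(candidate) < high


print(caret_allows("^1.9.3", "1.9.4"))  # True: a patch release is picked up
print(caret_allows("^1.9.3", "2.0.0"))  # False: a major release needs a manual bump
```

A fix published as 1.9.4 would be picked up without touching the manifest,
while the same fix published as 2.0.0 falls outside the range.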

Fortunately for the community, only arrow releases security patches on top
of backward incompatible changes. However, the moment other projects start
doing this, the process cascades through the dependency tree and can grow
exponentially. Also note that this is not a js-specific issue; it happens in
any programming language that arrow maintains whose package manager uses
semver for dependency resolution (npm, pip, cargo, etc). More dramatically,
by aligning all our libraries under the same version, we are connecting the
dependency tree of cargo with the dependency trees of pip and npm.
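
As a rough sketch of that cascade, the snippet below walks a reverse
dependency graph and lists every package that has to edit its manifest by
hand. The graph, the package names, and `forced_manual_bumps` are all
hypothetical and only serve to illustrate the argument.

```python
from collections import deque

# Hypothetical reverse-dependency graph: package -> packages that depend on it.
DEPENDENTS = {
    "vulnerable-lib": ["arrow-js"],
    "arrow-js": ["app-a", "lib-b"],
    "lib-b": ["app-c"],
    "app-a": [],
    "app-c": [],
}


def forced_manual_bumps(start, breaking_releasers):
    """Return the packages that must bump a dependency range by hand.

    A dependent is forced to bump whenever the package it depends on ships
    the fix as a major release. The pressure keeps spreading only through
    dependents that themselves answer with a major release.
    """
    forced, queue = [], deque([start])
    while queue:
        package = queue.popleft()
        if package not in breaking_releasers:
            continue  # fix shipped in-range: dependents are not forced
        for dependent in DEPENDENTS[package]:
            if dependent not in forced:
                forced.append(dependent)
                queue.append(dependent)
    return forced


# Only the vulnerable library ships its fix as a major release.
print(forced_manual_bumps("vulnerable-lib", {"vulnerable-lib"}))
# ['arrow-js']

# arrow-js also answers with a major release, so its dependents are hit too.
print(forced_manual_bumps("vulnerable-lib", {"vulnerable-lib", "arrow-js"}))
# ['arrow-js', 'app-a', 'lib-b']
```

Every additional project that answers a fix with a major release pushes
the manual work one level further down the tree.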

If we had not released backward incompatible code along with our security
fix, our dependents would only have needed to run `npm audit fix` to update
their package-lock.json (or requirements.txt, or whatever).

From all of this, I conclude that our versioning strategy implies that:

1. we do not have stable library releases: every release is potentially
backward incompatible, including security patches.

2. we both face and cause significant pressure when releasing a security
patch for a 0-day vulnerability, whether it affects arrow directly or comes
through a dependency of _any_ of its language-specific libraries.

Anyway, there is a consensus, so you likely thought this through more than
I and weighed it in the decision. Thus, thank you for the clarification and
great work on this awesome project.

Best,
Jorge



On Mon, Jul 27, 2020 at 11:35 PM Wes McKinney <wesmck...@gmail.com> wrote:

> Yes, the TL;DR is that we do not at this time intend to make minor
> LIBRARY releases in SemVer parlance, even if there are no backwards
> incompatible changes. Either we will make Major releases or Patch
> releases of the libraries. We will likely make minor releases of the
> columnar protocol, though.
>
> The other questions are handled in the Versioning document, we are now
> observing a dual-versioning scheme with FORMAT version being separate
> from LIBRARY version. Each version of the libraries will have a
> corresponding FORMAT version, and the format version will change more
> slowly than the libraries. So LIBRARY version 2.0.0 may use FORMAT
> version 1.0.0 unless new features are added in which case the format
> version may be 1.1.0
>
> On Mon, Jul 27, 2020 at 11:54 AM Neal Richardson
> <neal.p.richard...@gmail.com> wrote:
> >
> > https://arrow.apache.org/docs/format/Versioning.html is the statement that
> > came from the resolution of the previous discussion. IIRC the discussion
> > came between the 0.15 and 0.16 releases, if you want to search the mailing
> > list archives.
> >
> > I wouldn't want to speak for everyone, but I believe there are a few things
> > at play:
> >
> > * Release logistics: I believe the community has decided that it wants to
> > continue releasing all components at the same time, in which case having a
> > single release number greatly simplifies things.
> > * Compatibility of libraries: it's a lot easier to know that two libraries
> > in different languages are compatible because they have the same number.
> > * Version numbers are cheap, and (IMO) there's little useful information in
> > version numbers other than "higher means newer" (unless you're in Python
> > and have parallel major releases for years ;)
> >
> > While I might also question whether the next release for the library I'm
> > working on "should" have a major or minor version bump, I'm skeptical that
> > having that autonomy is worth the maintenance cost.
> >
> > Neal
> >
> >
> > On Mon, Jul 27, 2020 at 9:37 AM Jorge Cardoso Leitão <
> > jorgecarlei...@gmail.com> wrote:
> >
> > > Hi
> > >
> > > First off, congrats for the 1.0.0 release!
> > >
> > > I am writing because I am trying to understand the versioning schema we
> > > will use going onwards.
> > >
> > > AFAI understand, 1.0.0 was assigned to all subcomponents of arrow. I.e. I
> > > can now use pyarrow and assign something like >=1,<2 on a setup.py.
> > >
> > > However, looking at other parts of the project, I get the feeling that
> > > these components are less mature / more recent, and likely need more
> > > backward incompatible changes until a stable API is achieved. In other
> > > words, within arrow, I get the feeling that different parts are at
> > > significantly different stages of their development lifetime.
> > >
> > > How are we planning to reconcile this fact? E.g. I can see pyarrow not
> > > wanting to bump from 1 to 2 since no backward incompatible change was
> > > introduced, while other components have backward incompatible changes.
> > >
> > > A related question: what exactly are we versioning with this 1.0.0? The
> > > protocol? The individual APIs? The project as a whole?
> > >
> > > In my view, there is a case here to _not_ align the versions of the
> > > different components, exactly to avoid having one component's version (e.g.
> > > pyarrow) be dependent on other's code (e.g. rust arrow). However, I suspect
> > > that this discussion has already taken place and I have been unable to find
> > > a summary of it.
> > >
> > > Best,
> > > Jorge
> > >
>
