+1 for incrementing the major number for library compatibility changes and
decoupling the release versions across languages.

I agree with the points that Sean made. This is confusing for users, and
needlessly so because there is only one binary format and I don't think
there are major pushes to introduce a breaking change.
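
To make the decoupling concrete, here is a minimal sketch, assuming each
library release declares which binary format versions it can read and write
and which one it writes by default. Nothing below is an existing Avro API:
LibraryRelease, can_exchange and requires_major_bump are hypothetical names,
and the version numbers are made up (3.1.0 and 5.2.0 are borrowed from
Ismaël's example further down the thread).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryRelease:
    """One release of one language library (hypothetical model, not an Avro API)."""
    language: str
    version: str            # library version, free to follow semantic versioning
    reads: frozenset        # binary format versions this release can read
    writes: frozenset       # binary format versions this release can write
    default_write: int      # format version written unless configured otherwise


def can_exchange(writer: LibraryRelease, reader: LibraryRelease) -> bool:
    """True if data written by `writer` with default settings is readable by `reader`."""
    return writer.default_write in reader.reads


def requires_major_bump(old: LibraryRelease, new: LibraryRelease) -> bool:
    """A change of the default written format is treated as a breaking change."""
    return old.default_write != new.default_write


# Made-up releases: both libraries only know format v1 today.
java = LibraryRelease("java", "5.2.0", frozenset({1}), frozenset({1}), 1)
python = LibraryRelease("python", "3.1.0", frozenset({1}), frozenset({1}), 1)

# A made-up future Java release that reads both formats and writes v2 by default.
java_next = LibraryRelease("java", "6.0.0", frozenset({1, 2}), frozenset({1, 2}), 2)

print(can_exchange(java, python))            # both directions work on format v1
print(can_exchange(java_next, python))       # python 3.1.0 cannot read format v2
print(requires_major_bump(java, java_next))  # default format changed
```

With that kind of declaration, the question "is Python Avro 3.1.0 compatible
with Java Avro 5.2.0?" becomes a lookup against format versions rather than
something inferred from the library version string.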

On Mon, Mar 2, 2020 at 7:48 AM Sean Busbey <bus...@apache.org> wrote:

> > So, as I understand it, the whole 1.x version should be binary compatible.
>
> the term "binary compatible" is overloaded, thanks to our
> participation in the Java ecosystem. Every data file written in Avro
> 1.y is intended to be readable by every other Avro 1.y release,
> regardless of language. That is true even if we know that there are
> cases where errors in various language libraries have prevented
> success.
>
> In Java ecosystem parlance the term "binary compatible" also refers to
> the ability to use a Java library in place of another without needing
> to recompile any code that refers to said library. It is definitely
> not the case that every Avro 1.y Java version has been binary
> compatible in this sense (in fact just the opposite).
>
> > For 1.9 we broke some of the APIs because, at the time, the decision was
> > that removing Jackson from the public API was required to move from
> > Codehaus Jackson (1.x) to the FasterXML one (2.x). The public API shouldn't
> > have exposed these methods in the first place.
>
> I think this is confusing the two issues of "compatibility of
> serialized bytes" and "compatibility of language APIs". This is part
> of why I think we should stop relying on the first version number to
> indicate "compatibility of serialized bytes".
>
> > I wouldn't be in favor of switching to 10.x (dropping the 1. in front of
> > it). What's the added value in this? I'm just afraid that changing this
> > would confuse a lot of downstream users.
>
> The big advantage is that literally everywhere else in the software
> ecosystems I've seen the first number in a version string is either
> "major version" or "marketing version", usually the former. Folks
> expect that if that version number hasn't changed then they should be
> able to "easily" upgrade to use the newer library. In Avro that
> plainly isn't true. I can think of multiple cases where other ASF
> projects have gotten surprised that going from e.g. Java libraries for
> Avro 1.7 to Avro 1.8 was a major version bump.
>
> I agree that going from 1.x.y to 10.y.z might be confusing due to the
> large number jump. I think going to "2.y.z" would clearly indicate
> folks needed to pay attention to a version difference because the
> first number changed. When we have their attention we can explain that
> we're using it as major version from now on.
>
>
> > Also, a similar discussion was on the Spark devlist, I think Michael has
> > some valid points here:
> > https://mail-archives.apache.org/mod_mbox/spark-dev/202002.mbox/browser
>
> This is a month of email from dev@spark. Could you link to a specific
> thread on lists.apache.org or provide a subject line?
>
> On Mon, Mar 2, 2020 at 3:00 AM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
> >
> > So, as I understand it, the whole 1.x version should be binary compatible.
> > So anything written with Java 1.x should be readable with Python 1.x.
> > We've been working on extending the integration tests as well.
> >
> > Not all languages support all the features, for example, many languages
> > lack support for logical types. In this case, a datetime would be just read
> > as an integer, so there is a fallback scenario.
> >
> > For 1.9 we broke some of the APIs because, at the time, the decision was
> > that removing Jackson from the public API was required to move from
> > Codehaus Jackson (1.x) to the FasterXML one (2.x). The public API shouldn't
> > have exposed these methods in the first place.
> >
> > I wouldn't be in favor of switching to 10.x (dropping the 1. in front of
> > it). What's the added value in this? I'm just afraid that changing this
> > would confuse a lot of downstream users.
> >
> > Also, a similar discussion was on the Spark devlist, I think Michael has
> > some valid points here:
> > https://mail-archives.apache.org/mod_mbox/spark-dev/202002.mbox/browser
> >
> > Maybe it is good to formalize our policy, and put it on the website.
> >
> > Cheers, Fokko Driesprong
> >
> > On Fri, Feb 28, 2020 at 17:53, Sean Busbey <bus...@apache.org> wrote:
> >
> > > Counterpoint on independently versioning the various languages. Do we
> > > know if Python Avro X works with Java Avro Y as it is? It seems like
> > > we already get surprised pretty often when they don't.
> > >
> > > If we stop including the "data compatibility version" or whatever
> > > we're calling the first number, we'll need to get more formal on
> > > versioning the specification and having libraries plainly label which
> > > specification(s) they comport to.
> > >
> > > At the very least it seems like we'd make the _easy_ path easier for
> > > the languages that are well maintained. Sure, it'll be a burden on those
> > > languages that aren't well maintained, but it seems like those are
> > > already in that position?
> > >
> > > On Thu, Feb 27, 2020 at 9:13 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> > > >
> > > > Bringing my comment from the JIRA ticket here for discussion:
> > > >
> > > > > "One argument against semantic versioning is the fact that Avro
> > > > > supports 9 language APIs, so if, let's say, C++ breaks its backwards
> > > > > compatibility, should we move the version number up for every single
> > > > > language? Sounds like a burden, and in particular not an easy task to
> > > > > track, since we do not have proper validation of breaking changes in
> > > > > place for every language at this point.
> > > > > ... (even if we separate release numbers per language) that seems
> > > > > like a lot of work for probably a similar output, because then users
> > > > > will wonder: wait, is Python Avro 3.1.0 compatible with Java Avro
> > > > > 5.2.0? And they probably will be, for the binary format."
> > > >
> > > > Also there is the case of interop tests: how will those act in this
> > > > case? We will need a compatibility matrix. Again, I am not sure it is
> > > > the best approach; it looks like lots of work for not much in return.
> > > >
> > > >
> > > >
> > > > On Thu, Feb 27, 2020 at 12:21 PM Ryan Skraba <r...@skraba.com> wrote:
> > > >
> > > > > Hello!  Resurrecting -- I think this was the last thread bringing up
> > > > > this issue!
> > > > >
> > > > > Since we've talked about releasing 1.10.x in May, and it's a nice
> > > > > round number... what do you think about
> > > > >
> > > > > 1) finally dropping the prefix for the "specification version" and
> > > > > calling it Avro 10.x
> > > > >
> > > > > 2) committing to semantic versioning for future releases
> > > > >
> > > > > I can see this being a hugely positive move for aligning with the
> > > > > expectations of developers and projects... but it leads to a lot of
> > > > > questions about releasing all the artifacts together.
> > > > >
> > > > > There's already a JIRA: https://issues.apache.org/jira/browse/AVRO-2687
> > > > >
> > > > > Ryan
> > > > >
> > > > > On Fri, Sep 13, 2019 at 12:00 PM Driesprong, Fokko
> > > > > <fo...@driesprong.frl> wrote:
> > > > > >
> > > > > > Thanks Sean for bringing this up.
> > > > > >
> > > > > > For the 1.9 branch there were some incompatible changes in the API
> > > > > > with respect to 1.8.2. We've removed Jackson
> > > > > > <https://github.com/apache/avro/pull/135> and Netty from the public
> > > > > > API. This is actually breaking some of the builds
> > > > > > <https://github.com/apache/incubator-iceberg/pull/297>, so,
> > > > > > unfortunately, it isn't compatible, hence the major version bump.
> > > > > >
> > > > > > The 1.9.x branch still supports the Joda time library but defaults
> > > > > > to jsr310, and is still compatible (I believe). For 1.10 the plan is
> > > > > > to completely remove Joda from the codebase since it is officially
> > > > > > deprecated in favor of Java 8 time (jsr310). A lot of this stuff is
> > > > > > just changes to the Java API of Avro, which mostly involves changes
> > > > > > to the LogicalTypes, so the actual format is still compatible (as it
> > > > > > should be).
> > > > > >
> > > > > > I agree with you, Sean, that a lot of the changes that are targeted
> > > > > > for 1.10 could be cherry-picked back to the 1.9 branch. If someone is
> > > > > > willing to do this, I would be grateful. However, maintaining a lot
> > > > > > of different branches is quite time-consuming in terms of release
> > > > > > management of the different versions. For Apache Avro 1.9.0 we
> > > > > > actually had some blocking regression bugs, hence the 1.9.1 release.
> > > > > >
> > > > > > Personally I don't have a big objection to bumping the major version
> > > > > > if there are breaking changes to one of the APIs. But a big +1 on
> > > > > > having a standardized approach to versioning; this also includes a
> > > > > > clearer approach to documenting the upgrade process and a better
> > > > > > changelog. I've added summaries of the releases at the GitHub
> > > > > > releases: https://github.com/apache/avro/releases but I think having
> > > > > > this on the Avro website might be more appropriate.
> > > > > >
> > > > > > Cheers, Fokko Driesprong
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Sep 11, 2019 at 18:17, Ryan Blue <rb...@netflix.com.invalid>
> > > > > > wrote:
> > > > > >
> > > > > > > > What would it look like if we *did* have to make an incompatible
> > > > > > > > data format change after adopting "conventional" library version
> > > > > > > > strings?
> > > > > > >
> > > > > > > Let's call these format v1 and v2. The library must produce v1 by
> > > > > > > default, so it's a matter of having support for writing v2. When the
> > > > > > > default changes to v2, then that behavior change would require a
> > > > > > > major version increase to signal changes to compatibility. I think
> > > > > > > we would also want clear documentation for each version that shows
> > > > > > > what versions of the format it can read, write, and what it will use
> > > > > > > by default. A table on the site would work.
> > > > > > >
> > > > > > > On Tue, Sep 10, 2019 at 2:51 PM Sean Busbey <bus...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > What would it look like if we *did* have to make an incompatible
> > > > > > > > data format change after adopting "conventional" library version
> > > > > > > > strings?
> > > > > > > >
> > > > > > > > What if we version the specification independently of the
> > > > > > > > libraries and then have the docs for the libraries claim spec
> > > > > > > > version compatibility?
> > > > > > > >
> > > > > > > > On Tue, Sep 10, 2019 at 3:55 PM Ryan Blue
> > > > > > > > <rb...@netflix.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > +1 for changing the version strings to follow a more standard
> > > > > > > > > convention. We don't have any breaking format changes, so I
> > > > > > > > > think it is expected that the format compatibility version
> > > > > > > > > won't change.
> > > > > > > > >
> > > > > > > > > On Tue, Sep 10, 2019 at 7:28 AM Sean Busbey <bus...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi folks!
> > > > > > > > > >
> > > > > > > > > > historically, Avro version numbers have had the form:
> > > > > > > > > >
> > > > > > > > > > <data compatibility> . <major library version> . <minor library version>
> > > > > > > > > >
> > > > > > > > > > That is, the first number says whether or not we expect data
> > > > > > > > > > serialization to be compatible, and the second says whether
> > > > > > > > > > we expect some library to be backwards incompatible, however
> > > > > > > > > > that's defined for the library's language. For example, in
> > > > > > > > > > the Java library, that is when we make changes to public
> > > > > > > > > > method signatures such that folks can't just swap out jar
> > > > > > > > > > files of our implementation.
> > > > > > > > > >
> > > > > > > > > > While getting myself up to speed on the state of our release
> > > > > > > > > > lines, I noticed we already have the 1.9 release line in a
> > > > > > > > > > branch, with master set up for the next major library
> > > > > > > > > > version. JIRA shows ~46 issues that are in 1.10 but not in a
> > > > > > > > > > 1.9 release[1].
> > > > > > > > > >
> > > > > > > > > > I haven't looked at all of them yet, but the few I sampled
> > > > > > > > > > don't seem to require a major version increment.
> > > > > > > > > >
> > > > > > > > > > I looked around our site and I also can't find anywhere that
> > > > > > > > > > we've documented our version strings. I know I've been in
> > > > > > > > > > discussions in other communities where our version strings
> > > > > > > > > > have been surprising. e.g. folks had assumed they can do a
> > > > > > > > > > low-effort upgrade from 1.7 to 1.8 only to find that there
> > > > > > > > > > were documented incompatibilities and behavior changes.
> > > > > > > > > >
> > > > > > > > > > Are we actively planning on rolling out 1.10? (like, do we
> > > > > > > > > > have a goal date?)
> > > > > > > > > >
> > > > > > > > > > I know that when 1.9 went out we EOLed 1.7 and 1.8 in part
> > > > > > > > > > due to the overhead of trying to maintain multiple release
> > > > > > > > > > lines (especially once that had so much baggage) while we're
> > > > > > > > > > trying to reestablish good habits on release cadence. How
> > > > > > > > > > many major versions are we planning to keep going once 1.10
> > > > > > > > > > is ready?
> > > > > > > > > >
> > > > > > > > > > What do folks think about starting a CONTRIBUTING.md with
> > > > > > > > > > some of these expectations? Is there a better place to track
> > > > > > > > > > it?
> > > > > > > > > >
> > > > > > > > > > [1] : https://s.apache.org/71yqv
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Ryan Blue
> > > > > > > > > Software Engineer
> > > > > > > > > Netflix
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Ryan Blue
> > > > > > > Software Engineer
> > > > > > > Netflix
> > > > > > >
> > > > >
> > >
>


-- 
Ryan Blue
Software Engineer
Netflix
