Hi,

With respect to integration tests, I agree that it is important to have a
plan prior to this.

What we have been doing in apache/arrow:

1. only release if integration tests pass against each other
2. release the signed tar with the latest of every implementation (i.e.
master)

My suggestion for independent versioning:

CI:

* in Rust, run integration tests against the latest apache/arrow master on
every PR
* in apache/arrow, run integration tests against the latest released Rust
version

Release mechanism:

1. an arrow crate can only be released if it passes integration tests
against the current latest apache/arrow master
2. apache/arrow master can release if its integration tests pass against
the latest released Rust crate
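
As a rough sketch of the two release rules above (the type, function, and
parameter names are hypothetical and not part of any existing tooling such
as archery):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationResult:
    """Outcome of one integration run (hypothetical type for illustration)."""
    rust_version: str  # the Rust crate version that was tested
    arrow_ref: str     # the apache/arrow ref it was tested against, e.g. "master"
    passed: bool       # whether the integration tests passed


def can_release_rust_crate(result: IntegrationResult) -> bool:
    """Rule 1: a crate may be released only if it passes against apache/arrow master."""
    return result.arrow_ref == "master" and result.passed


def can_release_arrow(result: IntegrationResult, latest_rust_release: str) -> bool:
    """Rule 2: apache/arrow may release if master passes against the latest released crate."""
    return (
        result.arrow_ref == "master"
        and result.rust_version == latest_rust_release
        and result.passed
    )
```

Both rules read the same kind of test result; they differ only in which
side is pinned to a released version and which side is the tip of master.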

The common scenario is that the integration tests in apache/arrow against
Rust pass, and thus apache/arrow would just need to bundle the latest Rust
release.

If tests in apache/arrow fail, then some change in apache/arrow caused our
latest release to stop integrating (since we integration-tested that version
against master prior to our release). This implies that the current Rust
release is out of spec, and we must thus release a patch asap to correct
this (just like we would need to push a commit to apache/arrow asap). Once
that patch is released, apache/arrow becomes green again and can bundle it
in the signed Apache Arrow release.

In the unlikely event that the latest release is unable to pass integration
tests *and*, despite best efforts, Rust is unable to release a patch in
time, we *may* still bundle a previous release of the Rust crate, thereby
not blocking the whole release (i.e. this allows us to fall back to a
previous release without a mass revert on the apache/arrow repo).
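
The fallback could be sketched as follows. This is only an illustration:
the function name and the `passes_integration` callback are hypothetical,
and it assumes that a fallback release must itself pass integration tests
against apache/arrow master, which the proposal above does not state
explicitly.

```python
from typing import Callable, Optional, Sequence


def pick_rust_release_to_bundle(
    releases_newest_first: Sequence[str],
    passes_integration: Callable[[str], bool],
) -> Optional[str]:
    """Return the newest Rust release that passes integration tests, if any.

    `releases_newest_first` lists released crate versions, newest first.
    `passes_integration` is a hypothetical callback that runs (or looks up)
    the integration tests of one release against apache/arrow master.
    """
    for version in releases_newest_first:
        if passes_integration(version):
            # Normally this is the first entry, i.e. the latest release.
            return version
    # No released version integrates; nothing can be bundled as-is.
    return None
```

In the common scenario the loop returns on its first iteration; the older
entries only matter when the latest release is red and no patch is ready.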

> * If Rust runs against the latest nightly of Arrow the how will Rust
> release without a new Arrow release?

Not sure if this answers the question, but Rust does not compile or link
against any other implementation, so there are no ABI contracts. Its "only"
contract is the spec (in-memory, IPC, flight, C data interface, etc).

A related point is that when we release a Rust version, we can upload
"integration test artifacts" separately (the same binaries that we
currently use in our integration tests, or a docker image with them) that
apache/arrow can use to run integration tests. This would allow our CI at
apache/arrow to download these artifacts and run tests as usual via archery
and the CLI, without having to compile them. This would alleviate some of
the challenges around integration testing, whereby every implementation is
currently built on every run and in sequence.

If someone thinks that it is useful, I would be happy to open a JIRA on
this and draft a Google doc to work out a technical design.

Best,
Jorge


On Sat, Apr 10, 2021 at 1:57 AM Weston Pace <weston.p...@gmail.com> wrote:

> > I'm assuming the idea is that the existing integration tests will remain
> in apache/arrow. Will you also run the integration test suites on your rust
> repository CI checks?
>
> Furthermore, against what version will these tests run?
>
> * If Arrow runs against the latest release of Rust then it will lag
> behind and issues may be detected later.
> * If Arrow runs against the latest nightly of Rust then things will
> get tricky at release time (all Arrow integrations tests pass but Rust
> isn't ready to cut a new release and Arrow tests fail against the
> latest released Rust).
>
> Assuming Rust is also running integration tests against Arrow
> (probably a good idea) you get a similar problem (this one might be
> trickier given the relative frequencies)...
>
> * If Rust runs against the latest release of Arrow then it will lag
> behind (several months).  There will be a "catching up" period after
> Arrow releases.
> * If Rust runs against the latest nightly of Arrow the how will Rust
> release without a new Arrow release?
>
> Note, these problems technically exist now with the concept that any
> language can release a patch at any time.  Also, since Rust isn't
> directly compiling against other Arrow libs and we are only talking
> about interoperability it's probably not going to be too big of a
> deal.  Still, worth giving some thought ahead of time.
>
> On Fri, Apr 9, 2021 at 1:11 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> >
> > >
> > > With this explanation do you still have a concern? There is no
> suggestion
> > > of making releases that depend on GitHub hashes.
> >
> > No, I don't think so.  IIUC you are saying the crates dependency does not
> > imply the crate artifacts are published elsewhere.  This sounds inline
> with
> > policies to me.  For some reason I thought the notion of crates implied
> > publishing to Rusts package management system.
> >
> > On Fri, Apr 9, 2021 at 4:07 PM Andy Grove <andygrov...@gmail.com> wrote:
> >
> > > Hi Micah,
> > >
> > > During development, the Rust crates have local dependencies on each
> other
> > > based on relative file system paths. At release time, we change these
> to
> > > versioned dependencies before publishing, because it isn't possible to
> > > publish a crate that depends on non-published crates.
> > >
> > > With the code in separate repositories, we would still need an
> equivalent
> > > mechanism for DataFusion to use the Arrow code that is under
> development
> > > but we would point to a GitHub hash rather than a relative path. We
> should
> > > still update to use versioned dependencies when releasing.
> > >
> > > I will revise the text in the document to better explain what this
> means.
> > >
> > > With this explanation do you still have a concern? There is no
> suggestion
> > > of making releases that depend on GitHub hashes.
> > >
> > > Thanks,
> > >
> > > Andy.
> > >
> > >
> > >
> > > On Fri, Apr 9, 2021 at 4:57 PM Micah Kornfield <emkornfi...@gmail.com>
> > > wrote:
> > >
> > >> >
> > >> > " Crates can depend on GitHub commit hashes between releases"
> > >>
> > >>
> > >> This sounds  like it might not align with ASF release policies [1].
> > >>
> > >> [1]
> https://www.apache.org/legal/release-policy.html#release-definition
> > >>
> > >> On Fri, Apr 9, 2021 at 1:34 PM Neal Richardson <
> > >> neal.p.richard...@gmail.com>
> > >> wrote:
> > >>
> > >> > Thanks, Andy. Two areas of concern I think we should have some
> answer
> > >> for
> > >> > before going forward with this (and I make no opinions as to what
> the
> > >> > "right" answers are, just raising them for discussion):
> > >> >
> > >> > 1. Integration testing: what is our workflow for ensuring that our
> > >> > implementations are integration tested, and what do we do when
> changes
> > >> > (whether in apache/arrow or in apache/arrow-rs) introduce
> > >> > regressions/failures? I'm assuming the idea is that the existing
> > >> > integration tests will remain in apache/arrow. Will you also run the
> > >> > integration test suites on your rust repository CI checks?
> > >> > 2. Versioning: one rationale from our current policy of "everyone
> > >> releases
> > >> > together" is that you don't have to guess as much whether (for
> example)
> > >> > Arrow Java 3.0 and Arrow Rust 3.0 are compatible and using the same
> > >> format.
> > >> > It's kind of a heuristic for what library versions were integration
> > >> tested
> > >> > with each other. It sounds like (but maybe I misunderstand) that
> y'all
> > >> are
> > >> > looking to break from that. But if Arrow C++ goes to version 7.0 by
> the
> > >> end
> > >> > of the year and arrow-rs chooses to go to 15.4, or 3.12, or
> whatever,
> > >> does
> > >> > that create confusion or doubt that works against the Arrow goal of
> easy
> > >> > interoperability?
> > >> >
> > >> > Neal
> > >> >
> > >> > On Fri, Apr 9, 2021 at 8:18 AM Andy Grove <andygrov...@gmail.com>
> > >> wrote:
> > >> >
> > >> > > Following on from the email thread "Rust sync meeting" I would
> like to
> > >> > > start a new discussion about moving the Rust components out to new
> > >> GitHub
> > >> > > repositories and using a new process for issues and release
> > >> management.
> > >> > >
> > >> > > I have started a Google document [1] with details and to track the
> > >> work
> > >> > > required for this effort but I will summarize the key points of
> the
> > >> > > proposal here:
> > >> > >
> > >> > >
> > >> > >    -
> > >> > >
> > >> > >    Move existing Rust code into two new repositories
> > >> > >    -
> > >> > >
> > >> > >       apache/arrow-rs
> > >> > >       -
> > >> > >
> > >> > >          Arrow + Parquet crates
> > >> > >          -
> > >> > >
> > >> > >       apache/datafusion
> > >> > >       -
> > >> > >
> > >> > >          DataFusion + Ballista crates (which are expected to
> merge to
> > >> > some
> > >> > >          degree over time)
> > >> > >          -
> > >> > >
> > >> > >          TPC-H benchmarks
> > >> > >          -
> > >> > >
> > >> > >       Use GitHub issues for issue tracking
> > >> > >       -
> > >> > >
> > >> > >    Decouple release process
> > >> > >    -
> > >> > >
> > >> > >       Crates are released individually
> > >> > >       -
> > >> > >
> > >> > >       A vote on the source release of the released crate is held
> over
> > >> the
> > >> > >       mailing list as usual.
> > >> > >       -
> > >> > >
> > >> > >       Rust does not need to release a new version when the rest of
> > >> Arrow
> > >> > >       releases; we bundle our latest released crates to the signed
> > >> tar.
> > >> > >       -
> > >> > >
> > >> > >       Crates can depend on GitHub commit hashes between releases
> > >> > >
> > >> > >
> > >> > > The Google document may be the best place to collaborate on the
> > >> proposal
> > >> > > but I can update the document based on any comments in this email
> > >> thread
> > >> > as
> > >> > > well.
> > >> > >
> > >> > > Note that I have excluded discussion about arrow2/parquet2 from
> this
> > >> > > proposal and I believe we should discuss that separately as a
> > >> follow-on
> > >> > > discussion.
> > >> > >
> > >> > > I look forward to hearing opinions on this both from current Rust
> > >> > > maintainers and contributors and also from the wider Arrow
> community.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Andy.
> > >> > >
> > >> > > [1]
> > >> > >
> > >> > >
> > >> >
> > >>
> https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit?usp=sharing
> > >> > >
> > >> >
> > >>
> > >
>
