> I'm assuming the idea is that the existing integration tests will remain in 
> apache/arrow. Will you also run the integration test suites on your rust 
> repository CI checks?

Furthermore, against what version will these tests run?

* If Arrow runs against the latest release of Rust then it will lag
behind and issues may be detected later.
* If Arrow runs against the latest nightly of Rust then things will
get tricky at release time (all Arrow integration tests pass against
nightly, but Rust isn't ready to cut a new release, so Arrow tests fail
against the latest released Rust).

Assuming Rust is also running integration tests against Arrow
(probably a good idea) you get a similar problem (this one might be
trickier given the relative frequencies)...

* If Rust runs against the latest release of Arrow then it will lag
behind (by several months). There will be a "catching up" period after
each Arrow release.
* If Rust runs against the latest nightly of Arrow then how will Rust
release without a new Arrow release?

Note, these problems technically exist now with the concept that any
language can release a patch at any time.  Also, since Rust isn't
directly compiling against other Arrow libs and we are only talking
about interoperability it's probably not going to be too big of a
deal.  Still, worth giving some thought ahead of time.
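
For concreteness, the three dependency forms Andy describes in the
quoted message below (relative path during development, GitHub hash
between releases, versioned dependency at release time) might look
roughly like this in a Cargo.toml. This is only a sketch: the relative
path, the repository URL (taken from the proposed apache/arrow-rs
name), the commit hash placeholder, and the version number are all
assumptions for illustration, not copied from the actual manifests.

```toml
# Hypothetical Cargo.toml for a crate that depends on the arrow crate.
# Only one of the three forms would be active at any given time.

[dependencies]
# 1. Monorepo development: relative filesystem path (cannot be published as-is).
# arrow = { path = "../arrow" }

# 2. Separate repositories, between releases: pin to a specific commit.
#    "<commit-hash>" is a placeholder, not a real revision.
# arrow = { git = "https://github.com/apache/arrow-rs", rev = "<commit-hash>" }

# 3. At release time: versioned dependency on a published crate.
#    The version number here is illustrative only.
arrow = "4.0.0"
```

Only the third form is something a release would ship with, which
matches the statement that no release would depend on a GitHub hash.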

On Fri, Apr 9, 2021 at 1:11 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> >
> > With this explanation do you still have a concern? There is no suggestion
> > of making releases that depend on GitHub hashes.
>
> No, I don't think so.  IIUC you are saying the crates dependency does not
> imply the crate artifacts are published elsewhere.  This sounds in line with
> policies to me.  For some reason I thought the notion of crates implied
> publishing to Rust's package management system.
>
> On Fri, Apr 9, 2021 at 4:07 PM Andy Grove <andygrov...@gmail.com> wrote:
>
> > Hi Micah,
> >
> > During development, the Rust crates have local dependencies on each other
> > based on relative file system paths. At release time, we change these to
> > versioned dependencies before publishing, because it isn't possible to
> > publish a crate that depends on non-published crates.
> >
> > With the code in separate repositories, we would still need an equivalent
> > mechanism for DataFusion to use the Arrow code that is under development
> > but we would point to a GitHub hash rather than a relative path. We should
> > still update to use versioned dependencies when releasing.
> >
> > I will revise the text in the document to better explain what this means.
> >
> > With this explanation do you still have a concern? There is no suggestion
> > of making releases that depend on GitHub hashes.
> >
> > Thanks,
> >
> > Andy.
> >
> >
> >
> > On Fri, Apr 9, 2021 at 4:57 PM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> >
> >> >
> >> > " Crates can depend on GitHub commit hashes between releases"
> >>
> >>
> >> This sounds like it might not align with ASF release policies [1].
> >>
> >> [1] https://www.apache.org/legal/release-policy.html#release-definition
> >>
> >> On Fri, Apr 9, 2021 at 1:34 PM Neal Richardson <
> >> neal.p.richard...@gmail.com>
> >> wrote:
> >>
> >> > Thanks, Andy. Two areas of concern I think we should have some answer
> >> > for before going forward with this (and I make no opinions as to what
> >> > the "right" answers are, just raising them for discussion):
> >> >
> >> > 1. Integration testing: what is our workflow for ensuring that our
> >> > implementations are integration tested, and what do we do when changes
> >> > (whether in apache/arrow or in apache/arrow-rs) introduce
> >> > regressions/failures? I'm assuming the idea is that the existing
> >> > integration tests will remain in apache/arrow. Will you also run the
> >> > integration test suites on your rust repository CI checks?
> >> > 2. Versioning: one rationale from our current policy of "everyone
> >> > releases together" is that you don't have to guess as much whether (for
> >> > example) Arrow Java 3.0 and Arrow Rust 3.0 are compatible and using the
> >> > same format. It's kind of a heuristic for what library versions were
> >> > integration tested with each other. It sounds like (but maybe I
> >> > misunderstand) that y'all are looking to break from that. But if Arrow
> >> > C++ goes to version 7.0 by the end of the year and arrow-rs chooses to
> >> > go to 15.4, or 3.12, or whatever, does that create confusion or doubt
> >> > that works against the Arrow goal of easy interoperability?
> >> >
> >> > Neal
> >> >
> >> > On Fri, Apr 9, 2021 at 8:18 AM Andy Grove <andygrov...@gmail.com>
> >> wrote:
> >> >
> >> > > Following on from the email thread "Rust sync meeting" I would like to
> >> > > start a new discussion about moving the Rust components out to new
> >> > > GitHub repositories and using a new process for issues and release
> >> > > management.
> >> > >
> >> > > I have started a Google document [1] with details and to track the
> >> > > work required for this effort but I will summarize the key points of
> >> > > the proposal here:
> >> > >
> >> > >
> >> > >    - Move existing Rust code into two new repositories
> >> > >       - apache/arrow-rs
> >> > >          - Arrow + Parquet crates
> >> > >       - apache/datafusion
> >> > >          - DataFusion + Ballista crates (which are expected to merge
> >> > >            to some degree over time)
> >> > >          - TPC-H benchmarks
> >> > >       - Use GitHub issues for issue tracking
> >> > >    - Decouple release process
> >> > >       - Crates are released individually
> >> > >       - A vote on the source release of the released crate is held
> >> > >         over the mailing list as usual.
> >> > >       - Rust does not need to release a new version when the rest of
> >> > >         Arrow releases; we bundle our latest released crates to the
> >> > >         signed tar.
> >> > >       - Crates can depend on GitHub commit hashes between releases
> >> > >
> >> > >
> >> > > The Google document may be the best place to collaborate on the
> >> > > proposal but I can update the document based on any comments in this
> >> > > email thread as well.
> >> > >
> >> > > Note that I have excluded discussion about arrow2/parquet2 from this
> >> > > proposal and I believe we should discuss that separately as a
> >> > > follow-on discussion.
> >> > >
> >> > > I look forward to hearing opinions on this both from current Rust
> >> > > maintainers and contributors and also from the wider Arrow community.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Andy.
> >> > >
> >> > > [1] https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit?usp=sharing
> >> > >
> >> >
> >>
> >
