> I'm assuming the idea is that the existing integration tests will remain in
> apache/arrow. Will you also run the integration test suites on your rust
> repository CI checks?

Furthermore, against what version will these tests run?

* If Arrow runs against the latest release of Rust then it will lag behind
  and issues may be detected later.
* If Arrow runs against the latest nightly of Rust then things will get
  tricky at release time (all Arrow integration tests pass but Rust isn't
  ready to cut a new release and Arrow tests fail against the latest
  released Rust).

Assuming Rust is also running integration tests against Arrow (probably a
good idea) you get a similar problem (this one might be trickier given the
relative release frequencies)...

* If Rust runs against the latest release of Arrow then it will lag behind
  (by several months). There will be a "catching up" period after Arrow
  releases.
* If Rust runs against the latest nightly of Arrow then how will Rust
  release without a new Arrow release?

Note, these problems technically exist now, since any language can release
a patch at any time. Also, since Rust isn't directly compiling against other
Arrow libs and we are only talking about interoperability, it's probably not
going to be too big of a deal. Still, worth giving some thought ahead of
time.

On Fri, Apr 9, 2021 at 1:11 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> >
> > With this explanation do you still have a concern? There is no suggestion
> > of making releases that depend on GitHub hashes.
>
> No, I don't think so. IIUC you are saying the crates dependency does not
> imply the crate artifacts are published elsewhere. This sounds in line with
> policies to me. For some reason I thought the notion of crates implied
> publishing to Rust's package management system.
>
> On Fri, Apr 9, 2021 at 4:07 PM Andy Grove <andygrov...@gmail.com> wrote:
>
> > Hi Micah,
> >
> > During development, the Rust crates have local dependencies on each other
> > based on relative file system paths.
> > At release time, we change these to
> > versioned dependencies before publishing, because it isn't possible to
> > publish a crate that depends on non-published crates.
> >
> > With the code in separate repositories, we would still need an equivalent
> > mechanism for DataFusion to use the Arrow code that is under development
> > but we would point to a GitHub hash rather than a relative path. We should
> > still update to use versioned dependencies when releasing.
> >
> > I will revise the text in the document to better explain what this means.
> >
> > With this explanation do you still have a concern? There is no suggestion
> > of making releases that depend on GitHub hashes.
> >
> > Thanks,
> >
> > Andy.
> >
> > On Fri, Apr 9, 2021 at 4:57 PM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> >
> >> > " Crates can depend on GitHub commit hashes between releases"
> >>
> >> This sounds like it might not align with ASF release policies [1].
> >>
> >> [1] https://www.apache.org/legal/release-policy.html#release-definition
> >>
> >> On Fri, Apr 9, 2021 at 1:34 PM Neal Richardson <neal.p.richard...@gmail.com>
> >> wrote:
> >>
> >> > Thanks, Andy. Two areas of concern I think we should have some answer for
> >> > before going forward with this (and I make no opinions as to what the
> >> > "right" answers are, just raising them for discussion):
> >> >
> >> > 1. Integration testing: what is our workflow for ensuring that our
> >> > implementations are integration tested, and what do we do when changes
> >> > (whether in apache/arrow or in apache/arrow-rs) introduce
> >> > regressions/failures? I'm assuming the idea is that the existing
> >> > integration tests will remain in apache/arrow. Will you also run the
> >> > integration test suites on your rust repository CI checks?
> >> > 2. Versioning: one rationale from our current policy of "everyone releases
> >> > together" is that you don't have to guess as much whether (for example)
> >> > Arrow Java 3.0 and Arrow Rust 3.0 are compatible and using the same format.
> >> > It's kind of a heuristic for what library versions were integration tested
> >> > with each other. It sounds like (but maybe I misunderstand) that y'all are
> >> > looking to break from that. But if Arrow C++ goes to version 7.0 by the end
> >> > of the year and arrow-rs chooses to go to 15.4, or 3.12, or whatever, does
> >> > that create confusion or doubt that works against the Arrow goal of easy
> >> > interoperability?
> >> >
> >> > Neal
> >> >
> >> > On Fri, Apr 9, 2021 at 8:18 AM Andy Grove <andygrov...@gmail.com> wrote:
> >> >
> >> > > Following on from the email thread "Rust sync meeting" I would like to
> >> > > start a new discussion about moving the Rust components out to new GitHub
> >> > > repositories and using a new process for issues and release management.
> >> > >
> >> > > I have started a Google document [1] with details and to track the work
> >> > > required for this effort but I will summarize the key points of the
> >> > > proposal here:
> >> > >
> >> > > - Move existing Rust code into two new repositories
> >> > >   - apache/arrow-rs
> >> > >     - Arrow + Parquet crates
> >> > >   - apache/datafusion
> >> > >     - DataFusion + Ballista crates (which are expected to merge to some
> >> > >       degree over time)
> >> > >     - TPC-H benchmarks
> >> > > - Use GitHub issues for issue tracking
> >> > > - Decouple release process
> >> > >   - Crates are released individually
> >> > >   - A vote on the source release of the released crate is held over the
> >> > >     mailing list as usual.
> >> > >   - Rust does not need to release a new version when the rest of Arrow
> >> > >     releases; we bundle our latest released crates to the signed tar.
> >> > >   - Crates can depend on GitHub commit hashes between releases
> >> > >
> >> > > The Google document may be the best place to collaborate on the proposal
> >> > > but I can update the document based on any comments in this email thread
> >> > > as well.
> >> > >
> >> > > Note that I have excluded discussion about arrow2/parquet2 from this
> >> > > proposal and I believe we should discuss that separately as a follow-on
> >> > > discussion.
> >> > >
> >> > > I look forward to hearing opinions on this both from current Rust
> >> > > maintainers and contributors and also from the wider Arrow community.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Andy.
> >> > >
> >> > > [1]
> >> > > https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit?usp=sharing
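
As a rough illustration of the dependency handling Andy describes in the quoted thread (relative paths during development today, a GitHub hash between releases under the proposal, and versioned dependencies at release time), a DataFusion Cargo.toml might look something like the sketch below. The repository URL, the revision placeholder, and the version number are assumptions made for illustration only; none of them come from the proposal document.

```toml
# Hypothetical excerpt from a DataFusion Cargo.toml. Only one of the three
# `arrow` entries would be active at a time.

[dependencies]
# Today (single repository): local dependency on a relative file system path.
# arrow = { path = "../arrow" }

# Proposed, between releases: pin to a commit in the separate repository.
# The URL and the revision placeholder are assumptions, not real values.
arrow = { git = "https://github.com/apache/arrow-rs", rev = "<commit-hash>" }

# At release time: switch to a versioned dependency before publishing.
# The version number here is a placeholder.
# arrow = "4.0.0"
```

Switching the active entry to the versioned form before publishing is what keeps released crates free of GitHub-hash dependencies, which is the point Andy makes in his reply to Micah.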