Hi,

Looks reasonable. I'll start voting next week.


Thanks,
--
kou

In <cahdp0gcleiyo1879x+eu1vpjmrhsiecvjlc1img29tvdyna...@mail.gmail.com>
  "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Mon, 20 Sep 2021 23:25:54 -0700,
  QP Hou <houqp....@gmail.com> wrote:

> To expedite the donation, perhaps we could move on with the decoupled
> version scheme for now to reduce workload and disruption to the
> existing users. The julia maintainers can always decide to change the
> versioning scheme later after the donation has been completed. This
> doesn't seem like a blocker issue to me.
>
> On Mon, Sep 20, 2021 at 8:09 PM Sutou Kouhei <k...@clear-code.com> wrote:
>>
>> Hi Jacob,
>>
>> Thanks for confirming this.
>>
>> For major release:
>>
>> As far as I know:
>>
>> We chose this style because we will develop actively for at
>> least a few years. Active development will need API breaking
>> changes. So we release a major version per 3-4 months.
>>
>> Our release process releases all implementations at once
>> before we chose this style. We just didn't change it. Some
>> implementations don't have API breaking changes between
>> major releases. But we just don't care about it.
>>
>> Aligned versions for all implementations may have a merit
>> for users. Users can assume that it's safe that they use
>> Apache Arrow C++ 6.0.0 and Apache Arrow Rust 6.0.0. (We have
>> integration tests for implementations with the same version.)
>>
>> References:
>>
>> * Discussion: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"
>>
>> https://lists.apache.org/thread.html/5715a4d402c835d22d929a8069c5c0cf232077a660ee98639d544af8%40%3Cdev.arrow.apache.org%3E
>>
>> * Vote: [VOTE] Adopt FORMAT and LIBRARY SemVer-based version schemes for Arrow 1.0.0 and beyond
>>
>> https://lists.apache.org/thread.html/2a630234214e590eb184c24bbf9dac4a8d8f7677d85a75fa49d70ba8%40%3Cdev.arrow.apache.org%3E
>>
>> * Follow-up thread: Versioning of arrow
>>
>> https://lists.apache.org/thread.html/rb11c0839a7167c2f1d82b0b77134c53abc5487e9165c3493b55db12b%40%3Cdev.arrow.apache.org%3E
>>
>>
>> My opinion:
>>
>> I have no opinion on this. I don't object to the Julia
>> implementation using a separate version.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In <cakyxbqrc2ht3jt67zc-g95_rkjehmhhi_demm9b34p+1n_c...@mail.gmail.com>
>>   "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Thu, 16 Sep 2021 23:47:45 -0600,
>>   Jacob Quinn <quinn.jac...@gmail.com> wrote:
>>
>> > Good question.
>> >
>> > In my mind, I was imagining the arrow-julia repo would have a fully
>> > decoupled versioning from the main arrow project. This comes from my
>> > understanding that the julia implementation is its own "project" that
>> > implements the arrow spec/format, and we may need a breaking major release
>> > at different cadences than the main spec version. Indeed, while the arrow
>> > project has gone from 2.0 -> 6.0 since the julia implementation was first
>> > released, we're just now releasing our own 2.0.0 version after a change in
>> > API for how metadata is set/retrieved on table/column objects.
>> >
>> > I'll admit that it's not entirely clear to me how to best signal/implement
>> > coordination between the main arrow project versions and the julia version
>> > though. I'm just guessing here, but is that why the main arrow project does
>> > such frequent major version releases?
>> > To account for any child
>> > implementations happening to have breaking changes? I think I remember
>> > discussion recently around moving the actual spec/format document out as a
>> > separate repo or at least versioning it separately from all the various
>> > implementations, and that seems like it would be a good idea, though I
>> > guess the format itself has versioning built into itself. It's certainly
>> > something we can clarify in the Julia package itself; i.e. which version of
>> > the spec a given Julia package version is compatible with. Typically with
>> > other julia package dependencies, just a minor version increment is
>> > required when a new breaking dependency version is upgraded, so I would
>> > think we could follow something similar by treating the arrow format as a
>> > "dependency".
>> >
>> > I'll clarify that I don't feel very strongly on these points, so if there's
>> > something I'm missing or gaps in my understanding of how the rest of the
>> > web of projects are coordinating things, I'm all ears.
>> >
>> > -Jacob
>> >
>> > On Thu, Sep 16, 2021 at 11:24 PM Sutou Kouhei <k...@clear-code.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> Good point! Jacob, could you confirm this?
>> >>
>> >>
>> >> Thanks,
>> >> --
>> >> kou
>> >>
>> >> In <caabjhxhtk6na+cr-5lwtvmqprz7ckznf6cpesn6xkthgkzh...@mail.gmail.com>
>> >>   "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Sat, 11 Sep 2021 16:57:17 -0700,
>> >>   QP Hou <q...@scribd.com.INVALID> wrote:
>> >>
>> >> > Just one minor point to confirm and clarify. It looks like Julia arrow only
>> >> > wants to do on demand minor and patch releases. Major version release still
>> >> > needs to be aligned with the main arrow release schedule, is that correct?
>> >> > In other words, breaking changes should be avoided in on demand releases
>> >> > (assuming they are using semantic versioning).
>> >> >
>> >> > From the original julia donation thread, I got the impression that the
>> >> > julia maintainers wanted to have their own versioning scheme. Maybe that’s
>> >> > not the case anymore. So I wanted to make sure we set the right expectation
>> >> > for Julia maintainers.
>> >> >
>> >> > FWIW, Arrow-rs today aligns the major version with the main arrow release,
>> >> > so Andrew spends quite a bit of time maintaining an active release branch to
>> >> > backport backwards compatible commits for minor and patch releases.
>> >> > DataFusion and ballista on the other hand have a versioning scheme that’s
>> >> > fully decoupled from the main Arrow version including the major version.
>> >> >
>> >> > On Thu, Sep 9, 2021 at 1:38 PM Sutou Kouhei <k...@clear-code.com> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> Thanks for all comments about release schedule.
>> >> >>
>> >> >> Let's use release-on-demand approach based on
>> >> >> arrow-datafusion's flow for the Julia Arrow implementation.
>> >> >>
>> >> >> Do we have more items to be discussed? Can we start voting?
>> >> >>
>> >> >>
>> >> >> Thanks,
>> >> >> --
>> >> >> kou
>> >> >>
>> >> >> In <cafhtnrxafz+q43yjqfsssn12hb_94jzfprxu6cdnnkwaoje...@mail.gmail.com>
>> >> >>   "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Thu, 9 Sep 2021 09:48:57 -0400,
>> >> >>   Andrew Lamb <al...@influxdata.com> wrote:
>> >> >>
>> >> >> > I also think release on demand is a good strategy.
>> >> >> >
>> >> >> > The primary reasons to do an arrow-rs release every 2 weeks were:
>> >> >> > 1. To have predictable cadence into downstream projects (e.g. datafusion
>> >> >> > and others)
>> >> >> > 2. Amortize the overhead associated with each release (the process is non
>> >> >> > trivial and the current 72 hour voting window adds some backpressure as
>> >> >> > well -- I remember Wes may have said windows shorter than 72 hours might be
>> >> >> > fine too)
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Sep 8, 2021 at 12:19 AM QP Hou <q...@scribd.com.invalid> wrote:
>> >> >> >
>> >> >> >> A minor note on the Rust side of things. arrow-rs has a 2 weeks
>> >> >> >> release cycle, but arrow-datafusion mostly does release on demand at
>> >> >> >> the moment. Our most up-to-date release processes are documented at [1]
>> >> >> >> and [2].
>> >> >> >>
>> >> >> >> [1]: https://github.com/apache/arrow-rs/blob/master/dev/release/README.md
>> >> >> >> [2]: https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md
>> >> >> >>
>> >> >> >> On Tue, Sep 7, 2021 at 4:01 PM Jacob Quinn <quinn.jac...@gmail.com> wrote:
>> >> >> >> >
>> >> >> >> > Thanks kou.
>> >> >> >> >
>> >> >> >> > I think the TODO action list looks good.
>> >> >> >> >
>> >> >> >> > The one point I think could use some additional discussion is around the
>> >> >> >> > release cadence: it IS desirable to be able to release more frequently than
>> >> >> >> > the parent repo 3-4 month cadence. But we also haven't had the frequency of
>> >> >> >> > commits to necessarily warrant a release every 2 weeks. I can think of two
>> >> >> >> > possible options, not sure if one or the other would be more compatible
>> >> >> >> > with the apache release process:
>> >> >> >> >
>> >> >> >> > 1) Allow for release-on-demand; this is idiomatic for most Julia packages
>> >> >> >> > I'm aware of.
>> >> >> >> > When a particular bug is fixed, or feature added, a user can
>> >> >> >> > request a release, a little discussion happens, and a new release is made.
>> >> >> >> > This approach would work well for the "bursty" kind of contributions we've
>> >> >> >> > seen to Arrow.jl where development by certain people will happen frequently
>> >> >> >> > for a while, then take a break for other things. This also avoids having
>> >> >> >> > "scheduled" releases (every 2 weeks, 3 months, etc.) where there haven't
>> >> >> >> > been significant updates to necessarily warrant a new release. This
>> >> >> >> > approach may also facilitate differentiating between bugfix (patch)
>> >> >> >> > releases vs. new functionality releases (minor), since when a release is
>> >> >> >> > requested, it could be specified whether it should be patch or minor (or
>> >> >> >> > major).
>> >> >> >> >
>> >> >> >> > 2) Commit to a scheduled release pattern like every 2 weeks, once a month,
>> >> >> >> > etc. This has the advantage of consistency and clearer expectations for
>> >> >> >> > users/devs involved. A release also doesn't need to be requested, because
>> >> >> >> > we can just wait for the scheduled time to release. In terms of the
>> >> >> >> > "unnecessary releases" mentioned above, it could be as simple as
>> >> >> >> > "cancelling" a release if there haven't been significant updates in the
>> >> >> >> > elapsed time period.
>> >> >> >> >
>> >> >> >> > My preference would be for 1), but that's influenced by what I'm familiar
>> >> >> >> > with in the Julia package ecosystem.
>> >> >> >> > It seems like it would still fit in
>> >> >> >> > the apache way since we would formally request a new release, wait the
>> >> >> >> > elapsed amount of time for voting (24 hours would be preferable), then at
>> >> >> >> > the end of the voting period, a new release could be made.
>> >> >> >> >
>> >> >> >> > Thanks again kou for helping support the Julia implementation here.
>> >> >> >> >
>> >> >> >> > -Jacob
>> >> >> >> >
>> >> >> >> > 2)
>> >> >> >> >
>> >> >> >> > On Sun, Sep 5, 2021 at 3:25 PM Sutou Kouhei <k...@clear-code.com> wrote:
>> >> >> >> >
>> >> >> >> > > Hi,
>> >> >> >> > >
>> >> >> >> > > Sorry for the delay. This is a continuation of the "Status
>> >> >> >> > > of Arrow Julia implementation?" thread:
>> >> >> >> > >
>> >> >> >> > > https://lists.apache.org/x/thread.html/r6d91286686d92837fbe21dd042801a57e3a7b00b5903ea90a754ac7b%40%3Cdev.arrow.apache.org%3E
>> >> >> >> > >
>> >> >> >> > > I summarize the current status, the next actions and items
>> >> >> >> > > to be discussed.
>> >> >> >> > >
>> >> >> >> > > The current status:
>> >> >> >> > >
>> >> >> >> > > * The Julia Arrow implementation uses
>> >> >> >> > > https://github.com/JuliaData/Arrow.jl as a "dev branch"
>> >> >> >> > > instead of creating a branch in
>> >> >> >> > > https://github.com/apache/arrow
>> >> >> >> > > * The Julia Arrow implementation wants to use GitHub
>> >> >> >> > > for the main issue management platform
>> >> >> >> > > * The Julia Arrow implementation wants to release
>> >> >> >> > > more frequently than 1 release per 3-4 months
>> >> >> >> > > * The current workflow of the Rust Arrow implementation
>> >> >> >> > > will also fit the Julia Arrow implementation
>> >> >> >> > >
>> >> >> >> > > The current workflow of the Rust Arrow implementation:
>> >> >> >> > >
>> >> >> >> > > https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit#heading=h.kv1hwbhi3cmi
>> >> >> >> > >
>> >> >> >> > > * Uses apache/arrow-rs and apache/arrow-datafusion instead
>> >> >> >> > > of apache/arrow for repository
>> >> >> >> > >
>> >> >> >> > > * Uses GitHub instead of JIRA for issue management
>> >> >> >> > > platform
>> >> >> >> > >
>> >> >> >> > > https://docs.google.com/document/d/1tMQ67iu8XyGGZuj--h9WQYB9inCk6c2sL_4xMTwENGc/edit
>> >> >> >> > >
>> >> >> >> > > * Releases a new minor and patch version every 2 weeks
>> >> >> >> > > in addition to the quarterly release of the other releases
>> >> >> >> > >
>> >> >> >> > > The next actions after we get a consensus about this
>> >> >> >> > > discussion:
>> >> >> >> > >
>> >> >> >> > > 1. Start voting on the Julia Arrow implementation move like
>> >> >> >> > > the Rust one:
>> >> >> >> > >
>> >> >> >> > > https://lists.apache.org/x/thread.html/r44390a18b3fbb08ddb68aa4d12f37245d948984fae11a41494e5fc1d@%3Cdev.arrow.apache.org%3E
>> >> >> >> > >
>> >> >> >> > > 2. Create apache/arrow-julia
>> >> >> >> > >
>> >> >> >> > > 3. Start IP clearance process to import JuliaData/Arrow.jl
>> >> >> >> > > to apache/arrow-julia
>> >> >> >> > >
>> >> >> >> > > (We don't use julia/Arrow/ in apache/arrow.)
>> >> >> >> > >
>> >> >> >> > > 4. Import JuliaData/Arrow.jl to apache/arrow-julia
>> >> >> >> > >
>> >> >> >> > > 5. Prepare integration tests CI in apache/arrow-julia and apache/arrow
>> >> >> >> > >
>> >> >> >> > > 6. Prepare releasing tools in apache/arrow-julia and apache/arrow
>> >> >> >> > >
>> >> >> >> > > 7. Remove julia/... from apache/arrow and leave
>> >> >> >> > > julia/README.md pointing to apache/arrow-julia
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> > > Items to be discussed:
>> >> >> >> > >
>> >> >> >> > > * Interval of minor and patch releases
>> >> >> >> > >
>> >> >> >> > > * The Rust Arrow implementation uses 2 weeks.
>> >> >> >> > >
>> >> >> >> > > * Does the Julia Arrow implementation also want to use
>> >> >> >> > > 2 weeks?
>> >> >> >> > >
>> >> >> >> > > * Can we be in accordance with the Apache way with this workflow
>> >> >> >> > > without pain?
>> >> >> >> > >
>> >> >> >> > > The Rust Arrow implementation workflow includes the
>> >> >> >> > > following for this:
>> >> >> >> > >
>> >> >> >> > > https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit#heading=h.kv1hwbhi3cmi
>> >> >> >> > >
>> >> >> >> > > > Contributors will be required to write issues for
>> >> >> >> > > > planned features and bug fixes so that we have
>> >> >> >> > > > visibility and opportunities for collaboration
>> >> >> >> > > > before a PR shows up.
>> >> >> >> > >
>> >> >> >> > > * More items?
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> > > Thanks,
>> >> >> >> > > --
>> >> >> >> > > kou
>> >> >> >> > >
>> >> > --
>> >> > Thanks,
>> >> > QP Hou
>> >>