Good question.

I was imagining the arrow-julia repo would have versioning that is fully
decoupled from the main arrow project. This comes from my understanding
that the Julia implementation is its own "project" that implements the
arrow spec/format, and we may need breaking major releases at a different
cadence than the main spec version. Indeed, while the arrow project has
gone from 2.0 -> 6.0 since the Julia implementation was first released,
we're just now releasing our own 2.0.0 version after a change in the API
for how metadata is set/retrieved on table/column objects.

I'll admit that it's not entirely clear to me how best to signal/implement
coordination between the main arrow project versions and the Julia version,
though. I'm just guessing here, but is that why the main arrow project does
such frequent major version releases? To account for any child
implementations happening to have breaking changes? I think I remember a
recent discussion around moving the actual spec/format document out into a
separate repo, or at least versioning it separately from all the various
implementations, and that seems like it would be a good idea, though I
guess the format itself already has versioning built in. It's certainly
something we can clarify in the Julia package itself, i.e. which version of
the spec a given Julia package version is compatible with. Typically with
other Julia package dependencies, just a minor version increment is
required when upgrading to a new breaking version of a dependency, so I
would think we could follow something similar by treating the arrow format
as a "dependency".
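
To make the "format as a dependency" idea a bit more concrete, here is a
rough sketch in Python (not actual Julia tooling). The function names and
the rule itself are hypothetical, just my assumption of how such a policy
could look: a breaking API change forces a major bump, while picking up a
new format version only needs a minor bump.

```python
from typing import Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


def release_kind(api_breaking: bool, new_feature: bool, format_upgrade: bool) -> str:
    """Hypothetical rule for choosing the kind of release.

    Assumption: the arrow format is treated like any other dependency, so
    supporting a new format version is a minor bump unless the package's
    own API also breaks.
    """
    if api_breaking:
        return "major"
    if new_feature or format_upgrade:
        return "minor"
    return "patch"


def bump(version: Version, kind: str) -> Version:
    """Apply a semantic-versioning bump of the given kind."""
    major, minor, patch = version
    if kind == "major":
        return (major + 1, 0, 0)
    if kind == "minor":
        return (major, minor + 1, 0)
    if kind == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown release kind: {kind!r}")


# Picking up a new arrow format version without changing the package API.
kind = release_kind(api_breaking=False, new_feature=False, format_upgrade=True)
print(bump((2, 0, 0), kind))  # (2, 1, 0)
```

Under that assumed rule, the package's major version only moves when its
own API breaks, independent of the main arrow release numbers.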

I'll clarify that I don't feel very strongly on these points, so if there's
something I'm missing or gaps in my understanding of how the rest of the
web of projects are coordinating things, I'm all ears.

-Jacob

On Thu, Sep 16, 2021 at 11:24 PM Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> Good point! Jacob, could you confirm this?
>
>
> Thanks,
> --
> kou
>
> In <caabjhxhtk6na+cr-5lwtvmqprz7ckznf6cpesn6xkthgkzh...@mail.gmail.com>
>   "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Sat, 11
> Sep 2021 16:57:17 -0700,
>   QP Hou <q...@scribd.com.INVALID> wrote:
>
> > Just one minor point to confirm and clarify. It looks like Julia arrow
> > only wants to do on demand minor and patch releases. Major version
> > release still needs to be aligned with the main arrow release schedule,
> > is that correct? In other words, breaking changes should be avoided in
> > on demand releases (assuming they are using semantic versioning).
> >
> > From the original julia donation thread, I got the impression that the
> > julia maintainers wanted to have their own versioning scheme. Maybe
> > that’s not the case anymore. So I wanted to make sure we set the right
> > expectation for Julia maintainers.
> >
> > FWIW, arrow-rs today aligns the major version with the main arrow
> > release, so Andrew spends quite a bit of time maintaining an active
> > release branch to backport backwards-compatible commits for minor and
> > patch releases. DataFusion and Ballista, on the other hand, have a
> > versioning scheme that’s fully decoupled from the main Arrow version,
> > including the major version.
> >
> > On Thu, Sep 9, 2021 at 1:38 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> >> Hi,
> >>
> >> Thanks for all comments about release schedule.
> >>
> >> Let's use release-on-demand approach based on
> >> arrow-datafusion's flow for the Julia Arrow implementation.
> >>
> >> Do we have more items to be discussed? Can we start voting?
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >> In <cafhtnrxafz+q43yjqfsssn12hb_94jzfprxu6cdnnkwaoje...@mail.gmail.com>
> >>   "Re: [DISCUSS][Julia] How to restart at apache/arrow-julia?" on Thu, 9
> >> Sep 2021 09:48:57 -0400,
> >>   Andrew Lamb <al...@influxdata.com> wrote:
> >>
> >> > I also think release on demand is a good strategy.
> >> >
> >> > The primary reasons to do an arrow-rs release every 2 weeks were:
> >> > 1. To have a predictable cadence for downstream projects (e.g.
> >> > datafusion and others)
> >> > 2. Amortize the overhead associated with each release (the process is
> >> > non-trivial and the current 72 hour voting window adds some
> >> > backpressure as well -- I remember Wes may have said windows shorter
> >> > than 72 hours might be fine too)
> >> >
> >> >
> >> > On Wed, Sep 8, 2021 at 12:19 AM QP Hou <q...@scribd.com.invalid>
> wrote:
> >> >
> >> >> A minor note on the Rust side of things. arrow-rs has a 2-week
> >> >> release cycle, but arrow-datafusion mostly does release on demand at
> >> >> the moment. Our most up-to-date release processes are documented at
> >> >> [1] and [2].
> >> >>
> >> >> [1]: https://github.com/apache/arrow-rs/blob/master/dev/release/README.md
> >> >> [2]: https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md
> >> >>
> >> >> On Tue, Sep 7, 2021 at 4:01 PM Jacob Quinn <quinn.jac...@gmail.com>
> >> wrote:
> >> >> >
> >> >> > Thanks kou.
> >> >> >
> >> >> > I think the TODO action list looks good.
> >> >> >
> >> >> > The one point I think could use some additional discussion is
> >> >> > around the release cadence: it IS desirable to be able to release
> >> >> > more frequently than the parent repo 3-4 month cadence. But we also
> >> >> > haven't had the frequency of commits to necessarily warrant a
> >> >> > release every 2 weeks. I can think of two possible options, not
> >> >> > sure if one or the other would be more compatible with the apache
> >> >> > release process:
> >> >> >
> >> >> > 1) Allow for release-on-demand; this is idiomatic for most Julia
> >> >> > packages I'm aware of. When a particular bug is fixed, or feature
> >> >> > added, a user can request a release, a little discussion happens,
> >> >> > and a new release is made. This approach would work well for the
> >> >> > "bursty" kind of contributions we've seen to Arrow.jl, where
> >> >> > development by certain people will happen frequently for a while,
> >> >> > then take a break for other things. This also avoids having
> >> >> > "scheduled" releases (every 2 weeks, 3 months, etc.) where there
> >> >> > haven't been significant enough updates to warrant a new release.
> >> >> > This approach may also facilitate differentiating between bugfix
> >> >> > (patch) releases vs. new functionality releases (minor), since
> >> >> > when a release is requested, it could be specified whether it
> >> >> > should be patch or minor (or major).
> >> >> >
> >> >> > 2) Commit to a scheduled release pattern like every 2 weeks, once
> >> >> > a month, etc. This has the advantage of consistency and clearer
> >> >> > expectations for users/devs involved. A release also doesn't need
> >> >> > to be requested, because we can just wait for the scheduled time
> >> >> > to release. In terms of the "unnecessary releases" mentioned
> >> >> > above, it could be as simple as "cancelling" a release if there
> >> >> > haven't been significant updates in the elapsed time period.
> >> >> >
> >> >> > My preference would be for 1), but that's influenced by what I'm
> >> >> > familiar with in the Julia package ecosystem. It seems like it
> >> >> > would still fit in the apache way since we would formally request
> >> >> > a new release, wait the allotted amount of time for voting (24
> >> >> > hours would be preferable), then at the end of the voting period,
> >> >> > a new release could be made.
> >> >> >
> >> >> > Thanks again kou for helping support the Julia implementation here.
> >> >> >
> >> >> > -Jacob
> >> >> >
> >> >> > 2)
> >> >> >
> >> >> > On Sun, Sep 5, 2021 at 3:25 PM Sutou Kouhei <k...@clear-code.com>
> >> wrote:
> >> >> >
> >> >> > > Hi,
> >> >> > >
> >> >> > > Sorry for the delay. This is a continuation of the "Status
> >> >> > > of Arrow Julia implementation?" thread:
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > https://lists.apache.org/x/thread.html/r6d91286686d92837fbe21dd042801a57e3a7b00b5903ea90a754ac7b%40%3Cdev.arrow.apache.org%3E
> >> >> > >
> >> >> > > I summarize the current status, the next actions and items
> >> >> > > to be discussed.
> >> >> > >
> >> >> > > The current status:
> >> >> > >
> >> >> > >   * The Julia Arrow implementation uses
> >> >> > >     https://github.com/JuliaData/Arrow.jl as a "dev branch"
> >> >> > >     instead of creating a branch in
> >> >> > >     https://github.com/apache/arrow
> >> >> > >   * The Julia Arrow implementation wants to use GitHub
> >> >> > >     for the main issue management platform
> >> >> > >   * The Julia Arrow implementation wants to release
> >> >> > >     more frequently than 1 release per 3-4 months
> >> >> > >   * The current workflow of the Rust Arrow implementation
> >> >> > >     will also fit the Julia Arrow implementation
> >> >> > >
> >> >> > > The current workflow of the Rust Arrow implementation:
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit#heading=h.kv1hwbhi3cmi
> >> >> > >
> >> >> > >     * Uses apache/arrow-rs and apache/arrow-datafusion instead
> >> >> > >       of apache/arrow for repository
> >> >> > >
> >> >> > >     * Uses GitHub instead of JIRA for issue management
> >> >> > >       platform
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > https://docs.google.com/document/d/1tMQ67iu8XyGGZuj--h9WQYB9inCk6c2sL_4xMTwENGc/edit
> >> >> > >
> >> >> > >     * Releases a new minor and patch version every 2 weeks
> >> >> > >       in addition to the quarterly release of the other releases
> >> >> > >
> >> >> > > The next actions after we get a consensus about this
> >> >> > > discussion:
> >> >> > >
> >> >> > >   1. Start a vote on the Julia Arrow implementation move, like
> >> >> > >      the Rust one:
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > https://lists.apache.org/x/thread.html/r44390a18b3fbb08ddb68aa4d12f37245d948984fae11a41494e5fc1d@%3Cdev.arrow.apache.org%3E
> >> >> > >
> >> >> > >   2. Create apache/arrow-julia
> >> >> > >
> >> >> > >   3. Start IP clearance process to import JuliaData/Arrow.jl
> >> >> > >      to apache/arrow-julia
> >> >> > >
> >> >> > >      (We don't use julia/Arrow/ in apache/arrow.)
> >> >> > >
> >> >> > >   4. Import JuliaData/Arrow.jl to apache/arrow-julia
> >> >> > >
> >> >> > >   5. Prepare integration tests CI in apache/arrow-julia and
> >> >> > >      apache/arrow
> >> >> > >
> >> >> > >   6. Prepare releasing tools in apache/arrow-julia and
> >> >> > >      apache/arrow
> >> >> > >
> >> >> > >   7. Remove julia/... from apache/arrow and leave
> >> >> > >      julia/README.md pointing to apache/arrow-julia
> >> >> > >
> >> >> > >
> >> >> > > Items to be discussed:
> >> >> > >
> >> >> > >   * Interval of minor and patch releases
> >> >> > >
> >> >> > >     * The Rust Arrow implementation uses 2 weeks.
> >> >> > >
> >> >> > >     * Does the Julia Arrow implementation also want to use
> >> >> > >       2 weeks?
> >> >> > >
> >> >> > >   * Can we stay in accordance with the Apache way with this
> >> >> > >     workflow without pain?
> >> >> > >
> >> >> > >     The Rust Arrow implementation workflow includes the
> >> >> > >     following for this:
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > https://docs.google.com/document/d/1TyrUP8_UWXqk97a8Hvb1d0UYWigch0HAephIjW7soSI/edit#heading=h.kv1hwbhi3cmi
> >> >> > >
> >> >> > >       > Contributors will be required to write issues for
> >> >> > >       > planned features and bug fixes so that we have
> >> >> > >       > visibility and opportunities for collaboration
> >> >> > >       > before a PR shows up.
> >> >> > >
> >> >> > >   * More items?
> >> >> > >
> >> >> > >
> >> >> > > Thanks,
> >> >> > > --
> >> >> > > kou
> >> >> > >
> >> >>
> >>
> > --
> > Thanks,
> > QP Hou
>
