Hi,

Sorry for not working on this.

Thanks for sharing the standard docs! I've read them and the
related docs.

Here is a summary of what I learned from this thread and the
standard docs:

1. We're using "github.com/apache/arrow/go/v${VERSION}" such
   as "github.com/apache/arrow/go/v17" as our module name
   * https://pkg.go.dev/github.com/apache/arrow/go/v17/arrow
   * Including the version number part ("v${VERSION}") is
     important
   * With this style, users can avoid unexpected backward
     incompatibility
2. We used to use "github.com/apache/arrow/go" as our module
   name in v5 or earlier
   * https://pkg.go.dev/github.com/apache/arrow/go/arrow
   * 133 modules still use this
3. We want to avoid user side changes as much as possible
   * As 2. shows, users may keep using an old version if
     any change is required
4. The current users need to change Apache Arrow Go's import
   path to "github.com/apache/arrow/go/v${VERSION + 1}" when
   they want to upgrade Apache Arrow Go
   * We don't want to require more changes than "changing
     import path" for users as mentioned in 3.
5. We can't provide backward compatible module name such as
   "github.com/apache/arrow/go/v18" for
   "github.com/apache/arrow-go/v18"
   * Go doesn't provide a feature for this
6. We want to keep "v${VERSION}" in our module name even if
   we split Apache Arrow Go to apache/arrow-go
   * It's for avoiding unexpected backward incompatibility
     in users' projects


Based on 6., users need to change their import paths on
upgrade whether we keep using apache/arrow or we use new
apache/arrow-go.
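
For illustration only, here is a minimal sketch of the change
users would make to each import path on such an upgrade. The
rewriteImport helper is hypothetical, and
"github.com/apache/arrow-go/v18" is just the module path
proposed in this thread, not a published one:

```go
package main

import (
	"fmt"
	"strings"
)

// Both prefixes come from this thread. The new one is only the proposed
// module path and is an assumption until the split actually happens.
const (
	oldPrefix = "github.com/apache/arrow/go/v17"
	newPrefix = "github.com/apache/arrow-go/v18"
)

// rewriteImport is a hypothetical helper. It maps an import path under the
// old module to the same package under the new module and leaves every
// other path untouched.
func rewriteImport(path string) string {
	if path == oldPrefix {
		return newPrefix
	}
	if strings.HasPrefix(path, oldPrefix+"/") {
		return newPrefix + strings.TrimPrefix(path, oldPrefix)
	}
	return path
}

func main() {
	paths := []string{
		"github.com/apache/arrow/go/v17/arrow",
		"github.com/apache/arrow/go/v17/parquet",
		"github.com/apache/arrow/go/v170/arrow", // different module, not rewritten
	}
	for _, p := range paths {
		fmt.Printf("%s -> %s\n", p, rewriteImport(p))
	}
}
```

The edit has the same shape whichever repository we use:
only the prefix of the import path changes.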

If we use new apache/arrow-go, we will be able to reduce
maintenance cost for apache/arrow (e.g. we can remove Go
related scripts, CI jobs and so on from apache/arrow). Let's
use apache/arrow-go.

If nobody objects to splitting Apache Arrow Go out to
apache/arrow-go this week, I'll start working on this
next week. (Do we need a vote for this?)


Thanks,
-- 
kou

In <cah4123zxadcug6yrkz2mxupke1muftyrvhg0hh1bqck5fw+...@mail.gmail.com>
  "Re: [DISCUSS] Split Go release process" on Mon, 22 Jul 2024 20:47:57 -0400,
  Matt Topol <zotthewiz...@gmail.com> wrote:

> Hey Kou,
> 
> https://go.dev/doc/modules/release-workflow is the standard docs for
> developing module versioning and publishing with Go.
> 
> There isn't really a way to alias an import path to a different git repo
> because it uses the GitHub URL itself as the import path.
> 
> But it does seem like people seem to prefer the idea of shifting the Go
> implementation to its own repository. I'd still push for us to include the
> major version number in the import path, and since we'll have fewer major
> releases and more minor releases, users shouldn't have to update their
> import paths as frequently.
> 
> --Matt
> 
> On Mon, Jul 22, 2024, 8:37 PM Sutou Kouhei <k...@clear-code.com> wrote:
> 
>> Hi,
>>
>> >                  Kou, is your plan also counting on moving the
>> > specific nightlies there and removing them from the main repo?
>>
>> Yes. I should have mentioned it explicitly.
>>
>> We will remove most Go related CI jobs from apache/arrow. We
>> will keep Go in integration test CI jobs like we do for
>> apache/arrow-rs.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In <cad1rbrr2vtxaunppfrrjgfd+ofca3q4f+yr6npku4ttzlx2...@mail.gmail.com>
>>   "Re: [DISCUSS] Split Go release process" on Fri, 19 Jul 2024 17:14:25 +0200,
>>   Raúl Cumplido <raulcumpl...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > The conversation around more frequent minor releases and version split
>> > per component has been a long one.
>> >
>> > I am in favour of these changes for the Go implementation because we
>> > have several maintainers.
>> >
>> > It might be difficult to release other implementations that do not
>> > have the same amount of maintainers. I am not sure what our plan is if
>> > one of the split implementations has less maintainers and there's a
>> > requirement for a release (i.e. security fix) but that might be
>> > something to consider in the future.
>> >
>> >> I would defer to Raul and Jacob to corroborate this, but because
>> >> changes to the CI configuration and release verification scripts don't
>> >> affect other implementations, I have been able to maintain that
>> >> infrastructure myself without too much effort and don't have to lean
>> >> on them for anything except reviews.
>> >
>> > I think releasing and maintaining release scripts / verifications per
>> > component is much easier than for the mono repo. We currently have
>> > over 200 nightly CI jobs in the mono repo that are required to pass
>> > before releasing. Moving some of those to its own repo helps
>> > maintainability. Kou, is your plan also counting on moving the
>> > specific nightlies there and removing them from the main repo?
>> >
>> > I would be in favour of doing a new major release (v18) once the repo
>> > and the changes are in-place to update the import path to something
>> > like:
>> > github.com/apache/arrow-go/v18
>> >
>> > This would avoid confusion with previous releases. We can then follow
>> > up with patch/minor/major as required.
>> >
>> > I am also happy to help with the releases and infrastructure if
>> > necessary as I've done with the main Arrow one (I can also help on
>> > nanoarrow, adbc if necessary).
>> >
>> > Kind regards,
>> > Raul
>> >
>> >
>> >
>> >>
>> >> [1] https://github.com/apache/arrow-nanoarrow/pull/557
>> >>
>> >> On Thu, Jul 18, 2024 at 7:53 PM Matt Topol <zotthewiz...@gmail.com> wrote:
>> >> >
>> >> > Part of the goal of splitting out the release processes is that we'd be
>> >> > able to do minor version releases more frequently instead of major
>> >> > version releases.
>> >> >
>> >> > The general convention in the Go community is to include a major version
>> >> > "v#" in the import path for all major versions past v1 so that if there's
>> >> > a breaking change, it's explicit and prevents potential issues from
>> >> > different major versions being used simultaneously. Being able to do
>> >> > minor version releases more frequently would lead to not having to change
>> >> > the import paths every 3-6 months, but only if we actually do a breaking
>> >> > change.
>> >> >
>> >> > On Thu, Jul 18, 2024, 3:55 PM George Godik <ggo...@gmail.com> wrote:
>> >> >
>> >> > > > If we shift the Go lib to a new/different import
>> >> > > > path we'll end up with the same problem where people will rely on older
>> >> > > > versions and an incorrect path.
>> >> > >
>> >> > > Major version upgrades already require changing the import paths by
>> >> > > increasing the version. The proposed change would require everyone to go
>> >> > > through a similar process one last time.
>> >> > >
>> >> > > > More to the point, there would be the question of whether or not we
>> >> > > > should port over the same major version
>> >> > > > number, i.e. `github.com/apache/arrow-go/v17` or something to that end?
>> >> > > > Or do we restart back at v1 (which I think would be confusing)?
>> >> > >
>> >> > > My vote - for whatever it's worth - would be to do away with the
>> >> > > version-in-path naming convention and relying on the go version/package
>> >> > > system for major upgrades.
>> >> > >
>> >> > > Benefits: I don't have to change import paths every 3-6months
>> >> > >
>> >> > > On Thu, Jul 18, 2024 at 3:34 PM Matt Topol <zotthewiz...@gmail.com> wrote:
>> >> > >
>> >> > > > My thoughts:
>> >> > > >
>> >> > > > > * Go doesn't depend on other components such as C++
>> >> > > > > * Go has some active PMC member (Matt) and committer (Joel)
>> >> > > > >   * Could you become a release manager for Go?
>> >> > > >
>> >> > > > I'd happily be the release manager for the Go implementation.
>> >> > > >
>> >> > > > > Here is my idea how to proceed this:
>> >> > > > >
>> >> > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
>> >> > > > >     apache/arrow-rs
>> >> > > > >     * Filter go/ related commits from apache/arrow and create
>> >> > > > >       apache/arrow-go with them like we did for apache/arrow-rs
>> >> > > > >     * Remove go/ related codes from apache/arrow
>> >> > > > > 2. Prepare integration test CI like apache/arrow-rs does:
>> >> > > > >    https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
>> >> > > > > 3. Prepare release script based on apache/arrow-julia,
>> >> > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
>> >> > > >
>> >> > > > Personally I would prefer that we do not extract it to its own separate
>> >> > > > repository purely because I don't want to change the import path for
>> >> > > > users again. We already have this issue from before we introduced the
>> >> > > > major version into the import path and shifted it up to allow for the
>> >> > > > Parquet lib in the same repository. If you look at [1] you see that
>> >> > > > there's still over 100 projects that never upgraded to v6 or higher
>> >> > > > because they are still using the old import path. If we shift the Go lib
>> >> > > > to a new/different import path we'll end up with the same problem where
>> >> > > > people will rely on older versions and an incorrect path.
>> >> > > >
>> >> > > > If we as a community decide that splitting out the implementations all
>> >> > > > into separate repositories is the best way forward, I won't hold it up by
>> >> > > > strictly hammering on this. I'm just concerned about the realities and
>> >> > > > difficulties of communicating the import path change, ensuring we don't
>> >> > > > break any consumers, and ensuring that users still end up being able to
>> >> > > > upgrade easily.
>> >> > > >
>> >> > > > > The import path could be "github.com/apache/arrow-go" instead of
>> >> > > > > "github.com/apache/arrow-go/arrow". Since go will allow users to use
>> >> > > > > `arrow.Abc` directly if user imports `github.com/apache/arrow-go`.
>> >> > > >
>> >> > > > The import path would still have to be
>> >> > > > `github.com/apache/arrow-go/arrow` since it would also contain the
>> >> > > > parquet implementation in `github.com/apache/arrow-go/parquet`. More to
>> >> > > > the point, there would be the question of whether or not we should port
>> >> > > > over the same major version number, i.e.
>> >> > > > `github.com/apache/arrow-go/v17` or something to that end? Or do we
>> >> > > > restart back at v1 (which I think would be confusing)?
>> >> > > >
>> >> > > > --Matt
>> >> > > >
>> >> > > > [1]: https://pkg.go.dev/github.com/apache/arrow/go/arrow
>> >> > > >
>> >> > > > On Thu, Jul 18, 2024 at 7:33 AM Antoine Pitrou <anto...@python.org> wrote:
>> >> > > >
>> >> > > > >
>> >> > > > > Hi Kou,
>> >> > > > >
>> >> > > > > Le 18/07/2024 à 11:33, Sutou Kouhei a écrit :
>> >> > > > > >
>> >> > > > > > Here is my idea how to proceed this:
>> >> > > > > >
>> >> > > > > > 1. Extract go/ in apache/arrow to apache/arrow-go like
>> >> > > > > >     apache/arrow-rs
>> >> > > > > >     * Filter go/ related commits from apache/arrow and create
>> >> > > > > >       apache/arrow-go with them like we did for apache/arrow-rs
>> >> > > > > >     * Remove go/ related codes from apache/arrow
>> >> > > > > > 2. Prepare integration test CI like apache/arrow-rs does:
>> >> > > > > >    https://github.com/apache/arrow-rs/blob/master/.github/workflows/integration.yml
>> >> > > > > > 3. Prepare release script based on apache/arrow-julia,
>> >> > > > > >     apache/arrow-adbc and/or apache/arrow-flight-sql-postgresql
>> >> > > > >
>> >> > > > > I think this is a good idea, but I'm not part of the Go maintainers.
>> >> > > > >
>> >> > > > > > Cons of this idea:
>> >> > > > > >
>> >> > > > > > * This is a backward incompatible change
>> >> > > > > >    * Users need to change their "import" to
>> >> > > > > >      "github.com/apache/arrow-go/arrow" from
>> >> > > > > >      "github.com/apache/arrow/go/arrow"
>> >> > > > >
>> >> > > > > Is there no way to leave some kind of alias or redirection in the
>> >> > > > > apache/arrow repository?
>> >> > > > >
>> >> > > > > Regards
>> >> > > > >
>> >> > > > > Antoine.
>> >> > > > >
>> >> > > >
>> >> > >
>>
