There was some informal discussion about this in the biweekly Arrow community meeting on November 6. One concern raised was that implementations outside the monorepo could suffer from an "out of sight, out of mind" problem. For example, because the new apache/arrow-java repo will have far fewer community members watching it than the apache/arrow repo, there might be less public participation in issues and PRs. But I don't know whether this will really be a problem in practice. Has our experience moving the Go implementation to apache/arrow-go given us any clues about this?

Ian

On Mon, Nov 18, 2024 at 8:03 AM Neal Richardson <neal.p.richard...@gmail.com> wrote:

> +1, makes sense to me. Since many of the Arrow libraries are at a more
> stable phase of development, it makes sense that they should have a release
> process and cadence that matches, and that those that are under active
> development should be able to move ahead more freely.
>
> Neal
>
>
> On Mon, Nov 18, 2024 at 3:25 AM Raúl Cumplido <rau...@apache.org> wrote:
>
> > I am also +1 on moving the Java implementation to its own repository.
> >
> > I am also happy to help with the releases on the Java repository and,
> > as with Go, I am also happy to help with some of the migration tasks.
> >
> > As a side note, in general I tend to have more problems with
> > JavaScript when releasing, verifying, etcetera. I would love for
> > JavaScript to move to its own repository but here I think we have to
> > make a bigger effort on finding committers and PMCs to have an active
> > and stable maintenance.
> >
> > Thanks,
> > Raúl
> >
> > On Mon, Nov 18, 2024 at 9:17, David Li (<lidav...@apache.org>) wrote:
> > >
> > > While early, it seems to have gone well for Go. The ability to do
> > > more frequent point releases for Java might be interesting.
> > >
> > > Since the JNI JARs bundle the C++ libraries when built, we can build
> > > off of the latest released C++ version whenever the Java project
> > > needs to release, and not have to worry too much about maintaining
> > > runtime compatibility with multiple versions (only with making sure
> > > we aren't "caught out" by upstream changes).
> > >
> > > On Mon, Nov 18, 2024, at 17:07, Gang Wu wrote:
> > > > +1 on splitting the Java codebase!
> > > >
> > > > I'm not an active contributor/reviewer to the Java codebase,
> > > > though I have several contributions to it in the past. I can
> > > > volunteer to be a release manager on the Java side if I can help.
> > > > I have some experience in releasing orc and parquet-java in the
> > > > past.
> > > >
> > > > Best,
> > > > Gang
> > > >
> > > > On Mon, Nov 18, 2024 at 4:01 PM Antoine Pitrou <anto...@python.org> wrote:
> > > >
> > > >>
> > > >> Hi Kou,
> > > >>
> > > >> Thanks a lot for bringing this.
> > > >>
> > > >> I'm +1 on the principle, both for splitting the Java release process and
> > > >> moving the Java implementation into another repository.
> > > >>
> > > >> We do need to find more maintainers for Arrow Java, but that is true
> > > >> regardless of whether the Java implementation stays in the monorepo.
> > > >>
> > > >> (also, I don't know if David Li would like to be described as
> > > >> "Java-focused" :-))
> > > >>
> > > >> Regards
> > > >>
> > > >> Antoine.
> > > >>
> > > >>
> > > >> On 18/11/2024 at 08:55, Sutou Kouhei wrote:
> > > >> > Hi,
> > > >> >
> > > >> > This is a similar discussion to the "[DISCUSS] Split Go
> > > >> > release process" thread:
> > > >> > https://lists.apache.org/thread/fstyfvzczntt9mpnd4f0b39lzb8cxlyf
> > > >> >
> > > >> > How about splitting Java release process from other
> > > >> > apache/arrow components like apache/arrow-go? Here are
> > > >> > some reasons of this proposal:
> > > >> >
> > > >> > * The Java implementation is a native implementation not
> > > >> >   bindings
> > > >> > * Some modules are the bindings of the C++ implementation
> > > >> >   but we can support multiple C++ version if needed like
> > > >> >   the R implementation does
> > > >> > * We can simplify apache/arrow release by splitting the Java
> > > >> >   implementation
> > > >> > * We'll be able to use more minor/patch releases for the
> > > >> >   Java implementation instead of major releases like the Go
> > > >> >   implementation
> > > >> >
> > > >> > Here is my idea how to proceed this:
> > > >> >
> > > >> > 1. Extract java/ in apache/arrow to apache/arrow-java like
> > > >> >    apache/arrow-go
> > > >> >    * Filter java/ related commits from apache/arrow and create
> > > >> >      apache/arrow-java with them like we did for apache/arrow-go
> > > >> >    * Remove java/ related codes from apache/arrow
> > > >> > 2. Prepare integration test CI like apache/arrow-go does:
> > > >> >    https://github.com/apache/arrow-go/blob/main/.github/workflows/test.yml
> > > >> > 3. Prepare release script based on apache/arrow-go
> > > >> >
> > > >> > We can reuse some release scripts in dev/release/ in
> > > >> > apache/arrow like we did in apache/arrow-adbc.
> > > >> >
> > > >> > Cons of this idea:
> > > >> >
> > > >> > * There is only one active Java focused PMC member and
> > > >> >   committer: David Li
> > > >> > * We need to increase active Java focused PMC members and
> > > >> >   committers for stable maintenance
> > > >> >
> > > >> >
> > > >> > What do you think about this?
> > > >> >
> > > >> >
> > > >> > Thanks,