Hello everyone,

We have discussed this in various places, but we always concluded that, because the R package is a set of bindings to libarrow and has to track it closely, it was more convenient to stay in the monorepo, as Neal said.

I think that, given the reduced development velocity and the maturity of the R package, now is a good time to re-evaluate. As Kou said, the R package already has code to manage backwards compatibility, currently with 15.0.2 as the minimum. So we could stop targeting libarrow-dev as the default version arrow-r is built against and target the latest release instead.

Catching issues caused by API changes would be harder (this is where our previous discussions about more detailed, informative versioning for libarrow come into play), but we would also no longer need to catch up and fix things before the next monorepo release (or, as often happens, during the R post-release step).

Maybe the biggest benefit: contributing would be much less intimidating with a single repo that contains only R, matching what people used to R packages expect. It would then also be easier to ask CRAN to install a compatible version, potentially making builds easier (though the need for a Windows arm64 cross-compiled build would remain...). As mentioned, we already have the compat code, so it would be possible to do this now, but I guess a bit more decoupling would incentivise it.

IIRC, with 4.1 being our new minimum version, once the new R version is released in ~April we might be able to retire a number of jobs and clean up (i.e. the openssl 1 jobs, as that has reached EOL).

I don't think there is a clear choice, and we could very well also do some of the things in the monorepo, i.e. decoupling releases and versioning from libarrow. But given the non-trivial amount of work this would be :shrug:

Jacob

On Mon, Mar 3, 2025 at 15:44, Antoine Pitrou <anto...@python.org> wrote:
>
>
> I agree with Neal that the decoupling is less obviously desirable on the
> R side. About the number of R-related CI jobs, is there still a need for
> testing so many different configurations?
>
>
> On 03/03/2025 at 15:32, Neal Richardson wrote:
> > Thanks for raising this, Kou.
> > I'm personally torn on this because I see
> > some of the upsides of splitting R out, particularly at the project's state
> > of maturity, but it's also not as simple as Rust or Java or others we've
> > split out in the past because of the hard dependency on the C++ libraries.
> > It's not just about integration testing the IPC format.
> >
> > For better or worse, I can remember tons of instances where the R-related
> > CI jobs have caught something in a C++ PR because they test with different
> > compilers and toolchains. If we split the projects and all of the R CI jobs
> > are only running on the arrow-r repo, does that mean that the R maintainers
> > will be continually finding CI failures in the tests that build with the
> > latest version of the C++ library and filing bug reports back for the C++
> > project? Either way, it seems that the monorepo would want to keep some R
> > testing jobs in crossbow to be able to validate changes, at least to be
> > able to confirm that the PR for the issue that the R maintainers filed
> > fixes the issue. Maybe this is the way it should be, but it's not clear
> > that it reduces the collective maintenance burden.
> >
> > Just my thoughts based on the historical perspective, I'm happy to defer to
> > the judgment of those who are currently shouldering that maintenance burden.
> >
> > Neal
> >
> > On Sun, Mar 2, 2025 at 7:53 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> >> Hi,
> >>
> >> This is a similar discussion to the "[DISCUSS] Split Go
> >> release process" thread[1] and the "[DISCUSS] Split Java
> >> release process" thread[2]:
> >>
> >> [1] https://lists.apache.org/thread/fstyfvzczntt9mpnd4f0b39lzb8cxlyf
> >> [2] https://lists.apache.org/thread/b99wp2f3rjhy09sx7jqvrfqjkqn9lnyy
> >>
> >> We've split them and they were released from separated
> >> repositories.
> >>
> >> Let's discuss the next target.
> >>
> >> We raised JavaScript as the next candidate in the Java
> >> discussion[3] but we may not find one or more active release
> >> managers for JavaScript.
> >>
> >> [3] https://lists.apache.org/thread/bdko84zy72nlg3k82t772f7pq6zpd0sz
> >>
> >> I propose R as the next candidate because:
> >>
> >> * We have many active committers and PMC members who can
> >>   focus on R
> >> * The current R release process is semi-separated
> >> * In general, we release R packages to CRAN by non-trivial
> >>   release process after our monorepo release.
> >>   e.g.: https://github.com/apache/arrow/issues/45581
> >> * The R bindings can also work with old C++ versions
> >> * The R bindings don't need to align with the monorepo
> >>   versioning. The R bindings can avoid major version up
> >>   per 3-4 months.
> >> * We have many R related CI jobs. If we split the R
> >>   bindings, we can remove many CI jobs from monorepo.
> >>
> >>
> >> What do you think about this?
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >