I believe Spark 4.0 was mentioned before. It’ll require Java 17 and will be released in a few months (June?).
Best regards,
Adam Lippai

On Tue, Dec 5, 2023 at 12:05 David Li <lidav...@apache.org> wrote:

> Thanks James for delving into this mess.
>
> It looks like this change is unavoidable if we want to modularize? I think
> this is OK. Will the CLI argument change as we continue modularizing, or is
> this the only change that will be needed?
>
> On Mon, Dec 4, 2023, at 20:07, James Duong wrote:
> > Hello,
> >
> > I did some work to separate the below PR into smaller PRs.
> >
> >   * Updating the versions of dependencies and maven plugins is done
> >     and merged into master.
> >   * I separated out the work modularizing arrow-vector,
> >     arrow-memory-core/unsafe, and arrow-memory-netty.
> >
> > Modularizing arrow-memory-core requires a smaller change to user
> > command-line arguments. Instead of:
> >
> > --add-opens=java.base/java.nio=ALL-UNNAMED
> >
> > The user needs to add:
> >
> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED
> >
> > I initially tried to modularize arrow-vector separately from
> > arrow-memory-core but found that any meaningful operation in
> > arrow-vector would trigger an illegal access in memory-core if it
> > wasn’t modularized.
> >
> > I was able to run the tests for arrow-compression and arrow-tools
> > successfully after modularizing memory-core, memory-unsafe, and
> > arrow-vector. Note that I had more success by making memory-core and
> > memory-unsafe automatic modules.
> >
> > I think we should make a decision here on whether we want to bite the
> > bullet and introduce a breaking user-facing change around command-line
> > options. The other option is to wait for JDK 21 to modularize. That’s
> > farther down the line and requires refactoring much of the memory
> > module code and implementing a module using the foreign memory
> > interface.
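
For anyone who wants to confirm that the --add-opens argument quoted above actually took effect in a running JVM, here is a minimal sketch (Java 9+). It asks java.base whether the java.nio package is open to the calling module, using Module.isOpen. The class name NioOpensCheck and its helper methods are made up for illustration and are not part of Arrow.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Arrow: reports whether java.base opens
// the java.nio package to a given module for deep reflection.
final class NioOpensCheck {

    // True if java.base opens java.nio to 'target'.
    static boolean isJavaNioOpenTo(Module target) {
        Module javaBase = ByteBuffer.class.getModule();
        return javaBase.isOpen("java.nio", target);
    }

    // Human-readable label; classes loaded from the classpath live in an
    // unnamed module, which the command-line flag calls ALL-UNNAMED.
    static String label(Module m) {
        return m.isNamed() ? m.getName() : "ALL-UNNAMED";
    }

    public static void main(String[] args) {
        Module self = NioOpensCheck.class.getModule();
        System.out.println("java.nio open to " + label(self) + ": "
                + isJavaNioOpenTo(self));
    }
}
```

Run from the classpath with --add-opens=java.base/java.nio=ALL-UNNAMED, this should report the package as open. Without the flag, the answer depends on the JDK version and its illegal-access defaults.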
> >
> > From: James Duong <james.du...@improving.com.INVALID>
> > Date: Tuesday, November 28, 2023 at 6:48 PM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> > Module System
> >
> > Hi,
> >
> > I’ve made some major progress on this work in this PR:
> > https://github.com/apache/arrow/pull/38876
> >
> >   * The maven plugin for compiling module-info.java files using JDK 8
> >     is working correctly.
> >   * arrow-format, arrow-memory-core, arrow-memory-netty,
> >     arrow-memory-unsafe, and arrow-vector have been modularized
> >     successfully.
> >   * Tests pass locally for all of these modules.
> >   * They fail in CI. This is likely from me not updating a profile
> >     somewhere.
> >
> > Similar to David’s PR from below, arrow-memory and modules needed to be
> > refactored fairly significantly and split into two modules: a
> > public-facing JPMS module and a separate module which adds to Netty’s
> > packages (memory-netty-buffer-patch). What’s more problematic is that
> > because we are using named modules now, users need to add more
> > arguments to their Java command line to use arrow. If one were to use
> > arrow-memory-netty they would need to add the following:
> >
> > --add-opens java.base/jdk.internal.misc=io.netty.common
> > --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar
> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED
> >
> > Depending on where the memory-netty-buffer-patch JAR is located, and
> > what version, the command the user needs to supply changes, so this
> > seems like it’d be really inconvenient.
> >
> > Do we want to proceed with modularizing existing memory modules? Both
> > netty and unsafe? Or wait until the new memory module from Java 21 is
> > available?
> >
> > The module-info.java files are written fairly naively.
> > I haven’t inspected thoroughly to determine what packages users will
> > need.
> >
> > We can continue modularizing more components in a separate PR. Ideally
> > all the user breakage (class movement, new command-line argument
> > requirements) happens within one major Arrow version.
> >
> > From: James Duong <james.du...@improving.com.INVALID>
> > Date: Tuesday, November 21, 2023 at 1:16 PM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> > Module System
> >
> > I’m following up on this topic.
> >
> > David has a PR from last year that’s done much of the heavy lifting for
> > refactoring the codebase to be package-friendly.
> > https://github.com/apache/arrow/pull/13072
> >
> > What’s changed since and what’s left:
> >
> >   * New components have been added (Flight SQL for example) that will
> >     need to be updated for modules.
> >   * There wasn’t a clear solution on how to do this without breaking
> >     JDK 8 support. Compiling module-info.java files requires JDK 9, but
> >     using JDK 9 breaks the JDK 8 methods of accessing sun.misc.Unsafe.
> >       * There is a Gradle plugin that can compile module-info.java
> >         files purely syntactically that we can adapt to maven. It has
> >         limitations (the one I see is that it can’t iterate through
> >         classloaders to handle annotations), but using this might be a
> >         good stopgap until JDK 8 support is deprecated.
> >   * Some plugins need to be updated:
> >       * maven-dependency-plugin 3.0.1 can’t parse module-info.class
> >         files.
> >       * checkstyle 3.1.0 can’t parse module-info.java files. Our
> >         existing checkstyle rules file can’t be loaded with newer
> >         versions. We can exclude module-info.java for now and have a
> >         separate Issue for updating checkstyle itself and the rules
> >         file.
> >   * grpc-java could not be modularized when the PR above was written.
> >       * Grpc 1.57 now can be modularized
> >         (grpc/grpc-java#3522 <https://github.com/grpc/grpc-java/issues/3522>)
> >
> > From: David Dali Susanibar Arce <davi.sar...@gmail.com>
> > Date: Wednesday, May 25, 2022 at 5:02 AM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module
> > System
> >
> > Hi All,
> >
> > This email's purpose is a request for comments to migrate Arrow Java to
> > JPMS Java Platform Module System
> > <https://openjdk.java.net/projects/jigsaw/spec/> JSE 9+ (1).
> >
> > Current status:
> >
> > - Arrow Java uses the JSE1.8 specification
> >
> > - Arrow Java works with JSE1.8/9/11/17
> >
> > - This is possible because Java offers “legacy mode”
> >
> > Proposal:
> >
> > Migrate to JPMS Java Platform Module System. This Draft PR
> > <https://github.com/apache/arrow/pull/13072> (2) contains an initial
> > port of the modules: Format / Memory Core / Memory Netty / Memory
> > Unsafe / Vector for evaluation.
> >
> > Main Reason to migrate:
> >
> > - JPMS offers Strong encapsulation, Well-defined interfaces
> > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit
> > dependencies.
> > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4)
> >
> > - JPMS offers reliable configuration and security to hide platform
> > internals.
> >
> > - JPMS offers a partial solution to solve problems about read (80%) /
> > write (20%) code.
> >
> > - JPMS offers optimization for readability about read/write ratio
> > (90/10) thru module-info.java.
> >
> > - Consistency logs, JPMS implement consistency logs to really use that
> > to solve the current problem.
> >
> > - Be able to customize the JRE needed with only the modules needed (not
> > java.desktop for example and others) thru JLink.
> >
> > - Modules have also been implemented by other languages such as
> > JavaScript (ES2015), C++ (C++20), .NET (NuGet/.NET Core).
> >
> > - Consider taking a look at this discussion about pros/cons
> > <https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/>
> > (5).
> >
> > - Eventual migration to JPMS is a practical necessity as more projects
> > migrate.
> >
> > Effort:
> >
> > - First of all we need to decide to move from JSE1.8 to JSE9+ or be
> > able to offer support for both jar components JSE1.8 and JSE9+
> > included.
> >
> > - Go bottom up for JPMS.
> >
> > - Packages need to be unique (i.e. org.apache.arrow.memory /
> > io.netty.buffer). Review Draft PR with initial proposal.
> >
> > - Dependencies also need to be modularized. If some of our current
> > dependencies are not able to be used as a module this will be a blocker
> > for our modules (we could patch that but this is an extra effort).
> >
> > Killers:
> >
> > - FIXME! I need your support to identify killer reasons to be able to
> > push this implementation.
> >
> > Please let us know if Arrow Java to JPMS Java Platform Module System is
> > needed and should be implemented.
> >
> > Please use this file for any comments
> > https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing
> >
> > Resources used:
> >
> > (1): https://openjdk.java.net/projects/jigsaw/spec/
> >
> > (2): https://github.com/apache/arrow/pull/13072
> >
> > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
> >
> > (4): https://github.com/nipafx/demo-jigsaw-reflection
> >
> > (5): https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
> >
> > Best regards,
> >
> > --
> > David
>
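
To make the module-info.java and automatic-module discussion in this thread a bit more concrete, here is a rough sketch that uses java.lang.module.ModuleDescriptor to build, in memory, the kind of descriptor a module-info.java declares, next to an automatic module for comparison. The module name org.apache.arrow.memory.core and the package org.apache.arrow.memory both appear in the thread. The dependence on jdk.unsupported (the JDK module that contains sun.misc.Unsafe) and the class name DescriptorSketch are assumptions for illustration only, not Arrow's actual descriptor.

```java
import java.lang.module.ModuleDescriptor;

// Hypothetical illustration, not Arrow's real module descriptor.
final class DescriptorSketch {

    // An explicit (normal) module: dependences and exports are spelled out,
    // as they would be in a module-info.java file.
    static ModuleDescriptor explicitModule() {
        return ModuleDescriptor.newModule("org.apache.arrow.memory.core")
                .requires("java.base")
                .requires("jdk.unsupported")        // assumed dependence
                .exports("org.apache.arrow.memory") // package named in thread
                .build();
    }

    // An automatic module: only a name is declared; the descriptor itself
    // carries no explicit exports.
    static ModuleDescriptor automaticModule() {
        return ModuleDescriptor.newAutomaticModule("org.apache.arrow.memory.core")
                .build();
    }

    // True if the descriptor explicitly exports the given package.
    static boolean exportsPackage(ModuleDescriptor d, String pkg) {
        return d.exports().stream().anyMatch(e -> e.source().equals(pkg));
    }

    public static void main(String[] args) {
        ModuleDescriptor explicit = explicitModule();
        ModuleDescriptor automatic = automaticModule();
        System.out.println(explicit.name() + " automatic? " + explicit.isAutomatic());
        System.out.println(automatic.name() + " automatic? " + automatic.isAutomatic());
        System.out.println("explicit exports org.apache.arrow.memory? "
                + exportsPackage(explicit, "org.apache.arrow.memory"));
    }
}
```

The explicit form is where the strong encapsulation argued for in the original proposal comes from, since only listed packages are exported. The automatic form trades that away for an easier migration path.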