Hi everyone,

I noticed that the Gandiva removal PR/commit [1] did more than it probably
should have. For instance, it implicitly and completely reverted the fixes for
CALCITE-7562 [2], CALCITE-7567 [3], CALCITE-7578 [4], and CALCITE-7588 [5].

Was it intentional?

[1] https://github.com/apache/calcite/pull/4992
[2] https://issues.apache.org/jira/browse/CALCITE-7562
[3] https://issues.apache.org/jira/browse/CALCITE-7567
[4] https://issues.apache.org/jira/browse/CALCITE-7578
[5] https://issues.apache.org/jira/browse/CALCITE-7588

On Thu, Jun 4, 2026 at 4:53 PM Julian Hyde <[email protected]> wrote:
>
> +1
>
> > On Jun 4, 2026, at 12:57 AM, Alessandro Solimando
> > <[email protected]> wrote:
> >
> > If it's not supported anymore it's already a good enough argument to
> > replace it.
> >
> > The proposed plan looks reasonable to me.
> >
> > Alessandro
> >
> >> On Thu, Jun 4, 2026, 09:45 jensen <[email protected]> wrote:
> >>
> >> I vaguely recall that Arrow's official website also discourages the use of
> >> Gandiva, and Gandiva is no longer maintained. If possible, I think we
> >> should remove this dependency.
> >>
> >>
> >>
> >> Best regards,
> >>
> >> Zhen
> >>
> >> ---- Replied Message ----
> >> | From | Cancai Cai<[email protected]> |
> >> | Date | 6/4/2026 14:20 |
> >> | To | <[email protected]> |
> >> | Subject | Subject: [DISCUSS] Removing Gandiva from the Arrow adapter |
> >> Hi all,
> >>
> >> I would like to discuss the future of Gandiva in Calcite's Arrow adapter.
> >> My preferred long-term direction is to remove the Gandiva dependency from
> >> the adapter.
> >>
> >> The current adapter uses Arrow Java to read Arrow data, but relies on
> >> Gandiva `Projector` and `Filter` for projection and filter execution.
> >> Gandiva is a native LLVM-based runtime, and the Java module is a wrapper
> >> around that native implementation. As a result, basic Arrow adapter queries
> >> depend on native libraries, LLVM compatibility, platform packaging, and JDK
> >> baseline details.
> >>
> >> This has become a practical maintenance problem when thinking about Arrow
> >> dependency upgrades.
> >>
> >> The upgrade problem is not limited to Java bytecode compatibility. In our
> >> experiments, newer Arrow versions failed at different layers. Arrow 18
> >> requires a newer Java baseline than Calcite currently supports in its JDK 8
> >> jobs. Arrow 17 and 16.1 still use Java 8 class files, but can hit Java
> >> runtime API incompatibilities on JDK 8, such as `ByteBuffer.flip():
> >> ByteBuffer`. Arrow 16.0 avoids that Java runtime issue, but exposed Gandiva
> >> native / LLVM symbol issues on Linux CI.
> >>
> >> This means that as long as `arrow-gandiva` is required for the adapter's
> >> correctness path, upgrading the Arrow Java vector layer also requires
> >> validating the native Gandiva stack across all CI platforms. Even when
> >> `arrow-vector` itself is usable, `arrow-gandiva` can still block the
> >> upgrade.
> >>
> >> For that reason, I think the adapter should make projection/filter
> >> correctness independent of Gandiva first. Once the Java correctness path is
> >> in place, Arrow vector upgrades can be evaluated separately from Gandiva
> >> native compatibility.
> >>
> >> The direction I have in mind is a pure Java correctness path for the Arrow
> >> adapter:
> >>
> >> * read Arrow data with `ArrowFileReader`, `VectorSchemaRoot`, and
> >> `ValueVector`;
> >> * execute simple projections by reading selected vectors directly;
> >> * execute the simple filters currently translated by `ArrowTranslator` with
> >> a Java evaluator;
> >> * leave expressions that are not pushed into the adapter to Calcite's
> >> normal Enumerable / code generation path.
> >>
> >> With that model, Gandiva would no longer be required for correctness. A
> >> staged migration could be:
> >>
> >> 1. Move no-filter simple projection away from Gandiva.
> >> 2. Add Java evaluation for the simple filter subset currently supported by
> >> `ArrowTranslator`.
> >> 3. Validate that existing Arrow adapter tests pass without invoking
> >> Gandiva.
> >> 4. Remove `arrow-gandiva` from the adapter dependency set once the Java
> >> path covers the current behavior.
> >>
> >> The tradeoff is that Gandiva may be faster for supported expressions. But
> >> for this adapter, I think correctness, portability, and dependency
> >> stability should come first. If acceleration is needed later, it can be
> >> discussed separately.
> >>
> >> Does this direction make sense to the community? Are there current use
> >> cases that depend on Gandiva pushdown strongly enough that we should keep
> >> the native dependency?
> >>
> >> Thanks,
> >> Cancai
> >>

-- 
Best regards,
Sergey
