alamb commented on issue #18070: URL: https://github.com/apache/datafusion/issues/18070#issuecomment-3416788644
Fascinating: it turns out to have nothing to do with `array_has`. It is somehow related to the NestedLoopJoin.

1206019e1ad1fb923f85e3c62335de5ef75c683a / https://github.com/apache/datafusion/pull/16996 is the first bad commit.

I will post a note on that ticket.

```shell
1206019e1ad1fb923f85e3c62335de5ef75c683a is the first bad commit
commit 1206019e1ad1fb923f85e3c62335de5ef75c683a (HEAD)
Author: Yongting You <[email protected]>
Date:   Fri Aug 15 02:20:16 2025 +0800

    Rewrite Nested Loop Join executor for 5× speed and 1% memory usage (#16996)

    * Rewrite NestedLoopJoin for better performance and memory efficiency

    ---------

    Co-authored-by: Matt Butrovich <[email protected]>

 benchmarks/README.md                                   |   78 +++--
 benchmarks/bench.sh                                    |   18 ++
 benchmarks/src/bin/dfbench.rs                          |    4 +-
 benchmarks/src/lib.rs                                  |    1 +
 benchmarks/src/nlj.rs                                  |  264 +++++++++++++++++
 datafusion/physical-plan/src/joins/nested_loop_join.rs | 2018 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------
 datafusion/physical-plan/src/joins/utils.rs            |   17 ++
 datafusion/sqllogictest/test_files/joins.slt           |    6 +-
 8 files changed, 1634 insertions(+), 772 deletions(-)
 create mode 100644 benchmarks/src/nlj.rs
```
