Joe McDonnell has uploaded a new patch set (#5) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/22371 )
Change subject: IMPALA-13660: Support caching broadcast hash joins
......................................................................

IMPALA-13660: Support caching broadcast hash joins

This extends tuple caching so that it can cache at locations above joins. As part of this, ExchangeNodes are now eligible for caching when they are broadcast or directed exchanges. Partitioned exchanges are not yet supported. Since an exchange passes data from all nodes, this change incorporates all the scan range information when passing through an exchange.

For joins with a separate build side, a cache hit above the join means that a probe-side thread will never arrive. If the builder is not notified, it waits for that thread to arrive, which extends the latency of the query significantly. This adds code to notify the builder when a thread will never participate in the probe phase.

Testing:
- Added test cases to TupleCacheTest, including with distributed plans.
- Added test cases to test_tuple_cache.py to verify behavior when updating the build-side table and the timing of a cache hit.
- Performance tests with TPC-DS at scale

Change-Id: Ic61462702b43175c593b34e8c3a14b9cfe85c29e
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/tuple-cache-node.cc
M fe/src/main/java/org/apache/impala/common/ThriftSerializationCtx.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
M fe/src/main/java/org/apache/impala/planner/TupleCachePlanner.java
M fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
M tests/custom_cluster/test_tuple_cache.py
19 files changed, 604 insertions(+), 172 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/22371/5
--
To view, visit http://gerrit.cloudera.org:8080/22371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic61462702b43175c593b34e8c3a14b9cfe85c29e
Gerrit-Change-Number: 22371
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
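
For readers unfamiliar with the builder-notification problem described in the commit message, here is a minimal sketch of the general idea. It is not the code in this patch and not Impala's JoinBuilder API. The class SharedBuilder and the methods ProbeFinished(), ProbeWillNotArrive(), and WaitForProbes() are hypothetical names chosen for illustration. The sketch assumes the builder tracks a count of expected probe threads and blocks until that count reaches zero.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a build side shared by several probe threads.
class SharedBuilder {
 public:
  explicit SharedBuilder(int num_probe_threads)
    : outstanding_probes_(num_probe_threads) {}

  // Called by a probe thread once it has finished probing.
  void ProbeFinished() { Release(); }

  // Called on behalf of a probe thread that will never probe, for example
  // because a cache hit above the join skipped that part of the plan.
  void ProbeWillNotArrive() { Release(); }

  // Builder side: block until every expected probe thread is accounted for.
  void WaitForProbes() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return outstanding_probes_ == 0; });
  }

  int outstanding_probes() {
    std::lock_guard<std::mutex> l(lock_);
    return outstanding_probes_;
  }

 private:
  // Account for one probe thread and wake the waiting builder.
  void Release() {
    {
      std::lock_guard<std::mutex> l(lock_);
      --outstanding_probes_;
    }
    cv_.notify_all();
  }

  std::mutex lock_;
  std::condition_variable cv_;
  int outstanding_probes_;
};

// Simulates one join: the first `cache_hits` probe threads never probe and
// are reported to the builder instead. Returns the remaining probe count
// after the builder has stopped waiting.
int Simulate(int num_probe_threads, int cache_hits) {
  SharedBuilder builder(num_probe_threads);
  std::vector<std::thread> threads;
  for (int i = 0; i < num_probe_threads; ++i) {
    const bool hit = i < cache_hits;
    threads.emplace_back([&builder, hit] {
      if (hit) {
        builder.ProbeWillNotArrive();
      } else {
        builder.ProbeFinished();
      }
    });
  }
  builder.WaitForProbes();
  for (auto& t : threads) t.join();
  return builder.outstanding_probes();
}
```

In this sketch, if the cache-hit threads simply returned without calling ProbeWillNotArrive(), WaitForProbes() would block indefinitely. The actual change reports the symptom as significantly extended query latency, so the real wait is presumably bounded by some other mechanism.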