Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22737 )

Change subject: IMPALA-13934: Do quick pointer comparison in IcebergDeleteBuilder
......................................................................

IMPALA-13934: Do quick pointer comparison in IcebergDeleteBuilder

Since IMPALA-13194, file paths are deduplicated in the serialized
position delete records. Therefore we can do a quick pointer-based
comparison of subsequent position delete records instead of the
costly string compare.

If the pointers don't match, we still need to check the strings for
equality, because position delete records coming from different senders
can be coalesced into a single row batch by the EXCHANGE RECEIVER.

Measurements
Data table had ~1 trillion data records and ~68 billion position
delete records.
Average time spent in the IcebergDeleteBuilder:
+------------+----------+-----------+
| Node count | Original | Optimized |
+------------+----------+-----------+
| 5          | 12m11s   | 9m47s     |
| 10         | 6m2s     | 5m        |
| 20         | 3m1s     | 2m30s     |
| 40         | 1m30s    | 1m15s     |
+------------+----------+-----------+

It's essential to optimize the builder, as it blocks all the probe
threads of the IcebergDeleteNode.

Testing
 * no behaviour change, existing tests can be used

Change-Id: Ie171f912a5518b6e6a445efba9d39748ecec5a36
---
M be/src/exec/iceberg-delete-builder.cc
1 file changed, 1 insertion(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/22737/1
--
To view, visit http://gerrit.cloudera.org:8080/22737
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie171f912a5518b6e6a445efba9d39748ecec5a36
Gerrit-Change-Number: 22737
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>