Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21932 )
Change subject: IMPALA-13509: Copy rows directly to OutboundRowBatch during hash partitioning ...................................................................... Patch Set 12: (5 comments) Great improvement! http://gerrit.cloudera.org:8080/#/c/21932/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21932/12//COMMIT_MSG@67 PS12, Line 67: TPCH benchmarks improved significantly Nice improvements! Did you also measure TPC-DS? http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/descriptors.h@845 PS12, Line 845: int IR_NO_INLINE num_tuples_no_inline() const { return tuple_desc_map_.size(); } nit: Could you please add comment to this function? http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender-ir.cc File be/src/runtime/krpc-data-stream-sender-ir.cc: http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender-ir.cc@35 PS12, Line 35: while (row_idx < num_rows) { : int row_count = 0; : FOREACH_ROW_LIMIT(batch, row_idx, RowBatch::HASH_BATCH_SIZE, row_batch_iter) { : TupleRow* row = row_batch_iter.Get(); : channel_ids[row_count++] = HashRow(row, exchange_hash_seed_) % num_channels; : } : row_count = 0; : FOREACH_ROW_LIMIT(batch, row_idx, RowBatch::HASH_BATCH_SIZE, row_batch_iter) { : int channel_id = channel_ids[row_count++]; : PartitionRowCollector& collector = partition_row_collectors_[channel_id]; : RETURN_IF_ERROR(collector.AppendRow(row_batch_iter.Get(), row_desc_)); : } : row_idx += row_count; : } This code is a bit subtle due to the micro-optimization introduced by https://gerrit.cloudera.org/#/c/9221/ Probably it'd worth a short comment and pointer to the original Jira (IMPALA-6461). http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender.h File be/src/runtime/krpc-data-stream-sender.h: http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender.h@184 PS12, Line 184: TransmiteData typo http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/outbound-row-batch.inline.h File be/src/runtime/outbound-row-batch.inline.h: http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/outbound-row-batch.inline.h@76 PS12, Line 76: &tuple_data_[0]) + tuple_data_.size() Maybe 'tuple_data_.back()' is simpler. -- To view, visit http://gerrit.cloudera.org:8080/21932 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81a16c2f0fcfc1f3adef7077b3932a29a0f15a8f Gerrit-Change-Number: 21932 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 08 Nov 2024 17:01:02 +0000 Gerrit-HasComments: Yes
