Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21563 )

Change subject: IMPALA-13194: Fast-serialize position delete records
......................................................................


Patch Set 2:

(16 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/21563/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21563/1//COMMIT_MSG@13
PS1, Line 13: e
> Nit: tuples?
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.h@308
PS1, Line 308:   /// A mapping from Channel to IcebergPositionDeleteChannel. 
Only used in DIRECTED mode
> Could add a comment that describes this variable.
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@234
PS1, Line 234:  private:
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@633
PS1, Line 633:
> I think it would be cleaner if we returned OutboundRowBatch*. AFAICS KrpcDa
TransmitData() requires a unique_ptr so it can swap outbound batches.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@760
PS1, Line 760: }
> Can you add some comments about the role of this class?
Added comment. Also moved out most of the contents to 
IcebergPositionDeleteCollector.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@797
PS1, Line 797:     return Status::OK();
> Can channel_->RowBatchCapacity() ever be 0 at L775? If it can then this che
RowBatchCapacity() is always at least 1 (there is a max(1, <expr>) in its 
body), otherwise the channels couldn't send any rows.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@855
PS1, Line 855:     }
> It should compress if the channel is not local.
Ah, thanks for catching this.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@857
PS1, Line 857:  (IsDirectedM
> What if 'tuple_data_size' is 0? In this case 'tuple_data' may be a nullptr
Ubsan::MemSet() handles exactly this case.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@943
PS1, Line 943:  i64
> I think writing the actual type (Channel*) is easier to read.
Actual type would be 'unique_ptr<impala::KrpcDataStreamSender::Channel>&', I'm 
not sure if that would be an improvement in readability in this otherwise 
trivial case.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@945
PS1, Line 945:                  i64 %seed) #49 {
> Could extract into a variable before the loop.
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@1282
PS1, Line 1282:
> Not changed in this patch, but it should be 'tuple'.
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/outbound-row-batch.h
File be/src/runtime/outbound-row-batch.h:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/outbound-row-batch.h@72
PS1, Line 72:   Status Finalize(int num_tuples_per_row, TrackedString* 
compression_scratch);
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/row-batch.cc@272
PS1, Line 272:   // Switch to using full deduplication in cases where severe 
size blow-ups are known to
> +1 for moving these to another outbound-row-batch.cc
Thanks, to make it more safe I've replaced it with an 
OutboundRowBatch::Finalize() method.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h
File be/src/runtime/string-value.h:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@200
PS1, Line 200: inline std::size_t hash_value(const StringValue& v) {
> Not changed in this patch, but how does this work with small strings? Strin
Ptr() returns a pointer to the data, and Len() returns the actual length, so it 
should work well with small strings.

SimpleStrings wouldn't make any difference as their 'ptr' and 'len' members 
would have the same value as Ptr() and Len(). SimpleStrings are only used in 
cases where they make the code more readable or optimal.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@204
PS1, Line 204: std::ostream& operator<<(std::ostream& os, const StringValue& 
string_value);
> Do you think we should consider specialising std::hash for StringValue or s
Yeah, I think we should. We already do that for boost containers (see 
hash_value above), so I don't see why we shouldn't for std containers. Done.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@206
PS1, Line 206:
> Do we need this qualifier?
Done



--
To view, visit http://gerrit.cloudera.org:8080/21563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6095f318e3d06dedb4197681156b40dd2a326c6f
Gerrit-Change-Number: 21563
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Sun, 14 Jul 2024 15:10:48 +0000
Gerrit-HasComments: Yes

Reply via email to