Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21797 )

Change subject: IMPALA-12594: Add flag to tune KrpcDataStreamSender mem estimate
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21797/2/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
File fe/src/main/java/org/apache/impala/planner/DataStreamSink.java:

http://gerrit.cloudera.org:8080/#/c/21797/2/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java@98
PS2, Line 98:     if (fixedLenRowSize==0) fixedLenRowSize = 1; // avoid division by 0
            :     long beRowsPerBuffer =
            :         Math.max(1, (long)Math.ceil(beBufferBytes/fixedLenRowSize));
> This can blow up if all columns are var-len type, and data_stream_sender_bu
As we also discussed on another channel, this shouldn't lead to many more rows
than the original batch_size of 1024.
A string (or collection) column adds 12 bytes to the fixed-len size, while 1024
rows will be used when the row size is 16 bytes (default
data_stream_sender_buffer_size=16K = 1024*16), so a single large string column
can raise the memory estimate, but not by much.
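
To make the arithmetic above concrete, here is a minimal sketch of the
rows-per-buffer calculation. The class and method names are hypothetical and
this is not the actual DataStreamSink code; it only assumes the 16K default
buffer size and the 12-byte fixed-len slot per string column mentioned above,
and it uses integer ceiling division rather than Math.ceil.

```java
class RowsPerBufferSketch {
    // Hypothetical helper mirroring the quoted snippet: how many rows fit in
    // one sender buffer, given only the fixed-length part of the row.
    static long rowsPerBuffer(long bufferBytes, long fixedLenRowSize) {
        if (fixedLenRowSize == 0) fixedLenRowSize = 1; // avoid division by 0
        // Integer ceiling division; assumes both arguments are non-negative.
        long rows = (bufferBytes + fixedLenRowSize - 1) / fixedLenRowSize;
        return Math.max(1, rows);
    }

    public static void main(String[] args) {
        long bufferBytes = 16 * 1024; // default data_stream_sender_buffer_size
        // 16-byte fixed-len rows: matches the original batch_size.
        System.out.println(rowsPerBuffer(bufferBytes, 16)); // 1024
        // A row with a single string column (12 bytes of fixed-len size).
        System.out.println(rowsPerBuffer(bufferBytes, 12)); // 1366
    }
}
```

Under these assumptions a row made of a single string column gives 1366 rows
per buffer instead of 1024, i.e. roughly a third more.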



--
To view, visit http://gerrit.cloudera.org:8080/21797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e4b1db030be934cece565e3f2634ee7cbdb7c4f
Gerrit-Change-Number: 21797
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Sat, 14 Sep 2024 09:29:09 +0000
Gerrit-HasComments: Yes
