Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/22777 )

Change subject: IMPALA-13963: Crash when setting 'write.parquet.page-size-bytes' to a higher value
......................................................................


Patch Set 6:

I compared the profiles of the failing insert before and after the change.
Before:
Operator          #Hosts  #Inst  Avg Time  Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------------------------
F00:HDFS WRITER        1      1  36.372ms  36.372ms                      1.65 MB      412.65 KB
00:SCAN HDFS           1      1   4.908ms   4.908ms  7.30K       6.12K   5.45 MB      464.00 MB  functional.alltypes

After:
Operator          #Hosts  #Inst  Avg Time  Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------------------------
F00:HDFS WRITER        1      1  49.677ms  49.677ms                      1.13 GB      412.65 KB
00:SCAN HDFS           1      1   4.551ms   4.551ms  7.30K       6.12K   5.38 MB      464.00 MB  functional.alltypes

Before the change we did not allocate a buffer large enough for the requested 
page size; after the change we do. That is why the writer's peak memory goes 
from 1.65 MB to 1.13 GB.

We already underestimated the writer's memory usage by roughly a factor of 4 
before this change (412.65 KB estimated vs. 1.65 MB peak). The underlying 
problem is that the buffer size does not affect the estimate, so the estimate 
is identical in both profiles.
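
To put numbers on the gap, the ratios of peak to estimated memory can be 
computed from the two profiles above. This is only a quick sketch: the 
to_bytes helper is hypothetical, and it assumes the profile prints binary 
units (1 KB = 1024 bytes).

```python
# Unit multipliers. Assumption: the profile uses binary units.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(text):
    """Convert a profile value such as '412.65 KB' to a byte count."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


# Values copied from the HDFS WRITER rows of the two profiles.
estimate = to_bytes("412.65 KB")    # Est. Peak Mem, same before and after
peak_before = to_bytes("1.65 MB")   # Peak Mem before the change
peak_after = to_bytes("1.13 GB")    # Peak Mem after the change

ratio_before = peak_before / estimate
ratio_after = peak_after / estimate

print(f"before: peak is {ratio_before:.1f}x the estimate")
print(f"after:  peak is {ratio_after:.0f}x the estimate")
```

Under that assumption the peak is about 4x the estimate before the change and 
close to 2,900x after it.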


--
To view, visit http://gerrit.cloudera.org:8080/22777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icb94df8ac3087476ddf1613a1285297f23a54c76
Gerrit-Change-Number: 22777
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 22 Apr 2025 13:54:39 +0000
Gerrit-HasComments: No
