Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22777 )

Change subject: IMPALA-13963: Crash when setting 
'write.parquet.page-size-bytes' to a higher value
......................................................................


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/22777/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22777/2//COMMIT_MSG@12
PS2, Line 12:   create table lineitem_iceberg_comment stored as iceberg as
            :     select l_comment from tpch_parquet.lineitem union all
            :     select l_comment from tpch_parquet.lineitem;
            :
            :   alter table lineitem_iceberg_comment
            :     set tblproperties("write.parquet.page-size-bytes"="6000000");
            :
            :   insert into lineitem_iceberg_comment
            :     select l_comment from tpch_parquet.lineitem union all
            :     select l_comment from tpch_parquet.lineitem;
The reproduction could be simplified:
- the create table doesn't need to insert any data
- a smaller value for write.parquet.page-size-bytes would be enough, e.g. 1MB
- the union may be important in the last insert

Also, mt_dop could be used to force creating fewer files (and thus potentially 
bigger pages).
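
A rough, untested sketch of what the simplified reproduction might look like. 
The string column type and the exact 1MB value (1048576) are assumptions, and 
mt_dop is left out because no specific value is suggested above:

```sql
-- Create the table empty instead of populating it with CTAS.
-- Assumption: l_comment is a string column, as in tpch_parquet.lineitem.
create table lineitem_iceberg_comment (l_comment string) stored as iceberg;

-- Assumption: 1MB (1048576 bytes) in place of the original 6000000.
alter table lineitem_iceberg_comment
  set tblproperties("write.parquet.page-size-bytes"="1048576");

-- The union all is kept, since it may matter for triggering the crash.
insert into lineitem_iceberg_comment
  select l_comment from tpch_parquet.lineitem union all
  select l_comment from tpch_parquet.lineitem;
```

Whether this still triggers the crash would need to be checked against a build 
without the fix.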


http://gerrit.cloudera.org:8080/#/c/22777/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test:

http://gerrit.cloudera.org:8080/#/c/22777/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test@335
PS2, Line 335: ---- QUERY
It is not necessary to do this in the same patch, but generally it would be 
nice to validate whether the other checks actually test what these properties 
do. For example, are these limits actually hit? It would be good to verify 
that both write.parquet.page-size-bytes and write.parquet.dict-size-bytes can 
increase/decrease the actual page size compared to the default.



--
To view, visit http://gerrit.cloudera.org:8080/22777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icb94df8ac3087476ddf1613a1285297f23a54c76
Gerrit-Change-Number: 22777
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 14 Apr 2025 12:21:26 +0000
Gerrit-HasComments: Yes
