Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21605 )

Change subject: IMPALA-13247: Support Reading Puffin files for the current 
snapshot
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@471
PS6, Line 471:   IcebergColumn col = getColumnByIcebergF
> Do we cap the NDV stats to the number of rows in the case of HMS statistics
There is a way to keep the testing simple while capping NDV to numRows: in the 
test, we can set the 'numRows' table property explicitly to a large value.

However, I tried setting this table property, then setting NDV manually (as an 
HMS stat), and that is not capped. In this case I think it would be strange to 
cap the Puffin NDV value while not doing the same with the HMS NDV value. 
Probably we should leave it as it is in this patch and maybe open a Jira to cap 
the values in both cases. What do you think?


http://gerrit.cloudera.org:8080/#/c/21605/9/java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java
File 
java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java:

http://gerrit.cloudera.org:8080/#/c/21605/9/java/puffin-data-generator/src/main/java/org/apache/impala/puffindatagenerator/PuffinDataGenerator.java@371
PS9, Line 371:
> There are som other parameters of the PuffinWriter that we don't have cover
I added "created-by" for all files.

Added blob compression for some of the generated files. I also tried footer 
compression but that doesn't work yet, because the spec only allows LZ4 for 
footer compression but the Iceberg lib does not yet support LZ4.



--
To view, visit http://gerrit.cloudera.org:8080/21605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50c1228988960a686d08a9b2942e01e366678866
Gerrit-Change-Number: 21605
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 03 Sep 2024 15:14:55 +0000
Gerrit-HasComments: Yes

Reply via email to