Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21605 )

Change subject: IMPALA-13247: Support Reading Puffin files for the current 
snapshot
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@471
PS6, Line 471:   IcebergColumn col = getColumnByIcebergF
> - ndv > getTTableStats().getNum_rows(): getTTableStats().getNum_rows() is a
I'm a little bit concern if ndv > cardinality. Some planner code make 
assumption about ndv vs cardinality like here:

https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L711-L722

https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java#L106-L110

For regular HMS table, TTableStats.num_rows is coming from table properties 
numRows from HMS, which populated by COMPUTE STATS or sometimes overridden by 
user themself. For Iceberg table, is it following the metadata files or HMS 
table property? I'm guessing it follows metadata files?
https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java#L383


http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@58
PS6, Line 58: this.blob = blo
> It comes from input outside of Impala, so it could be negative (OTOH I don'
Done



--
To view, visit http://gerrit.cloudera.org:8080/21605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50c1228988960a686d08a9b2942e01e366678866
Gerrit-Change-Number: 21605
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 22 Aug 2024 15:07:07 +0000
Gerrit-HasComments: Yes

Reply via email to