Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21605 )
Change subject: IMPALA-13247: Support Reading Puffin files for the current snapshot ...................................................................... Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@471 PS6, Line 471: IcebergColumn col = getColumnByIcebergF > - ndv > getTTableStats().getNum_rows(): getTTableStats().getNum_rows() is a I'm a little bit concern if ndv > cardinality. Some planner code make assumption about ndv vs cardinality like here: https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L711-L722 https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java#L106-L110 For regular HMS table, TTableStats.num_rows is coming from table properties numRows from HMS, which populated by COMPUTE STATS or sometimes overridden by user themself. For Iceberg table, is it following the metadata files or HMS table property? I'm guessing it follows metadata files? https://github.com/apache/impala/blob/07d44b7affa0652f14d8044cd7ffa604f250b77b/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java#L383 http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java: http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@58 PS6, Line 58: this.blob = blo > It comes from input outside of Impala, so it could be negative (OTOH I don' Done -- To view, visit http://gerrit.cloudera.org:8080/21605 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50c1228988960a686d08a9b2942e01e366678866 Gerrit-Change-Number: 21605 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 22 Aug 2024 15:07:07 +0000 Gerrit-HasComments: Yes
