Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21605 )

Change subject: IMPALA-13247: Support Reading Puffin files for the current 
snapshot
......................................................................


Patch Set 8: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@471
PS6, Line 471:   IcebergColumn col = getColumnByIcebergF
> I'm a little bit concern if ndv > cardinality. Some planner code make assum
numRows are precise for Iceberg tables without delete files. If the table has 
delete files, it's not trivial to get a precise value as there can be 
duplicated delete records (should be rare in real life usage).


http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21605/6/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@70
PS6, Line 70: }
            :       return result;
            :     } catch (Exception e) {
            :       // We add a catch-all clause here because Puffin stats 
loading shouldn't be an
> This CATCH should normally never be triggered, but I added it in case an un
In loadStatsFromFile() we swallow some exceptions so healthy puffin files can 
still contribute to 'results'. So it feels a bit inconsistent and overly strict 
to me as here we throw away all stats, even stats coming from healthy puffin 
files.
The Iceberg library loves unchecked exceptions, so it's possible we throw away 
stats on some minor error.

Serious errors like OutOfMemoryError will fly through anyway.


http://gerrit.cloudera.org:8080/#/c/21605/8/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21605/8/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@75
PS8, Line 75: "Unexpected error happened while loading Puffin column 
statistics."
Please also log the exception.



--
To view, visit http://gerrit.cloudera.org:8080/21605
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50c1228988960a686d08a9b2942e01e366678866
Gerrit-Change-Number: 21605
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 22 Aug 2024 16:03:00 +0000
Gerrit-HasComments: Yes

Reply via email to