Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21959 )
Change subject: IMPALA-13370: Read Puffin stats from metadata.json property if available ...................................................................... Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java: http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@83 PS1, Line 83: for (StatisticsFile statsFile : statsFiles) { : loader.loadStatsFromFile(statsFile); : } > The logic looks good to me but I think it could be made much simpler by hav What if there are two StatisticsFile object in the metadata, with the same stats, and the first one doesn't have NDV in properties, but the second one has? (e.g. two different engines created Puffin files for the same snapshot) IIUC with the original logic we read the statistic from the blob, even though we could retrieve the NDV from the metadata. We could also have a test for this. -- To view, visit http://gerrit.cloudera.org:8080/21959 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5e92056ce97c4849742db6309562af3b575f647b Gerrit-Change-Number: 21959 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 19 Nov 2024 17:00:30 +0000 Gerrit-HasComments: Yes
