Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21959 )
Change subject: IMPALA-13370: Read Puffin stats from metadata.json property if available
......................................................................


Patch Set 1:

(5 comments)

Thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/21959/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21959/1//COMMIT_MSG@21
PS1, Line 21: Puffin stats file
Please add a comment about the newly added stats file. AFAICS the stats file also contains the correct blob metadata. Can we also have a test where the stats file is complete gibberish?


http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java
File fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java:

http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@83
PS1, Line 83: for (StatisticsFile statsFile : statsFiles) {
            :   loader.loadStatsFromFile(statsFile);
            : }
The logic looks good to me, but I think it could be made much simpler by having two for-loops:

  for (StatisticsFile statsFile : statsFiles) {
    loader.loadStatsFromMetadataBlobs(statsFile);
  }
  for (StatisticsFile statsFile : statsFiles) {
    loader.loadStatsFromFile(statsFile);
  }

In the second loop we only load the stats for field IDs that are not yet in 'result_'. No need for blob partitioning, etc.


http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@154
PS1, Line 154: private static class BlobPartitioning {
Please add a comment about this class and its fields.

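To illustrate the two-loop suggestion from the Line 83 comment, here is a minimal, self-contained sketch. It is not the actual PuffinStatsLoader code: `TwoPassStatsSketch`, `StatsFileSketch`, and the plain `Long` stat values are hypothetical stand-ins, and only `result_`, `loadStatsFromMetadataBlobs` and `loadStatsFromFile` are names taken from the comment. Keeping the first value seen when several files cover the same field ID is also an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-pass loading order; not Impala's real classes.
class TwoPassStatsSketch {
  // Hypothetical stand-in for a statistics file: per-field-ID values as recorded
  // in the metadata.json blob properties, and as stored in the Puffin file itself.
  static final class StatsFileSketch {
    final Map<Integer, Long> metadataValues;
    final Map<Integer, Long> fileValues;

    StatsFileSketch(Map<Integer, Long> metadataValues, Map<Integer, Long> fileValues) {
      this.metadataValues = metadataValues;
      this.fileValues = fileValues;
    }
  }

  // Field ID -> loaded stat value.
  private final Map<Integer, Long> result_ = new LinkedHashMap<>();

  // Pass 1: take whatever is available from the metadata blobs.
  // Assumption: the first value seen for a field ID wins.
  void loadStatsFromMetadataBlobs(StatsFileSketch statsFile) {
    for (Map.Entry<Integer, Long> e : statsFile.metadataValues.entrySet()) {
      result_.putIfAbsent(e.getKey(), e.getValue());
    }
  }

  // Pass 2: read from the file only for field IDs that are not in result_ yet.
  void loadStatsFromFile(StatsFileSketch statsFile) {
    for (Map.Entry<Integer, Long> e : statsFile.fileValues.entrySet()) {
      result_.putIfAbsent(e.getKey(), e.getValue());
    }
  }

  static Map<Integer, Long> load(List<StatsFileSketch> statsFiles) {
    TwoPassStatsSketch loader = new TwoPassStatsSketch();
    for (StatsFileSketch statsFile : statsFiles) {
      loader.loadStatsFromMetadataBlobs(statsFile);
    }
    for (StatsFileSketch statsFile : statsFiles) {
      loader.loadStatsFromFile(statsFile);
    }
    return loader.result_;
  }
}
```

Because every metadata pass finishes before any file pass starts, a value found in metadata always takes precedence over a file value for the same field ID, regardless of which statistics file it came from.
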

http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@168
PS1, Line 168: private BlobPartitioning partitionMetadataBlobs(
Please add a comment.


http://gerrit.cloudera.org:8080/#/c/21959/1/fe/src/main/java/org/apache/impala/catalog/PuffinStatsLoader.java@264
PS1, Line 264: if (blobMetadata.fields().size() != 1) return false;
Could you add a test where there is more than one field?



--
To view, visit http://gerrit.cloudera.org:8080/21959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e92056ce97c4849742db6309562af3b575f647b
Gerrit-Change-Number: 21959
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 19 Nov 2024 16:45:44 +0000
Gerrit-HasComments: Yes
