Peter Rozsa has posted comments on this change. ( http://gerrit.cloudera.org:8080/24071 )
Change subject: IMPALA-14755:(part 1) Implement Puffin Blob reader and File writer ...................................................................... Patch Set 5: (5 comments) I changed the dataflow to use hashes everywhere when it's possible, the only part that requires real paths is the returning part to the coordinator where we must supply read Puffin file paths. http://gerrit.cloudera.org:8080/#/c/24071/4/be/src/exec/output-partition.h File be/src/exec/output-partition.h: http://gerrit.cloudera.org:8080/#/c/24071/4/be/src/exec/output-partition.h@130 PS4, Line 130: > nit: requires +4 indent Done http://gerrit.cloudera.org:8080/#/c/24071/4/be/src/exec/puffin/puffin-writer.cc File be/src/exec/puffin/puffin-writer.cc: http://gerrit.cloudera.org:8080/#/c/24071/4/be/src/exec/puffin/puffin-writer.cc@60 PS4, Line 60: (p > Should we call Close() just in case? It can cause double closes, I used the pattern from HdfsParquetTableWriter, and HdfsTextTableWriter, I could add a IsClosed check here, but it would break the scheme used in other writers. http://gerrit.cloudera.org:8080/#/c/24071/4/be/src/exec/puffin/puffin-writer.cc@137 PS4, Line 137: > I think we should start this timer after we finished loading the old DVs. Done http://gerrit.cloudera.org:8080/#/c/24071/3/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/24071/3/be/src/util/hash-util.h@338 PS3, Line 338: } else { : temp >>= 1; : } : } : return temp; : } : : t > I think the current version (PS4) is OK, the namespace is annotated. Ack http://gerrit.cloudera.org:8080/#/c/24071/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/24071/3/common/thrift/CatalogObjects.thrift@679 PS3, Line 679: THash1 > I think there're two possiblities: I restored the map to THash128, TIcebergDeletionVector, it's possible to use hashes everywhere and only use paths for the created Puffin files. -- To view, visit http://gerrit.cloudera.org:8080/24071 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I068a071f9db907064ccec8568db5234863eb4587 Gerrit-Change-Number: 24071 Gerrit-PatchSet: 5 Gerrit-Owner: Peter Rozsa <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 17 Mar 2026 15:41:06 +0000 Gerrit-HasComments: Yes
