Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21597 )
Change subject: IMPALA-13246: Smallify strings during broadcast exchange ...................................................................... Patch Set 4: (5 comments) Thanks for working on this! http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG@14 PS4, Line 14: it turned out that this is not true in : Iceberg's delete's directed shuffle Could you please add more details for this? In DIRECTED mode I wouldn't expect any smallifications as paths are usally larger than 11 characters. http://gerrit.cloudera.org:8080/#/c/21597/4//COMMIT_MSG@17 PS4, Line 17: Measured >1% improvement in TPCH(42). Above you mentioned that removing Tuple::SmallifyStrings() makes TPC-H slower. Do you mean that removing Tuple::SmallifyStrings() is a bit slower, but still this change is faster in overall? Please add clarification. http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h File be/src/runtime/tuple.h: http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h@123 PS4, Line 123: =false IMO the default parameter makes the code a bit unsafe, as it is too easy to write TotalByteSize(tuple_desc) without thinking about what should be the value for 'assume_smallify'. Also, TotalByteSize() doesn't seem to be called in so many places that would justify the default parameter. http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.h@140 PS4, Line 140: nit: redundant space http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.cc File be/src/runtime/tuple.cc: http://gerrit.cloudera.org:8080/#/c/21597/4/be/src/runtime/tuple.cc@68 PS4, Line 68: string_val->Len() <= SmallableString::SMALL_LIMIT nit: you could add a method 'PossibleToSmallify()' or 'FitToSmall()' to StringValue -- To view, visit http://gerrit.cloudera.org:8080/21597 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I281d77c7a241ebfe8716eb5c975f0660601aec1b Gerrit-Change-Number: 21597 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 12 Aug 2024 17:17:57 +0000 Gerrit-HasComments: Yes
