Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21683 )
Change subject: IMPALA-13304: Include aggregate instance-level metrics within experimental profile(V2) ...................................................................... Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/21683/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21683/2//COMMIT_MSG@80 PS2, Line 80: } : : When no. of instances <= 5 - > Once the JSON's structure is refined after some reviews, I will add the tes Done http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.h@940 PS6, Line 940: typedef std::map<std::string, AggEventSequence> EventSequenceMap; : EventSequenceMap event_sequence_map_; This will duplicate EventSequenceMap typedef in L786. Maybe rename this into AggEventSequenceMap ? Please also put comment what the key is, both here and for typedef in L786. http://gerrit.cloudera.org:8080/#/c/21683/2/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/21683/2/be/src/util/runtime-profile.h@940 PS2, Line 940: typedef std::map<std::string, AggEventSequence> EventSequenceMap; : EventSequenceMap event_sequence_map_; > I had done it for consistency. I will remove it in the next patchset, if no Ack http://gerrit.cloudera.org:8080/#/c/21683/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/21683/2/be/src/util/runtime-profile.cc@2927 PS2, Line 2927: // If number of instances is more than 5, aggregate of event timestamps (i.e. Average, : // Minimum and Maximum) is included after bucketing the instance timestamps into : // 5 divisions each spanning 20%. > To be consistent with the 'ToJson' method for the traditional profile above Ack http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@2952 PS6, Line 2952: vector<vector<vector<int64_t>>> timestamps_vec; : vector<vector<Value>> labels_vec; Please put comment what each index in vector represent. For example, timestamps_vec is a 3-level nested vector. What does each level represent? Maybe like this: // Vector of timestamps addressed as [unique_event_sequence][instances][event_timestamps] vector<vector<vector<int64_t>>> timestamps_vec; // Vector of labels addressed as [unique_event_sequence][event_labels] vector<vector<Value>> labels_vec; http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@2968 PS6, Line 2968: timestamps_vec.emplace_back(event_sequence_contents.timestamps); This does not seem to consult against event_sequence_contents.label_idxs, so you might get different ordering vs labels_vec? http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@2985 PS6, Line 2985: if (timestamps_vec[es_idx].size() > BUCKET_SIZE) { Please put comment right after this line describing what this big branch does. Something like this maybe: "This is a big query profile with many AggEventSequence. Aggregate them into a bucket of 5." nit: might be worth it to contain the branch content into its own function. http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@2988 PS6, Line 2988: int64_t min = timestamps_vec[es_idx][0][event_idx]; : int64_t max = min Please rename to min_ts and max_ts to avoid confusion with min and max function. http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@3021 PS6, Line 3021: if (inst_count_list[division_idx] == 0) { Put comment on why this branch is needed. I assume this is to anticipate for missing events/timestamps? http://gerrit.cloudera.org:8080/#/c/21683/6/be/src/util/runtime-profile.cc@3034 PS6, Line 3034: Value event(kObjectType); : event.AddMember("label", labels_vec[es_idx][event_idx], allocator); Move this below, after L3040. http://gerrit.cloudera.org:8080/#/c/21683/6/tests/observability/test_profile_tool.py File tests/observability/test_profile_tool.py: http://gerrit.cloudera.org:8080/#/c/21683/6/tests/observability/test_profile_tool.py@84 PS6, Line 84: ) > flake8: E501 line too long (91 > 90 characters) You can replace the string with variable to workaround the "line to long" issue. -- To view, visit http://gerrit.cloudera.org:8080/21683 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49e18a7a7e1288e3e674e15b6fc86aad60a08214 Gerrit-Change-Number: 21683 Gerrit-PatchSet: 6 Gerrit-Owner: Surya Hebbar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Surya Hebbar <[email protected]> Gerrit-Comment-Date: Wed, 16 Oct 2024 18:42:58 +0000 Gerrit-HasComments: Yes
