Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24123 )

Change subject: IMPALA-14796: Show effective runtime filter targets in profile
......................................................................


Patch Set 4:

(6 comments)

Thank you! This is a great addition to the query profile.

I've left a few inline comments for your consideration, primarily focusing on 
state capture, exception safety during the traversal, and a minor suggestion 
regarding test placement.

Thank you for the changes again!

http://gerrit.cloudera.org:8080/#/c/24123/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/24123/4//COMMIT_MSG@25
PS4, Line 25: BTW,
Consider rewording the "BTW..." sentence in the commit message to be slightly 
more formal. Maybe something like, "Additionally, removed temp_object_pool and 
temp_mem_tracker from FilterDebugString() as they have been unused since commit 
a985e11."


http://gerrit.cloudera.org:8080/#/c/24123/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/24123/4/be/src/runtime/coordinator.cc@684
PS4, Line 684: row.push_back(effective_target_ids.empty() ? "N" : 
join(effective_target_ids, ", "));
nit: Using "N/A" or "-" instead of "N" for empty effective targets maybe 
better, as this seems to match standard table formatting conventions. The 
Pending and Completed columns already use "N/A", so this might keep the visual 
language a bit more consistent.


http://gerrit.cloudera.org:8080/#/c/24123/4/be/src/runtime/coordinator.cc@794
PS4, Line 794: [&current_node_id, &find_effective_filters, this]
The `current_node_id` is captured by reference here.

During the recursive traversal of the profile tree, if a child updates 
current_node_id, that modified value will bleed over into the processing of 
subsequent siblings of the parent node.

Please consider passing current_node_id as a value argument to the lambda 
find_effective_filters(RuntimeProfileBase* profile, TPlanNodeId 
current_node_id) to ensure the ID is isolated and correct for each branch of 
the tree.


http://gerrit.cloudera.org:8080/#/c/24123/4/be/src/runtime/coordinator.cc@797
PS4, Line 797: split(fields, profile_name, is_any_of(" "));
nit: Using boost::split to populate a vector<string> allocates memory for both 
the vector and the individual strings for every single node in the profile tree.

Since we only need to check the prefix/suffix and extract an ID, consider using 
profile_name.find(" ") and StringPiece to locate the components without heap 
allocations.


http://gerrit.cloudera.org:8080/#/c/24123/4/be/src/runtime/coordinator.cc@815
PS4, Line 815: std::stoi
std::stoi throws exceptions (std::invalid_argument, std::out_of_range) if 
parsing fails.

If a profile string is malformed or unexpectedly changes formatting in the 
future, this could crash the coordinator daemon.

Consider using StringParser::StringToInt<int32_t> (from util/string-parser.h) 
or a similar non-throwing alternative to safely parse these IDs and simply skip 
or log on failure.


http://gerrit.cloudera.org:8080/#/c/24123/4/tests/custom_cluster/test_runtime_profile.py
File tests/custom_cluster/test_runtime_profile.py:

http://gerrit.cloudera.org:8080/#/c/24123/4/tests/custom_cluster/test_runtime_profile.py@54
PS4, Line 54: test_tpch_q5_final_filter_table
The test_runtime_profile.py seems to be exclusively to verify the runtime 
profile generation itself and not to verify certain parts of the profile.

I believe running it with gen_experimental_profile argument in the 
test_runtime_filter.py itself is the correct way.

In tests such as test_query_log.py, separate tests have been written to test 
profile V2, instead of putting it here.

https://github.com/apache/impala/blob/772ebd2273a5cae89df2099b51d7e56ed464c31d/tests/custom_cluster/test_query_log.py#L608-L612

Please consider adding the test in test_runtime_filter.py in a similar way as 
above.

Also, I believe similar to test_runtime_filter.py's test, we only need to run 
it on parquet.



--
To view, visit http://gerrit.cloudera.org:8080/24123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccf4b87ac4579a70273f3306ec7b58850f06b17c
Gerrit-Change-Number: 24123
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Fri, 27 Mar 2026 20:07:20 +0000
Gerrit-HasComments: Yes

Reply via email to