Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/22422 )
Change subject: IMPALA-13703: Cancel running queries before shutdown deadline ...................................................................... Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/runtime/query-exec-mgr.cc@288 PS7, Line 288: bool QueryExecMgr::CancelQueriesForGracefulShutdown() { We could maybe consolidate the logic for `CancelQueriesForGracefulShutdown` and `CancelQueriesForFailedCoordinators` using a common helper function like `CancelQueriesHelper`. We could maybe pass 'is_coordinator_active' as a parameter? http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/runtime/query-exec-mgr.cc@303 PS7, Line 303: query_num_to_cancel = remaining_queue_size; In the graceful shutdown case we're not calling 'ReleaseQueryState'? Is that because these queries might still finish on their own before the shutdown deadline is hit? http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/service/cancellation-work.h File be/src/service/cancellation-work.h: http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/service/cancellation-work.h@32 PS7, Line 32: // The query is being terminated because a backend failed. We can skip cancelling the : // query if the fragment instances running on that backend all completed. : BACKEND_FAILED, : // The Impala Server terminated the query during shutdown and the query has not been : // done yet when the graceful shutdown deadline is reached. : GRACEFUL_SHUTDOWN > Could we have a conflict between these two cancellation flow? In some cases I see that `do_test_shutdown_executor` provides coverage for this scenario. http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/22422/7/be/src/service/impala-server.h@1676 PS7, Line 1676: std::atomic_bool shutdown_deadline_cancel_queries_; I think I missed where we are consuming this? Also, in case cancellation queue gets full before including Cancellation Tasks for all active queries this will be set to false. Is that okay? Maybe if I can see how we consume this that would clarify things. -- To view, visit http://gerrit.cloudera.org:8080/22422 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cac2e100d329644e21fdceb0b23901b08079130 Gerrit-Change-Number: 22422 Gerrit-PatchSet: 7 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Tue, 04 Feb 2025 21:48:05 +0000 Gerrit-HasComments: Yes
