Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21803 )
Change subject: IMPALA-915: Support cancel queries in frontend ...................................................................... Patch Set 45: (7 comments) http://gerrit.cloudera.org:8080/#/c/21803/43//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21803/43//COMMIT_MSG@19 PS43, Line 19: : > Is this still relevant? I don't see changes about this in the patch? It was moved to the following patch, fixed. http://gerrit.cloudera.org:8080/#/c/21803/43//COMMIT_MSG@52 PS43, Line 52: - update > Please also run tests/query_test/test_cancellation.py exhaustively. Done. The changes to HS2 vs Beeswax server were just to remove a function argument, so I don't expect differences there. http://gerrit.cloudera.org:8080/#/c/21803/43/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21803/43/be/src/service/client-request-state.cc@290 PS43, Line 290: // Don't start executing the query if Cancel() was called between planning and Exec(). : RETURN_IF_CANCELLED(this); > This and other places where RETURN_IF_CANCELLED newly added will obtain loc This one should be good, a number of the following functions acquire lock_ and/or expiration_data_lock_. http://gerrit.cloudera.org:8080/#/c/21803/43/be/src/service/client-request-state.cc@625 PS43, Line 625: RETURN_IF_CANCELLED(this); Calls UpdateNonErrorExecState or FinishExecQueryOrDmlRequest right after, which acquire lock_. http://gerrit.cloudera.org:8080/#/c/21803/43/be/src/service/client-request-state.cc@1193 PS43, Line 1193: RETURN_IF_CANCELLED(this); This is called in the same context as Exec(), so same argument applies. http://gerrit.cloudera.org:8080/#/c/21803/43/be/src/service/client-request-state.cc@1298 PS43, Line 1298: RETURN_IF_CANCELLED(this); lock_ was acquired earlier in this function too. http://gerrit.cloudera.org:8080/#/c/21803/43/fe/src/main/java/org/apache/impala/service/Canceller.java File fe/src/main/java/org/apache/impala/service/Canceller.java: http://gerrit.cloudera.org:8080/#/c/21803/43/fe/src/main/java/org/apache/impala/service/Canceller.java@44 PS43, Line 44: // Registers threads by their query ID so the threads can be interrupted on : // cancellation. : private static ConcurrentHashMap<TUniqueId, Thread> : queryThreads_ = new ConcurrentHashMap<>(); : // Registers that the specified thread has been cancelled and should clean up. : private static Set<Thread> cancelledThreads_ = ConcurrentHashMap.newKeySet(); > This make sense. But I wonder if they should just be wrapped into a class w I'm having trouble visualizing that with the need to also be able to tell when a thread is cancelled without knowing the query ID. We could have a map of queryId -> ThreadContext, and do unregister/cancel via ThreadContext.unregister/cancel (with synchronization on the ThreadContext object). But to identify when the current thread is cancelled, we need to maintain a second collection of cancelled threads, and as soon as we do that I think we bring back many of the same problems. -- To view, visit http://gerrit.cloudera.org:8080/21803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d25d4c7fb0b8dcc7dad9510db1e8dca220eeb86 Gerrit-Change-Number: 21803 Gerrit-PatchSet: 45 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Fri, 06 Jun 2025 19:04:39 +0000 Gerrit-HasComments: Yes
