[ https://issues.apache.org/jira/browse/FLINK-9776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535780#comment-16535780 ]
ASF GitHub Bot commented on FLINK-9776:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6275#discussion_r200814568

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ---
    @@ -1563,7 +1573,7 @@ public void run() {
                    // log stack trace where the executing thread is stuck and
                    // interrupt the running thread periodically while it is still alive
    -               while (executerThread.isAlive()) {
    +               while (task.shouldInterruptOnCancel() && executerThread.isAlive()) {
    --- End diff --

    True, this is not a 100% guarantee that no interrupts arrive. That would need an atomic "interrupt if flag is set" call, but I don't know if that is possible in Java without introducing a locked code block, which I wanted to avoid.

    It may also not be necessary. I think the variant here is already strictly better than the current state, which is correct already. The current state mainly suffers from shutdowns "looking rough" due to interruptions. This change should fix the majority of that, because in the vast majority of shutdowns the thread exits before the first of the "repeated interrupts". The thread only experiences the initial interrupt.

    In some sense, only clearing the initial interrupt flag would probably help in > 90% of the cases already. This change solves a few more percent of the cases by guarding the repeated interrupts.


> Interrupt TaskThread only while in User/Operator code
> -----------------------------------------------------
>
>                 Key: FLINK-9776
>                 URL: https://issues.apache.org/jira/browse/FLINK-9776
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> Upon cancellation, the task thread is periodically interrupted.
> This helps to pull the thread out of blocking operations in the user code.
> Once the thread leaves the user code, the repeated interrupts may interfere
> with the shutdown cleanup logic, causing confusing exceptions.
> We should stop sending the periodic interrupts once the thread leaves the
> user code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
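
For readers following the thread, here is a minimal, self-contained sketch of the guarded interrupt loop discussed in the review comment. It is not the Flink implementation: only `shouldInterruptOnCancel()` and `executerThread` come from the diff, while `GuardedInterruptSketch`, `SketchTask`, `leaveUserCode()`, `interruptWhileInUserCode(...)`, `runDemo()` and the 200 ms interval are hypothetical names and values chosen for illustration. The flag is assumed to be a plain `volatile` boolean, so the check and the interrupt are deliberately not atomic.

```java
// Hypothetical sketch, not the Flink code: only shouldInterruptOnCancel()
// and executerThread are names taken from the diff in this thread.
final class GuardedInterruptSketch {

    /** Stand-in for the task; carries only the "still in user code" flag. */
    static final class SketchTask {
        private volatile boolean interruptOnCancel = true;

        boolean shouldInterruptOnCancel() {
            return interruptOnCancel;
        }

        /** Assumed hook: called by the task thread when it leaves user code. */
        void leaveUserCode() {
            interruptOnCancel = false;
        }
    }

    /** Repeats the interrupt only while the flag is set and the thread is alive. */
    static int interruptWhileInUserCode(SketchTask task, Thread executerThread, long intervalMillis)
            throws InterruptedException {
        int interruptsSent = 0;
        // The flag check and interrupt() are not atomic, so one late interrupt
        // can still slip through; the guard only narrows that window.
        while (task.shouldInterruptOnCancel() && executerThread.isAlive()) {
            executerThread.interrupt();
            interruptsSent++;
            executerThread.join(intervalMillis); // returns early once the thread exits
        }
        return interruptsSent;
    }

    /** Runs one simulated cancellation and returns how many interrupts were sent. */
    static int runDemo() {
        SketchTask task = new SketchTask();
        Thread executerThread = new Thread(() -> {
            try {
                Thread.sleep(60_000); // simulated blocking call in user code
            } catch (InterruptedException e) {
                // pulled out of the blocking call by the interrupt
            }
            task.leaveUserCode();  // from here on, the loop above stops interrupting
            Thread.interrupted();  // clear a pending interrupt before cleanup
        }, "sketch-task-thread");
        executerThread.start();
        try {
            int sent = interruptWhileInUserCode(task, executerThread, 200);
            executerThread.join();
            return sent;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interrupts sent: " + runDemo());
    }
}
```

In this sketch the task thread normally sees a single interrupt, leaves the simulated user code, and exits before a second one is sent. Closing the remaining race would need the flag check and the interrupt to happen under a common lock, which is the trade-off the comment chose to avoid.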