pnowojski commented on code in PR #19993: URL: https://github.com/apache/flink/pull/19993#discussion_r899899066
##########
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateWriteRequest.java:
##########
@@ -109,6 +112,9 @@ static ChannelStateWriteRequest buildFutureWriteRequest(
                     }
                 },
                 throwable -> {
+                    if (!dataFuture.isDone()) {
+                        return;
+                    }

Review Comment:
   > Another question I would ask is why this can even result in a TM crash; shouldn't the waiting be interrupted instead.

   Why should it be interrupted? We only use interrupts to wake up user code or 3rd party libraries. Our own code should be able to shut down cleanly without interruptions. We even explicitly disallow interrupts during `StreamTask` cleanup (`StreamTask#disableInterruptOnCancel`) once the task thread exits from user code, as otherwise this could lead to resource leaks. If we cannot clean up resources, we have to rely on the `TaskCancelerWatchDog`, which will fail over the whole TM after a timeout.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
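
For readers unfamiliar with the policy described in the review comment, here is a minimal sketch of the idea, not Flink's actual implementation. The class `CancellationSketch`, its nested `Canceler`, and the helper `finishedWithin` are hypothetical names invented for illustration; only the method name `disableInterruptOnCancel` is borrowed from the comment. The assumption is that a canceler may interrupt the task thread while it runs user code, must stop doing so once cleanup starts, and that a watchdog escalates only if the thread fails to finish within a timeout.

```java
/** Hypothetical illustration of "interrupt user code, never interrupt cleanup". */
public class CancellationSketch {

    /** Delivers interrupts to the task thread only while they are still allowed. */
    static final class Canceler {
        private final Thread taskThread;
        private boolean interruptAllowed = true;

        Canceler(Thread taskThread) {
            this.taskThread = taskThread;
        }

        /** Called once the task thread has left user code and starts cleanup. */
        synchronized void disableInterruptOnCancel() {
            interruptAllowed = false;
        }

        /** Returns true if an interrupt was actually delivered. */
        synchronized boolean cancel() {
            if (!interruptAllowed) {
                return false; // cleanup in progress: interrupting could leak resources
            }
            taskThread.interrupt();
            return true;
        }
    }

    /**
     * Watchdog-style check: waits up to timeoutMillis for the task thread to end.
     * A caller would escalate (in Flink's case, fail over the TM) on a false result.
     */
    static boolean finishedWithin(Thread taskThread, long timeoutMillis) {
        try {
            taskThread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        }
        return !taskThread.isAlive();
    }

    public static void main(String[] args) {
        // Stand-in for blocking user code that only wakes up when interrupted.
        Thread task = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // woken up by cancellation; fall through and finish
            }
        });
        Canceler canceler = new Canceler(task);
        task.start();

        System.out.println("interrupt delivered: " + canceler.cancel());
        System.out.println("finished in time: " + finishedWithin(task, 5_000));

        canceler.disableInterruptOnCancel();
        System.out.println("interrupt delivered after disable: " + canceler.cancel());
    }
}
```

Both `Canceler` methods are synchronized so that a cancel request cannot slip in between the check and the interrupt; a real implementation may coordinate this differently.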