Hi John,

Yes, the whole TaskManager exited because the task did not react to the cancelling signal in time:
```
2022-08-30 09:14:22,138 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Task did not exit gracefully within 180 + seconds.
org.apache.flink.util.FlinkRuntimeException: Task did not exit gracefully within 180 + seconds.
    at org.apache.flink.runtime.taskmanager.Task$TaskCancelerWatchDog.run(Task.java:1791) [flink-dist_2.12-1.14.4.jar:1.14.4]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_342]
2022-08-30 09:14:22,139 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - Fatal error occurred while executing the TaskManager. Shutting it down...
```

The stack trace below was logged while the sink task was being cancelled:

```
2022-08-30 09:14:22,135 WARN org.apache.flink.runtime.taskmanager.Task [] - Task 'Sink: jdbc (1/1)#359' did not react to cancelling signal - notifying TM; it is stuck for 180 seconds in method:
 java.net.SocketInputStream.socketRead0(Native Method)
 java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
 java.net.SocketInputStream.read(SocketInputStream.java:171)
 java.net.SocketInputStream.read(SocketInputStream.java:141)
 com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:2023)
 com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6418)
 com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7579)
 com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:592)
 com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:524)
 com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7194)
 com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2979)
 com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:248)
 com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:223)
 com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.execute(SQLServerPreparedStatement.java:505)
 com.xxxxxx.common.flink.connectors.jdbc.xxxxxxJdbcJsonOutputFormat.flush(xxxxxxJdbcJsonOutputFormat.java:111)
 com.xxxxxx.common.flink.connectors.jdbc.xxxxxxJdbcJsonSink.snapshotState(xxxxxxJdbcJsonSink.java:33)
```

Best,
Congxian

On Fri, Sep 23, 2022 at 23:35, John Smith <java.dev....@gmail.com> wrote:

> Sorry new file:
> https://www.dropbox.com/s/mm9521crwvevzgl/flink-flink-taskexecutor-274-flink-prod-v-task-0001.log?dl=0
>
> On Fri, Sep 23, 2022 at 11:26 AM John Smith <java.dev....@gmail.com>
> wrote:
>
>> Hi I have attached the logs here...
>>
>> https://www.dropbox.com/s/12gwlps52lvxdhz/flink-flink-taskexecutor-274-flink-prod-v-task-0001.log?dl=0
>>
>> 1- It looks like a timeout issue. Can someone confirm?
>> 2- The task manager is restarted, since I have restart on failure in
>> SystemD. But it seems after a few restarts it stops. Does it mean that
>> SystemD has an internal counter of how many times it will restart a service
>> before it doesn't do it anymore?