The log shows that the JobManager received a SIGTERM signal from an external source, i.e. the shutdown was requested from outside the Flink process rather than triggered by Flink itself. Depending on how you deploy Flink, that could be a 'kill <PID>' command, a Kubernetes pod removal or eviction, etc. You may want to check where the signal came from.
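If it helps, here is a minimal Python sketch (not part of Flink) for pulling the exact time of the signal out of the JobManager log, so you can compare it against the logs of whatever manages the process (restart script, service manager, Kubernetes events, etc.). The pattern is an assumption based only on the single ClusterEntrypoint line quoted in this thread, and the name find_signal_events and the sample text are just for illustration.

```python
import re
from datetime import datetime

# Pattern assumed from the one ClusterEntrypoint line quoted in this thread:
# "<date> <time>,<millis> <LEVEL> <logger> [] - RECEIVED SIGNAL <n>: <NAME>. ..."
SIGNAL_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\S+\s+.*?"
    r"RECEIVED\s+SIGNAL\s+(?P<num>\d+):\s+(?P<name>[A-Z0-9]+)"
)


def find_signal_events(lines):
    """Return (timestamp, signal number, signal name) for each signal line."""
    events = []
    for line in lines:
        match = SIGNAL_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, int(match.group("num")), match.group("name")))
    return events


# Two entries taken from the quoted log, each re-joined onto a single line
# (the mail client had wrapped them).
SAMPLE_LOG = """\
2022-10-11 22:11:12,702 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
2022-10-11 22:11:21,683 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
"""

for ts, num, name in find_signal_events(SAMPLE_LOG.splitlines()):
    print(f"{ts.isoformat()} signal {num} ({name})")
```

To run it against a real log, pass the lines of the file instead of SAMPLE_LOG, e.g. find_signal_events(open(path)) with path pointing at your JobManager log. The timestamp it reports is what you would look for in the other system's logs.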
Best,
Xintong

On Wed, Oct 12, 2022 at 6:26 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote:

> Hi,
>
> I am facing an issue where when restarting task manager after adding some
> configuration changes, even though task manager restarts successfully with
> the updated configuration change, is causing the leader job manager to
> restart as well. Pasting the leader job manager logs here
>
> 2022-10-11 22:11:02,207 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Disassociated]
> 2022-10-11 22:11:02,411 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:02,413 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:02,682 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:02,683 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:12,702 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:12,703 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:21,683 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2022-10-11 22:11:21,687 INFO org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:33887
>
> Regards,
> Puneet