Hi Xintong Song,

Thanks for the quick reply. Yes, I restart the task manager by killing its 
process and then doing a flink restart, because I have seen cases where a 
plain flink restart does not pick up the latest configuration. What confuses 
me is why killing the task manager process and then restarting it causes the 
job manager to stop and restart as well.
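
To see how long after the task manager went away the job manager received 
the SIGTERM, the log can be scanned for the two events. The following is only 
a minimal sketch: it assumes the unwrapped log file (one entry per line) and 
the timestamp format seen in the log quoted below. The function names and the 
abbreviated SAMPLE_LOG lines are illustrative, not part of Flink.

```python
from datetime import datetime

# Timestamp layout assumed from the quoted log, e.g. "2022-10-11 22:11:02,207".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Abbreviated, unwrapped copies of lines from the quoted job manager log.
SAMPLE_LOG = [
    "2022-10-11 22:11:02,207 WARN  akka.remote.ReliableDeliverySupervisor [] - "
    "Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, "
    "address is now gated for [50] ms. Reason: [Disassociated]",
    "2022-10-11 22:11:12,702 WARN  akka.remote.transport.netty.NettyTransport [] - "
    "Remote connection to [null] failed with java.net.ConnectException: "
    "Connection refused: /<TM-IP>:35376",
    "2022-10-11 22:11:21,683 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - "
    "RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.",
]


def parse_ts(line):
    """Return the timestamp at the start of a log line, or None if absent."""
    try:
        return datetime.strptime(line[:23], TS_FORMAT)
    except ValueError:
        return None


def seconds_from_first_failure_to_sigterm(lines):
    """Seconds between the first 'has failed' entry and the SIGTERM entry.

    Returns None if either event is missing from the given lines.
    """
    first_failure = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # continuation line or unrelated output
        if first_failure is None and "has failed" in line:
            first_failure = ts
        if "RECEIVED SIGNAL 15: SIGTERM" in line:
            if first_failure is None:
                return None
            return (ts - first_failure).total_seconds()
    return None


if __name__ == "__main__":
    print(seconds_from_first_failure_to_sigterm(SAMPLE_LOG))
```

On the log quoted below, the gap comes to roughly 19.5 seconds, which may help 
in matching the SIGTERM against whatever the restart procedure was doing at 
that moment.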

Regards,
Puneet

> On 12-Oct-2022, at 7:33 AM, Xintong Song <tonysong...@gmail.com> wrote:
> 
> The log shows that the jobmanager received a SIGTERM signal from an external 
> source. Depending on how you deploy Flink, that could be a 'kill <PID>' 
> command, a Kubernetes pod removal or eviction, etc. You may want to check 
> where the signal came from.
> 
> Best,
> Xintong
> 
> 
> On Wed, Oct 12, 2022 at 6:26 AM Puneet Duggal <puneetduggal1...@gmail.com> wrote:
> Hi,
> 
> I am facing an issue where restarting a task manager after making some 
> configuration changes causes the leader job manager to restart as well, 
> even though the task manager itself restarts successfully with the updated 
> configuration. The leader job manager logs are pasted below:
> 
> 
> 2022-10-11 22:11:02,207 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Disassociated]
> 2022-10-11 22:11:02,411 WARN  akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:02,413 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:02,682 WARN  akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:02,683 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:12,702 WARN  akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /<TM-IP>:35376
> 2022-10-11 22:11:12,703 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@<TM-IP>:35376] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@<TM-IP>:35376]] Caused by: [java.net.ConnectException: Connection refused: /<TM-IP>:35376]
> 2022-10-11 22:11:21,683 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2022-10-11 22:11:21,687 INFO  org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:33887
> 
> 
> Regards,
> Puneet
> 
> 
