Hello, I launched a job with a larger load on a Hadoop YARN cluster. The job finished after running for 5 hours, and I didn't find any errors in the JobManager log besides this connection exception:
2021-02-20 13:20:14,110 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [/10.1.57.146:48368] failed with java.io.IOException: Connection reset by peer
2021-02-20 13:20:14,110 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@host:35241] has failed, address is now gated for [50] ms. Reason: [Disassociated]
2021-02-20 13:20:14,110 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@host:39493] has failed, address is now gated for [50] ms. Reason: [Disassociated]
2021-02-20 13:20:14,110 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@host:38481] has failed, address is now gated for [50] ms. Reason: [Disassociated]

Any idea what caused the job to finish and how to resolve it? Any suggestions are appreciated.

Thanks
Best regards
Rainie