Hi Xiangyu,

There can be several reasons for the "Job Leader... lost leadership" problem. Do you see the errors in the TM log? If so, the root cause might be that the connection between the TM and ZK was lost or timed out. Have you checked the GC status on the TM side? If GC looks fine, could you provide a more detailed exception stack trace?

Best,
Yun

------------------ Original Mail ------------------
Sender: Xiangyu Su <xian...@smaato.com>
Send Date: Wed Sep 1 15:31:03 2021
Recipients: user <user@flink.apache.org>
Subject: FLINK-14316 happens on version 1.13.2

Hello Everyone,

We upgraded Flink to 1.13.2 and are randomly hitting the "Job leader ... lost leadership" error; the job keeps restarting and failing. It behaves like this ticket:
https://issues.apache.org/jira/browse/FLINK-14316

Has anybody had the same issue, or does anyone have suggestions?

Best Regards,

--
Xiangyu Su
Java Developer
xian...@smaato.com