Re: Flink app cannot restart

2020-07-24 Thread Robert Metzger
Hi Rainie, I believe we need the full JobManager log to understand what's going on with your job. The logs you've provided so far only tell us that a TaskManager has died (which is expected when a node goes down). What is interesting to see is what's happening next: are we having enough resources

Re: Flink app cannot restart

2020-07-23 Thread Rainie Li
Thank you Yang, I checked, and "yarn.application-attempts" is already set to 10. Here is the exception part of the JobManager log. The full log file is too big; I also redacted it to remove some company-specific info. Any suggestions on this exception would be appreciated! 2020-07-15 20:04:52,265 INFO org.

Re: Flink app cannot restart

2020-07-22 Thread Yang Wang
Could you check whether the JobManager is also running on the lost Yarn NodeManager? If that is the case, you need to configure "yarn.application-attempts" to a value greater than 1. BTW, the logs you provided are not Yarn NodeManager logs. And if you could provide the full jobmanager log,