Re: Recovery problem 1 of 2 in Flink 1.6.3

2019-01-15 Thread Till Rohrmann

Hi John, this is definitely not how Flink should behave in this situation, and it could indicate a bug. I couldn't identify the problem from the logs you sent. Would it be possible to get the full logs for the TMs and the JM at DEBUG log level? That would help me debug the problem further. Cheers, Till

Re: Recovery problem 1 of 2 in Flink 1.6.3

2019-01-14 Thread John Stone

Is this a known issue? Should I create a Jira ticket? Does anyone have anything they would like me to try? I’m very lost at this point. I’ve now seen this issue happen without destroying pods, i.e. the running job crashes after several hours and fails to recover once all task slots are consu