Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
Yes, the JobManager will fail over in HA mode and all jobs will be recovered. On Mon, Sep 26, 2022 at 2:06 PM ramkrishna vasudevan < ramvasu.fl...@gmail.com> wrote: > Thanks @Matthias Pohl . This is informative. So > generally in a session cluster if I have more than one job and only one of > t

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread ramkrishna vasudevan
Thanks @Matthias Pohl. This is informative. So generally, in a session cluster, if I have more than one job and only one of them has this issue, will we still face the same problem? Regards Ram On Mon, Sep 26, 2022 at 4:32 PM Matthias Pohl wrote: > I see. Thanks for sharing the logs. It's rela

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
I see. Thanks for sharing the logs. It's related to FLINK-9097 [1]. So that the job is not cleaned up entirely after a failure during job submission, the JobManager is failed fatally, resulting in a failover. That's what you're experiencing. One solution is to fix the permission issue

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread ramkrishna vasudevan
I got some logs and stack traces from our backend storage. This is not the entire log though. Can this be useful? With this set of log messages, the job manager kept restarting. Regards Ram On Mon, Sep 26, 2022 at 3:11 PM ramkrishna vasudevan < ramvasu.fl...@gmail.com> wrote: > Thank you very

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread ramkrishna vasudevan
Thank you very much for the reply. I have lost the k8s cluster in this case before I could capture the logs. I will try to repro this and get back to you. Regards Ram On Mon, Sep 26, 2022 at 12:42 PM Matthias Pohl wrote: > Hi Ramkrishna, > thanks for reaching out to the Flink community. Could y

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
Hi Ramkrishna, thanks for reaching out to the Flink community. Could you share the JobManager logs to get a better understanding of what's going on? I'm wondering why the JobManager is failing when the actual problem is that the job is struggling to access a folder. It sounds like there are multipl

Jobmanager fails to come up if the job has an issue

2022-09-25 Thread ramkrishna vasudevan
Hi all, I have a simple job where we read from a given path in cloud storage to watch for new files in a given folder. While I set up my job there was a permission issue on the folder. The job is a STREAMING job. The cluster is set up in session mode and is running on Kubernetes. The job manager si