Hi Robert,
Any solution or alternative approach to the above issue would be appreciated,
as going live with new jobs will be unreliable if a task manager goes down.
On Fri, Sep 10, 2021 at 1:17 PM Puneet Duggal wrote:
Hi Robert,
Thanks for taking the time to go through the logs.
Problem:
The reason for restarting all the task managers was to apply an increased JVM
metaspace size to each existing task manager. Currently each TaskManager has
32 slots, but the JVM metaspace size was 256 MB, which used to get fil
Thanks for the log.
From the partial log that you shared with me, my assumption is that some
external resource manager is shutting down your cluster. Multiple
TaskManagers are disconnecting, and finally the job is switching into
failed state.
It seems that you are not stopping only one TaskManager
Hi,
Please find attached the log file for the job that was not restarted on
another task manager after the existing task manager was restarted.
Just FYI: we are using the fixed-delay restart strategy (5 attempts, 10 s delay).
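For reference, a fixed-delay strategy with those values would look roughly
like the fragment below when set cluster-wide in flink-conf.yaml. This is a
sketch, not a copy of our actual configuration: it assumes the
restart-strategy keys as documented for Flink 1.13, and the same strategy can
instead be set per job in code.

```yaml
# Assumed flink-conf.yaml entries (key names as in the Flink 1.13 docs)
restart-strategy: fixed-delay
# Give up and fail the job after 5 restart attempts
restart-strategy.fixed-delay.attempts: 5
# Wait 10 seconds between consecutive restart attempts
restart-strategy.fixed-delay.delay: 10 s
```

With these values, the job fails permanently once the fifth restart attempt
does not succeed.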
On Thu, Sep 9, 2021 at 4:29 PM Robert Metzger wrote:
Hi Puneet,
Can you provide us with the JobManager logs of this incident? Jobs should
not disappear, they should restart on other Task Managers.
On Wed, Sep 8, 2021 at 3:06 PM Puneet Duggal wrote:
Hi,
For the past 2-3 days I have been looking for documentation that explains how
Flink restarts a data streaming job. I know the restart and failover
strategies, but I wanted to know what role the different components
(JobManager, TaskManager, etc.) play while restarting the