I share an interest in this topic. My k8s cluster recycles hosts, and I am facing the same issue. Flink can tolerate this situation, but I am wondering if I can do better.
On Thu, Jul 11, 2019, 12:39 Aaron Levin <aaronle...@stripe.com> wrote:

> Hello,
>
> Is there a way to gracefully terminate a Task Manager beyond just killing
> it (this seems to be what `./taskmanager.sh stop` does)? Specifically I'm
> interested in a way to replace a Task Manager that has currently-running
> tasks. It would be great if it was possible to terminate a Task Manager
> without restarting the job, though I'm not sure if this is possible.
>
> Context: at my work we regularly cycle our hosts for maintenance and
> security. Each time we do this we stop the task manager running on the host
> being cycled. This causes the entire job to restart, resulting in downtime
> for the job. I'd love to decrease this downtime if at all possible.
>
> Thanks! Any insight is appreciated!
>
> Best,
>
> Aaron Levin