Please find my answers inline.

> Our understanding is that, to stop a job with a savepoint, all the Task
> Managers persist their state during the savepoint. If a Task Manager
> receives a shutdown signal while the savepoint is being taken, does it
> complete the savepoint before shutting down?

[Ans] Why is the Task Manager being shut down suddenly? Are you asking about handling an unpredictable shutdown while a savepoint is being taken? In that case you can also use retained checkpoints. If the current checkpoint has issues because of the shutdown, you will still have the previous checkpoint and can restore from it. That gives you two options, the savepoint or a retained checkpoint, and one of them will always be available.
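As a rough illustration of that fallback, here is a minimal Python sketch that picks the most recent completed snapshot to restart from. It is only a sketch under assumptions: the function name `pick_restore_path` and both directory arguments are hypothetical, the directories are assumed to be reachable as a local or mounted filesystem, and a snapshot is treated as complete only if its directory contains a `_metadata` file (savepoints assumed under `savepoint-*`, retained checkpoints under `<job-id>/chk-*`).

```python
from pathlib import Path
from typing import Optional


def pick_restore_path(savepoint_root: Path, checkpoint_root: Path) -> Optional[Path]:
    """Return the directory of the newest completed snapshot, or None.

    Assumed layout (hypothetical paths, adjust to your setup):
      <savepoint_root>/savepoint-*/_metadata
      <checkpoint_root>/<job-id>/chk-*/_metadata
    A directory without _metadata is treated as incomplete and skipped.
    """
    candidates = []
    # Completed savepoints, e.g. written by "stop with savepoint".
    candidates += savepoint_root.glob("savepoint-*/_metadata")
    # Completed retained checkpoints, one sub-directory per job id.
    candidates += checkpoint_root.glob("*/chk-*/_metadata")
    if not candidates:
        return None
    # Pick the snapshot whose metadata file was written last.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest.parent
```

The returned directory could then be passed to the client on restart, for example as the argument of `flink run -s <path>`. Choosing by modification time rather than always preferring the savepoint is a design choice of this sketch, so that a stale savepoint does not win over a newer retained checkpoint.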
> The Job Manager K8S service is configured as the remote Job Manager
> address in the Task Manager. This service may not be available during the
> savepoint. Will this affect the communication between Task Manager and
> Job Manager during the savepoint?

[Ans] You can go for HA, right? With HA you can run more than one JobManager, so that the service is always available.

On Fri, Mar 13, 2020 at 2:40 PM shravan <mysore.damoda...@microfocus.com> wrote:

> Job Manager and Task Manager are run as separate pods within the K8S
> cluster in our setup. As a job cluster is not used, job jars are not part
> of the Job Manager docker image. The job is submitted from a different
> Flink client pod. Flink is configured with the RocksDB state backend. The
> docker images are created by us, as the base OS image needs to be
> compliant with our organization guidelines.
>
> We are looking for a reliable approach to stop the job with a savepoint
> during graceful shutdown, to avoid duplicates on restart.
> The Job Manager pod traps the shutdown signal and stops all the jobs with
> savepoints. The Flink client pod starts the job from the savepoint on
> restart of the client pod. But as the order in which pods will be shut
> down is not predictable, we have the following queries:
> 1. Our understanding is that, to stop a job with a savepoint, all the
> Task Managers persist their state during the savepoint. If a Task Manager
> receives a shutdown signal while the savepoint is being taken, does it
> complete the savepoint before shutting down?
> 2. The Job Manager K8S service is configured as the remote Job Manager
> address in the Task Manager. This service may not be available during the
> savepoint. Will this affect the communication between Task Manager and
> Job Manager during the savepoint?
>
> Can you provide some pointers on the internals of savepoints in Flink?