Hi Dmitry, Elias,

You raise a valid point, and thanks for opening https://issues.apache.org/jira/browse/KAFKA-4748, Elias. We'll hopefully have some ideas to share soon.

Eno

> On 9 Feb 2017, at 16:54, Dmitry Minkovsky <dminkov...@gmail.com> wrote:
>
> That makes sense. That's what I was kind of worried about (launching soon).
> Hope someone else posts!
>
> On Wed, 8 Feb 2017 at 16:54, Elias Levy <fearsome.lucid...@gmail.com> wrote:
>
>> It is certainly possible, but when you have dozens of workers, that would
>> take a very long time, especially if you have a lot of state, as partitions
>> get reassigned and state is moved about. In fact, it is likely to fail at
>> some point, as state that can be spread across a multitude of nodes may
>> no longer fit locally as the number of nodes becomes smaller.
>>
>> On Wed, Feb 8, 2017 at 12:34 PM, Dmitry Minkovsky <dminkov...@gmail.com>
>> wrote:
>>
>>> Can you take them down sequentially? Like, say, with a Kubernetes
>>> StatefulSet
>>> <https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#ordered-pod-termination>.
>>>
>>> On Wed, Feb 8, 2017 at 2:15 PM, Elias Levy <fearsome.lucid...@gmail.com>
>>> wrote:
>>>
>>>> What are folks doing to cleanly shut down a Streams job comprised of
>>>> multiple workers?
>>>>
>>>> Right now I am doing sys.addShutdownHook(streams.close()) but that is not
>>>> working well to shut down a fleet of workers. When I signal the fleet to
>>>> shut down by sending them all a SIGTERM, some of them will shut down, but
>>>> some will persist. It appears that there is a race condition between the
>>>> shutdown signal and a rebalancing occurring as a result of other workers
>>>> shutting down. If a worker has not started shutting down before the
>>>> rebalancing starts, the rebalancing will cause the worker to not shut down.
>>>>
>>>> Others seeing the same thing?
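
For anyone reading this thread later: with the hook quoted above, sys.addShutdownHook(streams.close()), the JVM waits for the hook to return, so a worker whose close() blocks will stay up. Below is a minimal sketch of a time-bounded hook, assuming it is acceptable for the process to exit before close() has finished. BoundedShutdown, closeWithin and install are hypothetical names, not part of Kafka Streams, and this does not address the race tracked in KAFKA-4748; it only caps how long a worker waits.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical helper, not a Kafka Streams API.
object BoundedShutdown {

  /** Runs `close` on a separate thread and waits at most `timeoutMs` for it.
    * Returns true if `close` finished in time, false otherwise. */
  def closeWithin(timeoutMs: Long)(close: => Unit): Boolean = {
    val done = new CountDownLatch(1)
    val worker = new Thread(new Runnable {
      def run(): Unit = try close finally done.countDown()
    }, "bounded-close")
    // Daemon thread: a stuck close must not keep the JVM alive on its own.
    worker.setDaemon(true)
    worker.start()
    done.await(timeoutMs, TimeUnit.MILLISECONDS)
  }

  /** Registers a JVM shutdown hook that gives `close` a bounded amount of time. */
  def install(timeoutMs: Long)(close: => Unit): Unit =
    sys.addShutdownHook {
      if (!closeWithin(timeoutMs)(close))
        System.err.println(s"close did not finish within $timeoutMs ms; exiting anyway")
    }
}
```

With the streams instance from the original message, usage would look like BoundedShutdown.install(30000)(streams.close()). The 30 second value is an arbitrary placeholder. Note that exiting before close() completes may leave local state in a condition that needs to be restored on the next start.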