[ https://issues.apache.org/jira/browse/KAFKA-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
A. Sophie Blee-Goldman resolved KAFKA-4748.
-------------------------------------------
    Resolution: Fixed

> Need a way to shutdown all workers in a Streams application at the same time
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-4748
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4748
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1
>            Reporter: Elias Levy
>            Assignee: Walker Carlson
>            Priority: Major
>             Fix For: 2.8.0
>
>
> If you have a fleet of Streams workers for an application and attempt to
> shut them down simultaneously (e.g. via SIGTERM and
> Runtime.getRuntime().addShutdownHook() and streams.close()), a large number
> of the workers fail to shut down.
> The problem appears to be a race condition between the shutdown signal and
> the consumer rebalancing that is triggered by some of the workers exiting
> before others. Workers that receive the signal later apparently fail to
> exit because they are caught in the rebalance.
> Terminating workers in a rolling fashion is not advisable in some situations.
> The rolling shutdown will result in many unnecessary rebalances and may
> fail, as the application may have a large amount of local state that a
> smaller number of nodes may not be able to store.
> It would appear that there is a need for a protocol change to allow the
> coordinator to signal a consumer group to shut down without leading to
> rebalancing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)