[ https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gwen Shapira updated KAFKA-4051:
--------------------------------
       Resolution: Fixed
    Fix Version/s: 0.10.1.0
           Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1768
[https://github.com/apache/kafka/pull/1768]

> Strange behavior during rebalance when turning the OS clock back
> ----------------------------------------------------------------
>
>                 Key: KAFKA-4051
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4051
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.0.0
>         Environment: OS: Ubuntu 14.04 - 64bits
>            Reporter: Gabriel Ibarra
>            Assignee: Rajini Sivaram
>             Fix For: 0.10.1.0
>
>
> If a rebalance is performed after the OS clock has been turned back, the
> Kafka server enters a loop and the rebalance cannot complete until the
> system clock reaches the previous date/time again.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own
> all the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id. This forces
> a rebalance.
> After these actions the Kafka server logs constantly display the messages
> below, and after a while both consumers stop receiving messages. This
> condition lasts at least as long as the interval by which the clock was
> turned back (1 hour in this example), after which Kafka resumes working.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems have NTP enabled, or let an administrator change the
> system clock (date/time), it is important that such a change can be made
> safely. Currently the only safe way to do it is to follow these steps:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server again.
> However, this approach only works when the change is made by the
> administrator, not when it is made by NTP. Also, in many systems shutting
> down the Kafka server might cause INFORMATION TO BE LOST.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
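The report notes that the stall lasts at least as long as the interval by which the clock was turned back. One way such a symptom can arise in general is a timeout computed from wall-clock time. The sketch below illustrates only that general effect; it is not taken from the Kafka code base, and the message above does not describe what pull request 1768 actually changed. The class and method names (`DeadlineSketch`, `Deadline`, `simulate`) are hypothetical, and both clocks are simulated counters so the result is deterministic.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration; none of these names come from Kafka.
class DeadlineSketch {

    /** A timeout tracked against whichever millisecond clock is supplied. */
    static final class Deadline {
        private final LongSupplier clockMs;
        private final long expiresAtMs;

        Deadline(LongSupplier clockMs, long timeoutMs) {
            this.clockMs = clockMs;
            this.expiresAtMs = clockMs.getAsLong() + timeoutMs;
        }

        /** Milliseconds left according to the supplied clock, never negative. */
        long remainingMs() {
            return Math.max(0L, expiresAtMs - clockMs.getAsLong());
        }
    }

    /** Returns {remaining on wall clock, remaining on monotonic clock}. */
    static long[] simulate() {
        // Simulated clocks. In real code the wall clock would be
        // System.currentTimeMillis() and the monotonic clock would be
        // derived from System.nanoTime(), which is not tied to wall-clock time.
        long[] wallMs = {10_000_000L};
        long[] monoMs = {5_000L};

        // Both deadlines start with the same 30-second timeout.
        Deadline onWall = new Deadline(() -> wallMs[0], 30_000L);
        Deadline onMono = new Deadline(() -> monoMs[0], 30_000L);

        // Ten seconds of real time pass, and the OS clock is turned back 1 hour.
        monoMs[0] += 10_000L;
        wallMs[0] += 10_000L - 3_600_000L;

        return new long[] {onWall.remainingMs(), onMono.remainingMs()};
    }

    public static void main(String[] args) {
        long[] r = simulate();
        System.out.println("wall-clock deadline, ms remaining: " + r[0]);
        System.out.println("monotonic deadline, ms remaining:  " + r[1]);
    }
}
```

In this simulation the wall-clock deadline reports 3,620,000 ms remaining (the expected 20 seconds plus the one-hour jump), while the monotonic deadline reports 20,000 ms. That matches the shape of the reported symptom, though whether it matches the actual cause is an assumption.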