[
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427840#comment-15427840
]
Rajini Sivaram commented on KAFKA-4051:
---------------------------------------
[~ijuma] [~gwenshap] Thank you both for the feedback. I will try it out
locally, and run performance tests. I can test on Linux and Mac.
> Strange behavior during rebalance when turning the OS clock back
> ----------------------------------------------------------------
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
> Reporter: Gabriel Ibarra
> Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka
> server enters a loop and the rebalance cannot complete until the system
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will be the
> owner of all the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id. This will
> force a rebalance.
> After these actions the Kafka server logs constantly display the messages
> below, and after a while both consumers stop receiving messages. This
> condition lasts at least as long as the interval the clock was turned back
> (1 hour in this example), after which Kafka resumes working.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems have NTP enabled, or allow an administrator to change
> the system clock (date/time), it is important to be able to do this safely.
> Currently the only safe way to do it is the following:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server again.
> But this approach is possible only if the change is made by the
> administrator, not by NTP. Also, in many systems shutting down the Kafka
> server might cause INFORMATION TO BE LOST.
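The report says the stall lasts at least as long as the clock was turned
back. That would be consistent with timeouts being computed from wall-clock
time, although the thread does not state the root cause, so treat that as an
assumption. The sketch below is not Kafka code: Deadline, FakeClock and
ClockRollbackSketch are hypothetical names used only to illustrate how a
deadline taken from a wall clock stretches when the clock moves backwards,
while one taken from a monotonic source does not.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration only; none of these types exist in Kafka.
final class ClockRollbackSketch {

    /** A deadline fixed at creation time, measured against the supplied clock. */
    static final class Deadline {
        private final LongSupplier clockMs;
        private final long expiresAtMs;

        Deadline(LongSupplier clockMs, long timeoutMs) {
            this.clockMs = clockMs;
            this.expiresAtMs = clockMs.getAsLong() + timeoutMs;
        }

        /** Milliseconds left before expiry, never negative. */
        long remainingMs() {
            return Math.max(0L, expiresAtMs - clockMs.getAsLong());
        }
    }

    /** Manually driven clock so the effect can be shown without touching the OS. */
    static final class FakeClock implements LongSupplier {
        private long nowMs;

        FakeClock(long startMs) {
            this.nowMs = startMs;
        }

        /** A negative delta models the clock being turned back. */
        void advance(long deltaMs) {
            nowMs += deltaMs;
        }

        @Override
        public long getAsLong() {
            return nowMs;
        }
    }

    public static void main(String[] args) {
        // "wall" stands in for System.currentTimeMillis(), which can jump.
        // "mono" stands in for a monotonic source such as System.nanoTime().
        FakeClock wall = new FakeClock(10_000_000L);
        FakeClock mono = new FakeClock(0L);

        Deadline wallDeadline = new Deadline(wall, 30_000L);
        Deadline monoDeadline = new Deadline(mono, 30_000L);

        // One second of real time passes on both clocks.
        wall.advance(1_000L);
        mono.advance(1_000L);

        // Only the wall clock is turned back by one hour.
        wall.advance(-3_600_000L);

        System.out.println("wall-clock remaining ms: " + wallDeadline.remainingMs());
        System.out.println("monotonic remaining ms:  " + monoDeadline.remainingMs());
    }
}
```

In this sketch the wall-clock deadline ends up roughly an hour further away
than intended, while the monotonic one still has 29 seconds left. Reading a
monotonic clock may cost more than reading the wall clock on some platforms,
so any change along these lines would need to be measured.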
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)