I know it sounds silly, but did you check that your test setup works when you don't change the clock?
This pattern can happen when two consumers somehow block each other (for example, one thread with two consumers): one waits for the other to join, but the other is blocked, so the first times out; then the second is unblocked and manages to join, but now the first is blocked, and so on...

Gwen

On Wed, Aug 10, 2016 at 10:29 AM, Gabriel Ibarra <gabriel.iba...@tallertechnologies.com> wrote:
> Hello guys, I am dealing with an issue when I turn the system clock back
> (either due to NTP or administrator action). I'm using kafka_2.11-0.10.0.0
>
> I follow these steps:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will be
>   the owner of all the partitions.
> - Turn the system clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; it will
>   force a rebalance.
>
> After these actions the Kafka server constantly logs the messages below,
> and after a while both consumers stop receiving packages. I saw that this
> condition lasts at least as long as the amount the clock went back (1 hour
> in this example), and after that time Kafka comes back to work.
>
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
>
> IMHO, Kafka's consumers should keep working after any change of the
> system clock, but maybe this behavior has a rationale that I don't know.
>
> I'm sorry if this was discussed previously; I researched but didn't
> find a similar issue.
>
> Thanks,
>
> --
> Gabriel Alejandro Ibarra
> Software Engineer
> San Lorenzo 47, 3rd Floor, Office 5
> Córdoba, Argentina
> Phone: +54 351 4217888

--
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
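
The report above says the stall lasts at least as long as the clock rollback. A minimal sketch of one possible explanation, assuming (nothing in this thread confirms it) that some timeout in the path is armed against the wall clock rather than a monotonic clock. `FakeClock`, `seconds_until_expiry`, and the 30-second timeout are hypothetical names and values for illustration, not Kafka APIs or defaults.

```python
class FakeClock:
    """Hypothetical test clock: wall time can be set back, monotonic time cannot."""

    def __init__(self, start=1_000.0):
        self.wall = start  # what a wall-clock read would return
        self.mono = 0.0    # what a monotonic read would return

    def advance(self, seconds):
        # Normal passage of time moves both clocks forward.
        self.wall += seconds
        self.mono += seconds

    def set_back(self, seconds):
        # Simulates NTP or an administrator turning the system clock back.
        # Only the wall clock is affected.
        self.wall -= seconds


def seconds_until_expiry(now, deadline):
    """Time left before a deadline fires, measured on the clock it was armed with."""
    return max(0.0, deadline - now)


clock = FakeClock()
timeout = 30.0  # hypothetical timeout in seconds

# Deadlines are fixed at the moment the timer is armed.
wall_deadline = clock.wall + timeout
mono_deadline = clock.mono + timeout

clock.advance(10.0)     # 10 s pass normally
clock.set_back(3600.0)  # clock is turned back by 1 hour

wall_remaining = seconds_until_expiry(clock.wall, wall_deadline)
mono_remaining = seconds_until_expiry(clock.mono, mono_deadline)

print(wall_remaining)  # 3620.0: the timer is stretched by the rollback
print(mono_remaining)  # 20.0: the timer is unaffected
```

In real Python code the two readings would come from `time.time()` and `time.monotonic()`. Whether the 0.10.0.0 group coordinator or consumer actually arms its timers this way would need to be checked against the source.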