Thanks Ismael, I agree with you. It seems to be a problem related to absolute timers.
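To make the point about absolute timers concrete, here is a minimal sketch. It is not Kafka's code: the class ClockStepSketch and its helpers are hypothetical names used only for illustration, and the clock readings are simulated values. It only shows why a timeout measured with wall-clock readings (as System.currentTimeMillis returns) stops expiring when the clock is stepped back, while the same check on a monotonic source (as System.nanoTime provides) is unaffected.

```java
import java.util.concurrent.TimeUnit;

public class ClockStepSketch {

    // Hypothetical helper: true once timeoutMs has elapsed between the two readings.
    public static boolean isExpired(long startMs, long nowMs, long timeoutMs) {
        return nowMs - startMs >= timeoutMs;
    }

    // Elapsed time from a monotonic source; System.nanoTime() is not tied to
    // the wall clock, so setting the system time does not affect the difference.
    public static long elapsedMillisSince(long startNanos) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    public static void main(String[] args) {
        long timeoutMs = 30_000;      // assumed 30 s timeout, for illustration only
        long oneHourMs = 3_600_000;

        // Simulated readings: 40 s of real time pass, but the wall clock
        // is stepped back by one hour in between.
        long wallStart = 1_470_000_000_000L;
        long wallNow = wallStart + 40_000 - oneHourMs;

        // Simulated monotonic readings with an arbitrary origin.
        long monoStart = 5_000_000;
        long monoNow = monoStart + 40_000;

        // The wall-clock difference is negative, so the timeout never fires
        // until the clock has caught up again (about one hour later).
        System.out.println("wall-clock expired: " + isExpired(wallStart, wallNow, timeoutMs));
        System.out.println("monotonic expired:  " + isExpired(monoStart, monoNow, timeoutMs));

        // Real monotonic measurement, shown for completeness.
        long t0 = System.nanoTime();
        System.out.println("elapsed ms so far:  " + elapsedMillisSince(t0));
    }
}
```

In this simulation the wall-clock check stays false for roughly as long as the clock was set back, which is consistent with the one-hour stall I observed, although I have not confirmed that this is the exact mechanism inside the broker.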
So, how should we proceed? Do you agree that this should be reported as a bug? This issue has a great impact on our system, and perhaps this particular case could be fixed without a serious loss of performance.

On Thu, Aug 11, 2016 at 11:11 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Kafka code uses System.currentTimeMillis in a number of places, so it would
> not surprise me if it misbehaves when the clock is turned back by an hour.
> System.nanoTime is meant to handle this issue, but there are questions
> about the performance impact of using that
> (https://github.com/apache/kafka/pull/837).
>
> Ismael
>
> On Thu, Aug 11, 2016 at 2:19 PM, Gabriel Ibarra
> <gabriel.iba...@tallertechnologies.com> wrote:
>
> > Thanks for answering, all help is welcome.
> >
> > Yes, I tested without changing the clock and it works well.
> > Actually, both consumers are running in different processes,
> > so I think it is not the case that you mention.
> >
> > I even tested this using two different Kafka clients,
> > the Java client and edenhill's librdkafka (a C client),
> > and I got the same results.
> > That is why I think that the problem comes from Kafka.
> >
> > Gabriel
> >
> > On Thu, Aug 11, 2016 at 2:20 AM, Gwen Shapira <g...@confluent.io> wrote:
> >
> > > I know it sounds silly, but did you check that your test setup works
> > > when you don't change the clock?
> > >
> > > This pattern can happen when two consumers somehow block each other
> > > (for example, one thread with two consumers) - so one waits for the
> > > other to join, but the other is blocked, so the first is timed out and
> > > then the second is unblocked and manages to join but now the first is
> > > blocked and so on...
> > >
> > > Gwen
> > >
> > > On Wed, Aug 10, 2016 at 10:29 AM, Gabriel Ibarra
> > > <gabriel.iba...@tallertechnologies.com> wrote:
> > > > Hello guys, I am dealing with an issue when the system clock is
> > > > turned back (either due to NTP or administrator action).
> > > > I'm using kafka_2.11-0.10.0.0.
> > > >
> > > > I follow these steps:
> > > > - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will
> > > >   be the owner of all the partitions.
> > > > - Turn the system clock back, for instance by 1 hour.
> > > > - Start a new consumer for TOPIC_NAME using the same group id; it
> > > >   will force a rebalance.
> > > >
> > > > After these actions the Kafka server constantly logs the messages
> > > > below, and after a while both consumers stop receiving messages.
> > > > I saw that this condition lasts at least as long as the amount of
> > > > time the clock was turned back (1 hour in this example), and after
> > > > that time Kafka starts working again.
> > > >
> > > > [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> > > > [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> > > >
> > > > IMHO, Kafka's consumers should keep working after any change to the
> > > > system clock, but maybe this behavior has a rationale that I'm not
> > > > aware of.
> > > >
> > > > I'm sorry if this was discussed previously; I searched but didn't
> > > > find a similar issue.
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Gabriel Alejandro Ibarra
> > > > Software Engineer
> > > > San Lorenzo 47, 3rd Floor, Office 5
> > > > Córdoba, Argentina
> > > > Phone: +54 351 4217888
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap

--
Gabriel Alejandro Ibarra
Software Engineer
San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina
Phone: +54 351 4217888