Hi Ismael,

thanks a lot for your answer, it was indeed exactly the issue we had!

We did see the ticket for the issue before, but the steps to reproduce
included a consumer group restart, which we didn't do, so we thought our
problem was different.

However, since patching with the fix commit, everything has been working fine.
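
In case it helps anyone else hitting KAFKA-5600, this is roughly how we
spot-check committed offsets around a broker restart to confirm they no
longer move backwards. It is only a sketch using the plain consumer API;
the bootstrap address, group id, topic and partition are made-up
placeholders, not our real names:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class CommittedOffsetCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // placeholder broker
            props.put("group.id", "my-consumer-group");       // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            // The consumer only asks the group coordinator for the committed
            // offset; it never subscribes or polls, so it does not disturb
            // the running group.
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("my-topic", 0); // placeholder partition
                OffsetAndMetadata committed = consumer.committed(tp);
                System.out.println(tp + " committed offset: "
                        + (committed == null ? "none" : committed.offset()));
            }
        }
    }

Running this for the affected partitions before and after bouncing a broker
and comparing the printed numbers is enough to see whether the committed
position jumped backwards.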

Thanks again! Christiane

On 7 August 2017 at 22:07, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Christiane,
>
> Thanks for the email. That looks like
> https://issues.apache.org/jira/browse/KAFKA-5600
>
> Ismael
>
> On Mon, Aug 7, 2017 at 7:04 PM, Christiane Lemke <christiane.le...@gmail.com> wrote:
>
> > Hi all,
> >
> > we are fighting with offset rewinds of seemingly random size, hitting
> > seemingly random partitions, whenever we restart any node in our kafka
> > cluster. We are running out of ideas - any help or pointers to things to
> > investigate are highly appreciated.
> >
> > Our kafka setup is dual data center, with two local broker clusters (3
> > nodes each) and two aggregate broker clusters (5 nodes each), the latter
> > running mirror maker to consume messages from the local cluster.
> >
> > The issues seem to have appeared since we upgraded from 0.10.1.0 to 0.11,
> > but we are not entirely sure it’s related.
> >
> > Our first theory was that the consumer offsets topic (we use compaction
> > for it) had grown too big and was causing the issues on restart, and
> > indeed we found that the log cleaner threads had died after the upgrade.
> > But restarting and cleaning up this topic did not help the issue.
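
For reference, the compaction settings of the offsets topic can be inspected
with the AdminClient that ships with 0.11; below is a minimal sketch (the
bootstrap address is a placeholder, and whether the cleaner threads are
actually alive still has to be checked in log-cleaner.log on each broker):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class OffsetsTopicConfigCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder broker

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic =
                        new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets");
                Config config = admin.describeConfigs(Collections.singleton(topic))
                        .all().get()
                        .get(topic);
                // Print the settings that control compaction of the offsets topic.
                for (ConfigEntry entry : config.entries()) {
                    if (entry.name().startsWith("cleanup")
                            || entry.name().startsWith("segment")) {
                        System.out.println(entry.name() + " = " + entry.value());
                    }
                }
            }
        }
    }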
> >
> > Logs are pretty silent when it happens. Before we cleaned the consumer
> > offsets topic we got a few of these every time it happened, but not any
> > more:
> >
> > [2017-08-04 11:19:25,970] ERROR [Group Metadata Manager on Broker 472]: Error loading offsets from __consumer_offsets-14 (kafka.coordinator.group.GroupMetadataManager)
> > java.lang.IllegalStateException: Unexpected unload of active group tns-ticket-store-b144c9d1-425a-4b90-8310-f6e886741494 while loading partition __consumer_offsets-14
> >         at kafka.coordinator.group.GroupMetadataManager$$anonfun$loadGroupsAndOffsets$6.apply(GroupMetadataManager.scala:600)
> >         at kafka.coordinator.group.GroupMetadataManager$$anonfun$loadGroupsAndOffsets$6.apply(GroupMetadataManager.scala:595)
> >         at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
> >         at kafka.coordinator.group.GroupMetadataManager.loadGroupsAndOffsets(GroupMetadataManager.scala:595)
> >         at kafka.coordinator.group.GroupMetadataManager.kafka$coordinator$group$GroupMetadataManager$$doLoadGroupsAndOffsets$1(GroupMetadataManager.scala:455)
> >         at kafka.coordinator.group.GroupMetadataManager$$anonfun$loadGroupsForPartition$1.apply$mcV$sp(GroupMetadataManager.scala:441)
> >         at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
> >         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >         at java.lang.Thread.run(Thread.java:748)
> >
> > Does this seem familiar to anyone? Are there any suggestions as to what to
> > look into more closely to investigate this issue? Happy to give more
> > details about anything that might be helpful.
> >
> > Thanks a lot in advance,
> >
> > Christiane
> >
>
