I observed the error below on one of the brokers, and that broker is unresponsive:

[2018-04-07 12:51:39,830] ERROR [Replica Manager on Broker 3]: Error processing append operation on partition __consumer_offsets-27 (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync replicas for partition __consumer_offsets-27 is [1], below required minimum [2]

For the first 24 hours the cluster worked well under ~60K messages/sec of inbound and outbound load. After a day the broker became unresponsive and the Group Coordinator started throwing "Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: The coordinator is not available".

./bin/kafka-topics.sh --zookeeper localhost:2181:/kafka --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
    Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3
    Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1

Initially the __consumer_offsets topic replicas were in sync.

Kafka version: 0.11.0

On Fri, Aug 25, 2017 at 6:04 PM, Murad Mamedov <m...@muradm.net> wrote:

> When it first occurred, all replicas were in sync.
> But after restarting the clients and brokers, the exception started to occur
> immediately, and the replicas started going out of sync.
> As explained in the issue, the bug is related to configuration and timing of
> records.
>
> On Fri, Aug 25, 2017 at 10:31 AM, Dan Markhasin <minimi...@gmail.com> wrote:
>
> > If you run kafka-topics.sh --describe --topic __consumer_offsets, does it
> > show that all replicas are in sync?
> >
> > On 23 August 2017 at 23:11, Murad Mamedov <m...@muradm.net> wrote:
> >
> > > Hi David,
> > >
> > > Thanks for the reply. However, I don't have a problem with the number of
> > > replicas. I have 3 brokers, and the topics are configured accordingly,
> > > especially __consumer_offsets:
> > >
> > > Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3
> > > Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
> > >
> > > And everything was working fine for months, until today.
> > >
> > > Why would I want to change the replication factor? To what value?
> > >
> > > On Wed, Aug 23, 2017 at 11:19 PM, David Frederick <david.freder...@gmail.com> wrote:
> > >
> > > > |> NotEnoughReplicasException: Number of insync replicas for partition
> > > > |> __consumer_offsets-17 is [1], below required minimum [2]
> > > >
> > > > Please refer to
> > > > https://stackoverflow.com/questions/37960767/how-to-change-the-replicas-of-kafka-topic
> > > > Hope it helps!
> > > >
> > > > On Aug 23, 2017 5:17 AM, "Murad Mamedov" <m...@muradm.net> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Did you manage to find the root cause of this issue?
> > > > >
> > > > > The same thing happened here.
> > > > >
> > > > > Thanks in advance
> > > > >
> > > > > On Tue, Jun 13, 2017 at 7:50 PM, Paul van der Linden <p...@sportr.co.uk> wrote:
> > > > >
> > > > > > I managed to solve it by:
> > > > > > - stopping and deleting all data on kafka & zookeeper
> > > > > > - stopping all consumers and producers
> > > > > > - starting kafka & zookeeper, waiting till they are up
> > > > > > - starting all consumers & producers
> > > > > >
> > > > > > Is there a better way to do this, without data loss and halting
> > > > > > everything?
> > > > > >
> > > > > > On Tue, Jun 13, 2017 at 4:28 PM, Paul van der Linden <p...@sportr.co.uk> wrote:
> > > > > >
> > > > > > > A few lines of the logs:
> > > > > > >
> > > > > > > [2017-06-13 15:25:37,343] INFO [GroupCoordinator 0]: Stabilized group summarizer generation 701 (kafka.coordinator.GroupCoordinator)
> > > > > > > [2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Assignment received from leader for group summarizer for generation 701 (kafka.coordinator.GroupCoordinator)
> > > > > > > [2017-06-13 15:25:37,345] ERROR [Replica Manager on Broker 0]: Error processing append operation on partition __consumer_offsets-17 (kafka.server.ReplicaManager)
> > > > > > > org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync replicas for partition __consumer_offsets-17 is [1], below required minimum [2]
> > > > > > > [2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Preparing to restabilize group summarizer with old generation 701 (kafka.coordinator.GroupCoordinator)
> > > > > > >
> > > > > > > This keeps happening, for all consumer offsets and all groups, etc.
> > > > > > >
> > > > > > > On Tue, Jun 13, 2017 at 4:21 PM, Paul van der Linden <p...@sportr.co.uk> wrote:
> > > > > > >
> > > > > > >> Hi,
> > > > > > >>
> > > > > > >> I'm trying to find out how to at least get my kafka working again.
> > > > > > >> Something went wrong and kafka has halted to a throughput of 0 messages. It
> > > > > > >> keeps looping on stabilizing consumer groups, and erroring on an append
> > > > > > >> operation to the offset partitions, plus "Not enough replicas".
> > > > > > >>
> > > > > > >> The weird thing is that, after not being able to work this out, I went
> > > > > > >> pretty brutal (luckily I can afford to lose more messages):
> > > > > > >> - deleted all kafka and zookeeper instances
> > > > > > >> - updated kafka
> > > > > > >> - cleared all disks
> > > > > > >>
> > > > > > >> Still kafka is in this unrecoverable error. Does anyone have any idea how
> > > > > > >> to fix this?
> > > > > > >>
> > > > >
> > > > > --
> > > > > Regards,
> > > > > *Murad M*
> > > > > *M (tr): +90 (533) 4874329*
> > > > > *E: m...@muradm.net <m...@muradm.net>*
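
For anyone checking the same symptom across all 50 partitions rather than by eye: below is a minimal Python sketch, not something from the thread itself. It parses `kafka-topics.sh --describe` output in the format quoted in this thread and lists partitions whose ISR is smaller than a threshold. The function name `under_min_isr`, the regular expression, and the default threshold of 2 are assumptions for illustration; the 2 is taken from the "required minimum [2]" in the exception, which presumably reflects the `min.insync.replicas` setting.

```python
import re

# Matches one partition entry of `kafka-topics.sh --describe` output, e.g.
# "Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3".
# The summary line ("Topic:... PartitionCount:50 ...") does not match.
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_min_isr(describe_output, min_isr=2):
    """Return (topic, partition, isr) for partitions with fewer than min_isr in-sync replicas.

    min_isr is a hypothetical parameter; set it to your own required minimum.
    """
    flagged = []
    for match in PARTITION_RE.finditer(describe_output):
        isr = [int(broker) for broker in match.group("isr").split(",") if broker]
        if len(isr) < min_isr:
            flagged.append((match.group("topic"), int(match.group("partition")), isr))
    return flagged


# The two partition entries posted at the top of this thread.
sample = (
    "Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3 "
    "Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1"
)

for topic, partition, isr in under_min_isr(sample):
    print(f"{topic}-{partition}: ISR {isr} is below the required minimum")
```

Both sample partitions are flagged because each has a single broker in its ISR. A partition in that state will reject offset commits with NotEnoughReplicasException until enough followers rejoin the ISR.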