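Consumers always read from the partition's leader replica, which is in sync by definition, so a follower dropping out of the ISR does not affect them and no redirection is needed. The real concern is the leader crashing during one of these periods: with 2 replicas per partition, there would be no in-sync follower left to take over.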
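
For what it's worth, here is a rough sketch of how a follower falls out of the ISR in 0.8.x, using the two thresholds from the config quoted below (replica.lag.time.max.ms and replica.lag.max.messages). This is a simplified model, not Kafka's actual code; the function and parameter names are made up for illustration.

```python
def follower_in_sync(leader_end_offset, follower_end_offset, ms_since_last_fetch,
                     lag_max_messages=4000, lag_time_max_ms=10000):
    """Simplified model of the 0.8.x ISR check (hypothetical helper, not Kafka code).

    Defaults mirror replica.lag.max.messages=4000 and
    replica.lag.time.max.ms=10000 from the quoted config.
    """
    # "Stuck" follower: has not sent a fetch request for too long.
    stuck = ms_since_last_fetch > lag_time_max_ms
    # "Slow" follower: too many messages behind the leader's log end.
    slow = (leader_end_offset - follower_end_offset) > lag_max_messages
    return not (stuck or slow)


# A follower 5000 messages behind is dropped even though it fetched 200 ms ago.
print(follower_in_sync(105000, 100000, ms_since_last_fetch=200))  # False
# A follower 500 messages behind that fetched 200 ms ago stays in the ISR.
print(follower_in_sync(100500, 100000, ms_since_last_fetch=200))  # True
```

Under this model, a short burst of produce traffic on a partition could be enough to push a follower past the message threshold until it catches up, which may be part of what you are seeing.
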
On Tue, Oct 21, 2014 at 2:56 PM, Neil Harkins <nhark...@gmail.com> wrote:
> Hi. I've got a 5 node cluster running Kafka 0.8.1,
> with 4697 partitions (2 replicas each) across 564 topics.
> I'm sending it about 1% of our total messaging load now,
> and several times a day there is a period where 1~1500
> partitions have one replica not in sync. Is this normal?
> If a consumer is reading from a replica that gets deemed
> "not in sync", does it get redirected to the good replica?
> Is there a #partitions over which maintenance tasks
> become infeasible?
>
> Relevant config bits:
> auto.leader.rebalance.enable=true
> leader.imbalance.per.broker.percentage=20
> leader.imbalance.check.interval.seconds=30
> replica.lag.time.max.ms=10000
> replica.lag.max.messages=4000
> num.replica.fetchers=4
> replica.fetch.max.bytes=10485760
>
> Not necessarily correlated to those periods,
> I see a lot of these errors in the logs:
>
> [2014-10-20 21:23:26,999] 21963614 [ReplicaFetcherThread-3-1] ERROR
> kafka.server.ReplicaFetcherThread - [ReplicaFetcherThread-3-1], Error
> in fetch Name: FetchRequest; Version: 0; CorrelationId: 77423;
> ClientId: ReplicaFetcherThread-3-1; ReplicaId: 2; MaxWait: 500 ms;
> MinBytes: 1 bytes; RequestInfo: ...
>
> And a few of these:
>
> [2014-10-20 21:23:39,555] 3467527 [kafka-scheduler-2] ERROR
> kafka.utils.ZkUtils$ - Conditional update of path
> /brokers/topics/foo.bar/partitions/3/state with data
> {"controller_epoch":11,"leader":3,"version":1,"leader_epoch":109,"isr":[3]}
> and expected version 197 failed due to
> org.apache.zookeeper.KeeperException$BadVersionException:
> KeeperErrorCode = BadVersion for
> /brokers/topics/foo.bar/partitions/3/state
>
> And this one I assume is a client closing the connection non-gracefully,
> thus should probably be a warning, not an error?:
>
> [2014-10-20 21:54:15,599] 23812214 [kafka-processor-9092-3] ERROR
> kafka.network.Processor - Closing socket for /10.31.0.224 because of
> error
>
> -neil