Hey Stephen, two things on that.
1) You need to figure out the root cause of the leader election. It could
be that the brokers are having ZK timeouts and leader election is
occurring as a result... if so, you need to dig into why (look at all your
logs... You should look for some type of fl
I find this situation occurs frequently in my setup - it only takes one day -
and blam - the leader board is all skewed to a single broker. Not really sure
how to overcome that once it happens, so if there is a solution out there I'd
be interested.
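
One way to put a number on that skew, as a rough sketch only: count how many partitions each broker leads, using lines shaped like the `--describe` output quoted later in this thread. The function name `leader_counts` and the `demo` sample lines are made up for illustration and are not taken from anyone's cluster.

```python
import re
from collections import Counter

# Matches the "Partition: N Leader: M" part of a describe-style line.
# The line shape is assumed to follow the output quoted in this thread.
LINE_RE = re.compile(r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)")


def leader_counts(describe_lines):
    """Count how many partitions each broker id currently leads."""
    counts = Counter()
    for line in describe_lines:
        match = LINE_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# Hypothetical sample data, for illustration only.
sample = [
    "Topic: demo Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3",
    "Topic: demo Partition: 1 Leader: 1 Replicas: 2,3,1 Isr: 1,2,3",
    "Topic: demo Partition: 2 Leader: 1 Replicas: 3,1,2 Isr: 1,2,3",
    "Topic: demo Partition: 3 Leader: 2 Replicas: 2,1,3 Isr: 2,1,3",
]

counts = leader_counts(sample)
for broker, led in sorted(counts.items()):
    print(f"broker {broker} leads {led} partition(s)")
```

If one broker's count dwarfs the rest, leadership has drifted away from the preferred (first-listed) replicas.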
On Fri, Sep 12, 2014 at 12:50 PM, Cory Watson wrote:
> Wh
What follows is a guess on my part, but here's what I *think* was happening:
We hit an OOM that seems to've killed some of the replica fetcher threads.
I had a mishmash of replicas that weren't making progress as determined by
the JMX stats for the replica. The thread for which the JMX attribute w
We're seeing the same behaviour today on our cluster. It is not like a
single broker went out of the cluster, rather a few partitions seem lazy on
every broker.
On Fri, Sep 12, 2014 at 9:31 PM, Cory Watson wrote:
> I noticed this morning that a few of our partitions do not have their full
> comp
I noticed this morning that a few of our partitions do not have their full
complement of ISRs:
Topic:migration PartitionCount:16 ReplicationFactor:3 Configs:retention.bytes=32985348833280
Topic: migration Partition: 0 Leader: 1 Replicas: 1,4,5 Isr: 1,5,4
Topic: migration Partition: 1 Leader: 1 Rep
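
Checking a listing like this by eye gets tedious with 16 partitions, so here is a minimal sketch that flags any partition whose ISR is smaller than its replica list. It assumes the line format shown above; the function name `shrunken_isr` and the second sample line (topic `demo`) are hypothetical and only there to show a failing case.

```python
import re

# Per-partition line of describe-style output, as quoted above.
LINE_RE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+"
    r"Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def shrunken_isr(describe_lines):
    """Return (topic, partition, missing_brokers) for each partition
    whose ISR lacks one or more of its assigned replicas."""
    flagged = []
    for line in describe_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # topic summary lines and anything else are skipped
        topic, partition = match.group(1), int(match.group(2))
        replicas = {int(b) for b in match.group(4).split(",") if b}
        isr = {int(b) for b in match.group(5).split(",") if b}
        missing = sorted(replicas - isr)
        if missing:
            flagged.append((topic, partition, missing))
    return flagged


sample = [
    # Taken from the output above: full ISR, so not flagged.
    "Topic: migration Partition: 0 Leader: 1 Replicas: 1,4,5 Isr: 1,5,4",
    # Hypothetical line, for illustration only: brokers 3 and 4 have dropped out.
    "Topic: demo Partition: 7 Leader: 2 Replicas: 2,3,4 Isr: 2",
]

flagged = shrunken_isr(sample)
for topic, partition, missing in flagged:
    print(f"{topic}-{partition}: not in ISR: {missing}")
```

Comparing sets rather than strings matters here because the ISR order can differ from the replica order, as in partition 0 above (`1,4,5` vs `1,5,4`).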