I was doing a rolling bounce of all brokers. Immediately after the bad
broker was bounced, the stuck producers recovered.
On Fri, Sep 11, 2015 at 9:05 AM, Mayuresh Gharat wrote:
So how did you detect that the broker was bad? If bouncing the brokers
solved the problem and you did not find anything unusual in the broker
logs, it is likely that the process was up but was isolated from producer
requests, and since the producer did not have a timeout, the producer
buffer filled up.
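
To make the failure mode above concrete, here is a minimal sketch of a bounded buffer that nothing drains. It is only an analogy built on Python's standard `queue` module, not Kafka client code; the function name `send_all` and its parameters are made up for illustration. The point is that with a finite blocking timeout the sender gets an error it can act on instead of hanging once the buffer is full.

```python
import queue


def send_all(records, buffer_size, max_block_s):
    """Try to enqueue records into a bounded buffer that nothing drains.

    Returns (accepted, rejected). Passing max_block_s=None would make
    put() block forever once the buffer is full, which mirrors the stuck
    producers described in this thread, so use a finite timeout here.
    """
    buf = queue.Queue(maxsize=buffer_size)  # stand-in for the producer buffer
    accepted = 0
    rejected = 0
    for rec in records:
        try:
            # Wait at most max_block_s seconds for free space.
            buf.put(rec, timeout=max_block_s)
            accepted += 1
        except queue.Full:
            # The caller now sees the problem and can alert or drop.
            rejected += 1
    return accepted, rejected
```

For example, `send_all(range(5), buffer_size=3, max_block_s=0.01)` accepts the first three records and rejects the remaining two after a short wait each.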
Frankly, I don't know exactly what went wrong with that broker. The
process is still up.
On Wed, Sep 9, 2015 at 10:10 AM, Mayuresh Gharat wrote:
1) any suggestion on how to identify the bad broker(s)?
---> At LinkedIn we have alerts that are set up using our internal scripts
for detecting if a broker has gone bad. We also check the under-replicated
partitions, and that can tell us which broker has gone bad. By broker going
bad, it can mean di
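
The under-replicated-partitions check mentioned above could be sketched roughly as follows. This is not the internal LinkedIn script; it assumes you have captured the tab-separated partition lines printed by `kafka-topics.sh --describe --under-replicated-partitions`, and the helper name `brokers_missing_from_isr` is hypothetical. It counts, per broker id, how many partitions list that broker as a replica but not in the ISR. It only helps when the bad broker actually falls out of the ISR.

```python
from collections import Counter


def brokers_missing_from_isr(describe_lines):
    """Count partitions for which each broker is a replica but not in the ISR.

    describe_lines: iterable of lines such as
      "\tTopic: t\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,3"
    (assumed output format; lines without Replicas/Isr fields are skipped).
    """
    counts = Counter()
    for line in describe_lines:
        fields = {}
        for part in line.strip().split("\t"):
            key, sep, value = part.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "Replicas" not in fields or "Isr" not in fields:
            continue  # topic summary line or unrelated output
        replicas = {b for b in fields["Replicas"].split(",") if b}
        isr = {b for b in fields["Isr"].split(",") if b}
        for broker in replicas - isr:
            counts[int(broker)] += 1
    return counts
```

Calling `brokers_missing_from_isr(lines).most_common(1)` then gives the broker id that is missing from the most ISRs, which is a reasonable first suspect.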
We have observed that some producer instances stopped sending traffic to
brokers because the memory buffer was full. Those producers got stuck in
this state permanently. Because we couldn't find out which broker was bad,
I did a rolling restart of all the brokers. After the bad broker got
bounc