Could you take a thread dump on that broker and send it across? One possibility is that the replica fetcher thread has died.
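For reference, a thread dump of a JVM process can usually be captured with `jstack <pid>`, or with `kill -3 <pid>`, which writes the dump to the process's standard output. The sketch below is only a rough way to scan such a dump for fetcher threads and their states. It assumes the fetcher threads carry "ReplicaFetcherThread" in their name and that the dump uses the usual HotSpot layout (a quoted thread name followed by a `java.lang.Thread.State:` line). The function name `fetcher_threads` and the `name_fragment` parameter are made up for illustration.

```python
import re

# Thread header line in a HotSpot-style dump, e.g.:
#   "some-thread-name" prio=10 tid=0x... nid=0x... runnable [0x...]
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line that follows the header, e.g.:
#      java.lang.Thread.State: RUNNABLE
STATE = re.compile(r'^\s+java\.lang\.Thread\.State:\s+(?P<state>\w+)')


def fetcher_threads(dump_text, name_fragment="ReplicaFetcherThread"):
    """Return {thread_name: state} for threads whose name contains name_fragment.

    name_fragment is an assumption about how the broker names its fetcher
    threads; adjust it if the dump shows a different naming scheme.
    """
    found = {}
    current = None  # name of the matching thread we are currently inside, if any
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            name = header.group("name")
            current = name if name_fragment in name else None
            if current is not None:
                # Placeholder until (and unless) a state line is seen.
                found[current] = "UNKNOWN"
            continue
        state = STATE.match(line)
        if state and current is not None:
            found[current] = state.group("state")
    return found
```

An empty result would be consistent with the fetcher thread being gone, though it could also mean the naming assumption is wrong, so the full dump is still worth sending.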
Thanks,
Neha

On Aug 21, 2013 8:00 AM, "Yu, Libo" <libo...@citi.com> wrote:

> I checked the log of normal restart. The replication manager should start
> to handle leader and isr request after the server is up. What may stop it
> from doing that? Is it because of missing mx4j-tools.jar?
>
> Regards,
>
> Libo
>
> From: Yu, Libo [ICG-IT]
> Sent: Wednesday, August 21, 2013 10:51 AM
> To: 'users@kafka.apache.org'
> Subject: broker never comes back to ISR
>
> Hi team,
>
> We have three kafka brokers in a production cluster. We use replication
> factor 3 for all topics. We notice quite frequently one broker is not in
> isr. Sometimes after it is restarted, it will go back to isr. Sometimes
> even after it is restarted, it will not go back to isr.
>
> In today's case, after a broker is restarted, this is what we found from
> the log:
>
> [2013-08-21 08:22:55,524] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
> [2013-08-21 08:25:06,621] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
> [2013-08-21 08:25:06,716] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
> [2013-08-21 08:27:19,824] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
> [2013-08-21 08:28:16,711] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
> [2013-08-21 08:28:17,978] INFO Closing socket connection to /xxx.xx.xx.xx. (kafka.network.Processor)
> ...
> Numerous "Closing socket connection" and nothing else.
>
> Any guidance will be appreciated.
>
> Regards,
>
> Libo