Mahesh,

Thanks for sharing the info. Is having exactly 8 brokers a must for you? I ask because one of them is technically unnecessary: your cluster can tolerate only 3 failures, even with 7 brokers. Could you please try the following:
1) Stop the cluster.
2) Increase num.recovery.threads.per.data.dir from 1 to 5 (or any value greater than 1).
3) Retry creating a topic with both partitions and replication-factor set to 8.

Let us know how it goes.

Regards,

On 31 July 2017 at 18:44, Mahesh Patade <mahesh.pat...@nw18.com> wrote:

> Hi All,
>
> We are having 8 broker kafka cluster configured in our setup and created a topic with 8 partitions & 3 replicas. While trying to consume data from one broker(id:6) we are getting below errors and increase in lag for active partition on that host. We even tried restarting, deleting logs and reinstalling broker but still no luck. This is happening only for one broker. If we shut down that broker service everything works well.
>
> Kafka Server Logs:
>
> [2017-07-31 19:32:39,212] INFO [Group Metadata Manager on Broker 6]: Finished loading offsets from __consumer_offsets-5 in 5 milliseconds. (kafka.coordinator.GroupMetadataManager)
> [2017-07-31 19:32:39,217] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions __consumer_offsets-13 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,218] INFO [Group Metadata Manager on Broker 6]: Loading offsets and group metadata from __consumer_offsets-13 (kafka.coordinator.GroupMetadataManager)
> [2017-07-31 19:32:39,220] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions __consumer_offsets-37 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,230] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions __consumer_offsets-45 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,232] INFO [Group Metadata Manager on Broker 6]: Finished loading offsets from __consumer_offsets-13 in 13 milliseconds. (kafka.coordinator.GroupMetadataManager)
> [2017-07-31 19:32:39,232] INFO [Group Metadata Manager on Broker 6]: Loading offsets and group metadata from __consumer_offsets-37 (kafka.coordinator.GroupMetadataManager)
> [2017-07-31 19:32:39,248] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions bseEquityAggFeedProd-4 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,249] INFO Truncating log bseEquityAggFeedProd-4 to offset 13082598. (kafka.log.Log)
> [2017-07-31 19:32:39,276] INFO [ReplicaFetcherManager on broker 6] Added fetcher for partitions List([bseEquityAggFeedProd-4, initOffset 13082598 to broker BrokerEndPoint(2,172.29.51.52,9092)] ) (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,277] INFO [ReplicaFetcherThread-0-2], Starting (kafka.server.ReplicaFetcherThread)
> [2017-07-31 19:32:39,294] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions __consumer_offsets-17 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,294] INFO Truncating log __consumer_offsets-17 to offset 0. (kafka.log.Log)
> [2017-07-31 19:32:39,300] INFO [ReplicaFetcherManager on broker 6] Added fetcher for partitions List([__consumer_offsets-17, initOffset 0 to broker BrokerEndPoint(2,172.29.51.52,9092)] ) (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,304] ERROR [ReplicaFetcherThread-0-5], Error for partition [__consumer_offsets,9] to broker 5:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
> [2017-07-31 19:32:39,308] ERROR [ReplicaFetcherThread-0-2], Error for partition [bseEquityAggFeedProd,4] to broker 2:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
> [2017-07-31 19:32:39,313] ERROR [ReplicaFetcherThread-0-2], Error for partition [__consumer_offsets,17] to broker 2:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
> [2017-07-31 19:32:39,318] INFO [ReplicaFetcherManager on broker 6] Removed fetcher for partitions __consumer_offsets-9 (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,319] INFO Truncating log __consumer_offsets-9 to offset 392. (kafka.log.Log)
> [2017-07-31 19:32:39,324] INFO [ReplicaFetcherManager on broker 6] Added fetcher for partitions List([__consumer_offsets-9, initOffset 392 to broker BrokerEndPoint(2,172.29.51.52,9092)] ) (kafka.server.ReplicaFetcherManager)
> [2017-07-31 19:32:39,339] ERROR [ReplicaFetcherThread-0-2], Error for partition [__consumer_offsets,9] to broker 2:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
> [2017-07-31 19:32:39,986] INFO [GroupCoordinator 6]: Loading group metadata for bseEquityLegacyBod with generation 4 (kafka.coordinator.GroupCoordinator)
> [2017-07-31 19:32:39,991] INFO [Group Metadata Manager on Broker 6]: Finished loading offsets from __consumer_offsets-37 in 759 milliseconds. (kafka.coordinator.GroupMetadataManager)
>
> Consumer Logs:
>
> 17/07/31/19:26:20.005 2-thread-3 INFO nals.AbstractCoordinator Discovered coordinator 172.XX.51.56:9092 (id: 2147483641 rack: null) for group bseEquityLegacyBod.
> 17/07/31/19:26:20.005 2-thread-3 INFO nals.AbstractCoordinator Marking the coordinator 172.XX.51.56:9092 (id: 2147483641 rack: null) dead for group bseEquityLegacyBod
> 17/07/31/19:26:20.106 2-thread-3 INFO nals.AbstractCoordinator Discovered coordinator 172.XX.51.56:9092 (id: 2147483641 rack: null) for group bseEquityLegacyBod.
> 17/07/31/19:26:20.106 2-thread-3 INFO nals.AbstractCoordinator Marking the coordinator 172.XX.51.56:9092 (id: 2147483641 rack: null) dead for group bseEquityLegacyBod
>
> Thanks in advance.
>
> Regards,
> Mahesh Patade
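
P.S. In case it helps, steps 2 and 3 suggested at the top of this reply could be scripted roughly as below. This is only a sketch under assumptions: the helper names, the topic name `example-topic`, and the ZooKeeper address `localhost:2181` are placeholders I made up, and the `--zookeeper` flag matches Kafka releases from around the time of this thread (newer releases use `--bootstrap-server` instead). The first helper only rewrites the text of a `server.properties` file; stopping and restarting the brokers is still up to you.

```python
import re


def set_broker_property(properties_text, key, value):
    """Return properties_text with `key` set to `value`.

    An existing, uncommented `key=...` line is rewritten in place;
    if there is none, the setting is appended at the end.
    """
    # Match only uncommented lines for this key (leading spaces/tabs allowed).
    pattern = re.compile(r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(properties_text):
        # A callable replacement avoids any backslash interpretation in `value`.
        return pattern.sub(lambda match: new_line, properties_text)
    separator = "" if (not properties_text or properties_text.endswith("\n")) else "\n"
    return properties_text + separator + new_line + "\n"


def topic_create_command(topic, partitions, replication_factor, zookeeper):
    """Build the argument list for a kafka-topics.sh --create call.

    Assumes the ZooKeeper-based form of the tool; `zookeeper` is a
    hypothetical host:port string supplied by the caller.
    """
    return [
        "kafka-topics.sh", "--create",
        "--zookeeper", zookeeper,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


if __name__ == "__main__":
    sample = "broker.id=6\nnum.recovery.threads.per.data.dir=1\n"
    print(set_broker_property(sample, "num.recovery.threads.per.data.dir", "5"))
    # Placeholder topic name and ZooKeeper address, for illustration only.
    print(" ".join(topic_create_command("example-topic", 8, 8, "localhost:2181")))
```

You would apply the first helper to each broker's `server.properties` while the cluster is stopped, then run the generated command once the brokers are back up.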