Hi Apurva,

The first error vanished after I restarted all the brokers. I haven't seen these recurring errors since, and my thought is that because we restarted the zookeeper nodes, we might have put all the brokers in some sort of an iffy state.

The broker occasionally hanging has plagued us quite a bit. Our Kafka nodes and Zookeeper nodes are all on EC2 instances. Kafka nodes communicate with Zookeeper nodes within the same VPC, but they go through a load balancer, e.g.

Kafka node --> Internal Load Balancer specific to a ZK Node --> ZK Node

This allows us to bring down ZK nodes and spin up new ones w/o having to change the Kafka configuration. I am not sure whether this could cause an issue. I haven't seen any specific ZK errors in /var/log/kafka/server.log.

Thanks,
Shailesh

On Wed, Dec 14, 2016 at 2:49 PM, Apurva Mehta <apu...@confluent.io> wrote:

> Regarding 1), you can see a NotLeaderForPartition exception if the leader
> for the partition has moved to another host but the client metadata has not
> updated itself yet. The messages should disappear once the metadata is
> updated on all clients.
>
> Leaders may move if brokers are bounced, or if they have connectivity
> issues with zookeeper. Looking at your second point, it seems like
> connectivity may be a problem. Where is zookeeper running? Do your brokers
> have a solid link to that machine? Do you see any zookeeper connection
> errors in your broker logs?
>
> On Tue, Dec 13, 2016 at 6:41 PM, Shailesh Hemdev <
> shailesh.hem...@foresee.com> wrote:
>
> > We are using a 3 node Kafka cluster and are encountering some weird
> > issues.
> >
> > 1) On each node, when we tail the server.log file under /var/log/kafka we
> > see continuous errors like these
> >
> > pic-partition. (kafka.server.ReplicaFetcherThread)
> > [2016-12-14 02:39:30,747] ERROR [ReplicaFetcherThread-0-4410000], Error for
> > partition [dev-core-platform-logging,15] to broker
> > 4410000:org.apache.kafka.common.errors.NotLeaderForPartitionException: This
> > server is not the leader for that topic-partition.
> > (kafka.server.ReplicaFetcherThread)
> >
> > The broker is up and is showing under zookeeper.
> > So it is not clear why these errors occur
> >
> > 2) Occasionally we will find a Kafka broker that goes down. We have
> > adjusted the Ulimit to increase open files as well as added 6g to the heap.
> > When the broker goes down, the process itself is up but is de-registered
> > from Zookeeper
> >
> > Thanks,
> >
> > *Shailesh*
> >
> > --
> >
> > This email communication (including any attachments) contains information
> > from Answers Corporation or its affiliates that is confidential and may be
> > privileged. The information contained herein is intended only for the use
> > of the addressee(s) named above. If you are not the intended recipient (or
> > the agent responsible to deliver it to the intended recipient), you are
> > hereby notified that any dissemination, distribution, use, or copying of
> > this communication is strictly prohibited. If you have received this email
> > in error, please immediately reply to sender, delete the message and
> > destroy all copies of it. If you have questions, please email
> > le...@answers.com.
> >
> > If you wish to unsubscribe to commercial emails from Answers and its
> > affiliates, please go to the Answers Subscription Center
> > http://campaigns.answers.com/subscriptions to opt out. Thank you.

--
*Shailesh Hemdev*
Manager, Software Engineering
shailesh.hem...@foresee.com
p (734) 352-6247
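
To check whether the NotLeaderForPartitionException errors quoted in this thread really stop once metadata catches up, it can help to tally them per partition and broker. Below is a minimal Python sketch of that idea. It assumes each entry occupies a single line in the real server.log (the quoted sample is wrapped by the mail client), and the names NOT_LEADER and count_not_leader_errors are hypothetical helpers made up for illustration, not part of Kafka.

```python
import re
from collections import Counter

# Regex for the ReplicaFetcherThread error quoted in the thread.
# Assumption: the real server.log keeps each entry on one line.
NOT_LEADER = re.compile(
    r"\[(?P<ts>[^\]]+)\] ERROR \[ReplicaFetcherThread-\d+-\d+\], "
    r"Error for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"to broker (?P<broker>\d+):\S*NotLeaderForPartitionException"
)


def count_not_leader_errors(lines):
    """Count NotLeaderForPartition errors per (topic, partition, broker).

    `lines` is any iterable of log lines, such as an open file handle.
    Lines that do not match the pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = NOT_LEADER.search(line)
        if match:
            key = (
                match["topic"],
                int(match["partition"]),
                int(match["broker"]),
            )
            counts[key] += 1
    return counts
```

Passing an open handle on /var/log/kafka/server.log to count_not_leader_errors returns a Counter, and its most_common() method lists the noisiest partitions first. If the counts for a partition keep growing long after a broker restart, that would point away from stale metadata and toward the Zookeeper connectivity question raised above.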