@Eric: Yes, I have noticed 3GB to 5GB of swap in use, out of 32GB, on the servers. If I explicitly resend the rejected mutations, that may create a loop in which the mutations are rejected again and again. How can I handle that? How did you handle it? Am I understanding this correctly?

@Josh: On one of the ZooKeeper hosts, I was using the same drive to store both the ZooKeeper data and the Hadoop DataNode data. I have changed it to use the same drive as the other hosts do. I hope this will resolve the ZooKeeper issue. Let's see.

BTW, here is my zoo.cfg
clientPort=2181
dataDir=/usr/local/zookeeper/data/
syncLimit=5
tickTime=2000
initLimit=10
maxClientCnxn=100
server.1=orkash1:2888:3888
server.2=orkash2:2888:3888
server.3=orkash3:2888:3888
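
As a rough cross-check of the settings above, here is a minimal Python sketch that parses the same zoo.cfg text and derives the timing limits from tickTime. The helper names (parse_cfg, unknown_keys, derived_timeouts_ms) and the KNOWN_KEYS set are illustrative assumptions, and the set is deliberately incomplete. The session timeout bounds use ZooKeeper's documented defaults of 2 and 20 ticks when minSessionTimeout and maxSessionTimeout are not set.

```python
# Sketch only: parse the zoo.cfg shown above and derive its timing limits.
ZOO_CFG = """\
clientPort=2181
dataDir=/usr/local/zookeeper/data/
syncLimit=5
tickTime=2000
initLimit=10
maxClientCnxn=100
server.1=orkash1:2888:3888
server.2=orkash2:2888:3888
server.3=orkash3:2888:3888
"""

# Assumption: a small, incomplete subset of documented ZooKeeper option names.
KNOWN_KEYS = {
    "clientPort", "dataDir", "dataLogDir", "tickTime", "initLimit",
    "syncLimit", "maxClientCnxns", "minSessionTimeout", "maxSessionTimeout",
}


def parse_cfg(text):
    """Return a dict of key/value pairs, skipping blank lines and comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def unknown_keys(cfg):
    """List keys that are neither in KNOWN_KEYS nor server.N entries."""
    return sorted(
        k for k in cfg if k not in KNOWN_KEYS and not k.startswith("server.")
    )


def derived_timeouts_ms(cfg):
    """Convert tick-based limits into milliseconds."""
    tick = int(cfg["tickTime"])
    return {
        "init_limit_ms": tick * int(cfg["initLimit"]),
        "sync_limit_ms": tick * int(cfg["syncLimit"]),
        "min_session_timeout_ms": int(cfg.get("minSessionTimeout", 2 * tick)),
        "max_session_timeout_ms": int(cfg.get("maxSessionTimeout", 20 * tick)),
    }


cfg = parse_cfg(ZOO_CFG)
print(unknown_keys(cfg))
print(derived_timeouts_ms(cfg))
```

With this file, the check flags maxClientCnxn, since the documented option name is maxClientCnxns, so that line may be worth double-checking. The derived maximum session timeout is 40 seconds, which is above the 30 second timeout discussed in this thread.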

Thanks a lot
Mohit Kaushik


On 12/24/2015 12:47 AM, Josh Elser wrote:
Eric Newton wrote:

Failure to talk to zookeeper is *really* unexpected.

Have you noticed your nodes using any significant swap?

Emphasis on this. Failing to connect to ZooKeeper for 60s (2*30) is a very long time (although I think I have seen longer JVM GC pauses before).

A couple of generic ZooKeeper questions:

1. Can you share your zoo.cfg?

2. Make sure that ZooKeeper has a "dedicated" drive for its dataDir. HDFS DataNodes using the same drive that ZooKeeper uses for its transaction log can starve ZooKeeper of I/O throughput. A normal "spinning" disk is also better for ZK than an SSD (last I read).

3. Check OS/host level metrics on these ZooKeeper hosts during the times you see these failures.

4. Consider moving your ZooKeeper hosts to "less busy" nodes if you can. You can consider adding more ZooKeeper hosts to the quorum, but keep in mind that this will increase the minimum latency for ZooKeeper operations, since more nodes (a majority, n/2 + 1) need to acknowledge each update.
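
The n/2 + 1 figure in point 4 uses integer division. A small Python sketch of that arithmetic for a few ensemble sizes follows; the function names are illustrative only and are not part of any ZooKeeper API.

```python
def majority(n):
    """Smallest number of servers that forms a majority of n servers."""
    return n // 2 + 1


def tolerated_failures(n):
    """Servers that can be down while a majority is still reachable."""
    return n - majority(n)


for n in (3, 5, 7):
    print(n, majority(n), tolerated_failures(n))
# prints:
# 3 2 1
# 5 3 2
# 7 4 3
```

Going from 3 to 5 hosts raises the number of acknowledgements needed per update from 2 to 3, which is the latency trade-off mentioned above, while the ensemble can then survive 2 failed hosts instead of 1.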



