Hi Cheng,

Are all machines configured with NTP, and are all their clocks in sync? If not, fix that first.
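As a rough way to check this, here is a minimal Python sketch that flags nodes whose clock offset is far from the rest of the cluster. It assumes you have already collected one offset per node in milliseconds (for example, from whatever your NTP client reports on each machine). The node names, the sample values, and the 100 ms threshold are all made up for illustration; this is not a Cassandra or NTP API.

```python
from statistics import median


def find_skewed_nodes(offsets_ms, threshold_ms=100.0):
    """Return {node: deviation_ms} for nodes whose clock offset deviates
    from the cluster median by more than threshold_ms.

    offsets_ms: dict mapping a node name to its measured clock offset in
    milliseconds. How the offsets are collected is left to the caller.
    threshold_ms: illustrative default only, not a Cassandra recommendation.
    """
    if not offsets_ms:
        return {}
    # Use the median as the reference so a single bad clock does not
    # drag the reference point toward itself.
    reference = median(offsets_ms.values())
    return {
        node: offset - reference
        for node, offset in offsets_ms.items()
        if abs(offset - reference) > threshold_ms
    }


# Hypothetical sample data: node-c is roughly 2.4 seconds ahead of the others.
sample = {"node-a": 1.5, "node-b": -0.8, "node-c": 2450.0, "node-d": 0.3}
skewed = find_skewed_nodes(sample)
print(skewed)
```

With the sample data above, only node-c is flagged. Comparing against the median rather than the mean keeps one badly skewed node from hiding itself.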
If your clocks are not in sync, you can get odd behaviour like what you are seeing, as well as schema disagreements and, in some cases, corrupted data.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo <http://linkedin.com/in/carlosjuzarterolo>
Tel: 1649
www.pythian.com

On Tue, Feb 10, 2015 at 3:40 AM, Cheng Ren <cheng....@bloomreach.com> wrote:

> Hi,
> We have a two-dc cluster with 21 nodes and 27 nodes in each DC. Over the
> past few months, we have seen nodetool status marks 4-8 nodes down while
> they are actually functioning. Particularly today we noticed that running
> nodetool status on some nodes shows higher number of nodes are down than
> before while they are actually up and serving requests.
> For example, on one node it shows 42 nodes are down.
>
> phi_convict_threshold of all nodes are set as 12, and we are running
> cassandra 2.0.4 on AWS EC2 machines.
>
> Does anyone have recommendation on identifying the root cause of this?
> Will this cause any consequences?
>
> Thanks,
> Cheng