Is the schema in sync? Run nodetool describecluster. Check system.log for any corruption.
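A quick way to act on that advice is to count the distinct schema versions reported by `nodetool describecluster`: a healthy cluster shows exactly one. The sketch below parses a sample of that output (the cluster name, snitch, and IPs are illustrative, not from this thread):

```shell
# Minimal sketch: detect schema disagreement from `nodetool describecluster`
# output. In practice you would capture it with: out="$(nodetool describecluster)"
# The sample below is hypothetical.
out='Cluster Information:
	Name: prod
	Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
	Schema versions:
		86afa796-d883-3932-aa73-6b017cef0d19: [10.0.0.1, 10.0.0.2, 10.0.0.3]'

# Each schema version appears as a UUID followed by the nodes on that version.
# More than one UUID line means the schemas have diverged.
versions=$(printf '%s\n' "$out" | grep -cE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}:')
if [ "$versions" -eq 1 ]; then
  echo "schema in sync"
else
  echo "schema mismatch: $versions versions"
fi
```

Pairing that with a log scan (e.g. `grep -iE 'corrupt|error' /var/log/cassandra/system.log`) covers both checks suggested above.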
Regards,
Nitan
Cell: 510 449 9629

> On Jul 19, 2019, at 12:32 PM, ZAIDI, ASAD A <az1...@att.com> wrote:
>
> "aws asked to set nvme_timeout to higher number in etc/grub.conf."
>
> Did you ask AWS if setting a higher value is a real solution to the bug? Is there no patch available to address the bug? Just curious to know.
>
> From: Rahul Reddy [mailto:rahulreddy1...@gmail.com]
> Sent: Friday, July 19, 2019 10:49 AM
> To: user@cassandra.apache.org
> Subject: Rebooting one Cassandra node caused all the application nodes go down
>
> Hello,
>
> We have 6 nodes each in two data centers, us-east-1 and us-west-2. We have RF 3 and the consistency level set to LOCAL_QUORUM, with a gossip snitch. All our instances are c5.2xlarge, and data files and commit logs are stored on gp2 EBS. The C5 instance type had a bug for which AWS asked us to set nvme_timeout to a higher number in etc/grub.conf. After setting the parameter, we ran nodetool drain and rebooted the node in east.
>
> The instance came up, but Cassandra didn't come up normally; we had to start Cassandra manually. Cassandra came up, but it showed other instances as down. Even though we didn't reboot the other nodes, the same was observed on one other node. How could that happen? We don't see any errors in system.log, which is set to INFO. Without any intervention, gossip settled and in 10 minutes the entire cluster became normal.
>
> We tried the same thing in west, and it happened again.
>
> I'm concerned about how to check what caused it, and if a reboot happens again, how to avoid this.
> If I just STOP Cassandra instead of rebooting, I don't see this issue.
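For reference, the timeout change the thread describes is usually applied via the kernel parameter `nvme_core.io_timeout` rather than a literal `nvme_timeout` key, and on most modern distros it goes in /etc/default/grub rather than etc/grub.conf. The fragment below is a sketch based on AWS's published NVMe/EBS guidance, not on the exact instructions AWS gave in this case; the value 4294967295 is the documented maximum:

```shell
# Assumed approach: raise the NVMe I/O timeout via a kernel boot parameter.
# 1. Append the parameter to the kernel command line in /etc/default/grub:
#      GRUB_CMDLINE_LINUX="... nvme_core.io_timeout=4294967295"
# 2. Regenerate the grub config (path varies by distro/boot mode):
#      sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# 3. Reboot, then verify the running value:
#      cat /sys/module/nvme_core/parameters/io_timeout
```

Verifying step 3 after the reboot would also confirm whether the setting actually took effect on the node that misbehaved.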