Schema matches, and no corruption errors in system.log.
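For reference, this is roughly how I checked both (a sketch; the log path is the common default and may differ on your install):

    # All nodes should appear under a single schema version UUID
    nodetool describecluster

    # Scan the system log for corruption-related messages
    grep -iE 'corrupt|exception' /var/log/cassandra/system.log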
On Fri, Jul 19, 2019, 1:33 PM Nitan Kainth <nitankai...@gmail.com> wrote:

> Do you see schema in sync? Nodetool describecluster.
>
> Check system log for any corruption.
>
> Regards,
> Nitan
> Cell: 510 449 9629
>
> On Jul 19, 2019, at 12:32 PM, ZAIDI, ASAD A <az1...@att.com> wrote:
>
> “aws asked to set nvme_timeout to higher number in etc/grub.conf.”
>
> Did you ask AWS if setting a higher value is a real solution to the bug? Is there no patch available to address it? Just curious to know.
>
> *From:* Rahul Reddy [mailto:rahulreddy1...@gmail.com]
> *Sent:* Friday, July 19, 2019 10:49 AM
> *To:* user@cassandra.apache.org
> *Subject:* Rebooting one Cassandra node caused all the application nodes to go down
>
> Hello,
>
> We have 6 nodes in each of 2 data centers, us-east-1 and us-west-2. We have RF 3 and consistency level set to LOCAL_QUORUM, with the gossiping snitch. All our instances are c5.2xlarge, and data files and commit logs are stored on gp2 EBS. The C5 instance type had a bug for which AWS asked us to set nvme_timeout to a higher number in /etc/grub.conf. After setting the parameter, we ran nodetool drain and rebooted the node in east.
>
> The instance came up, but Cassandra didn't come up normally; we had to start it manually. Cassandra came up, but it showed the other instances as down. Even though we didn't reboot those nodes, the same was observed on one other node. How could that happen? We don't see any errors in system.log, which is set to INFO.
>
> Without any intervention, gossip settled in 10 minutes and the entire cluster became normal.
>
> We tried the same thing in west and it happened again.
>
> I'm concerned about how to check what caused this, and how to avoid it if a reboot happens again.
>
> If I just STOP Cassandra instead of rebooting, I don't see this issue.
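>
> In case it helps, the sequence we followed was roughly this (a sketch; it assumes Cassandra runs under systemd with the service name "cassandra", which may differ per install):
>
>     # Flush memtables and stop accepting traffic before shutdown
>     nodetool drain
>
>     # Stop the service cleanly, then reboot the instance
>     sudo systemctl stop cassandra
>     sudo reboot
>
>     # After the node is back, start Cassandra and watch for all nodes to return to UN
>     sudo systemctl start cassandra
>     nodetool status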