Could be something like https://issues.apache.org/jira/browse/CASSANDRA-14358
Hard to say after the fact.

On Fri, Jul 19, 2019 at 8:49 AM Rahul Reddy <rahulreddy1...@gmail.com> wrote:
> Hello,
>
> We have 6 nodes each in two data centers, us-east-1 and us-west-2, with RF 3
> and consistency level set to LOCAL_QUORUM, using the gossiping snitch. All our
> instances are c5.2xlarge, and data files and commit logs are stored on gp2 EBS.
> The c5 instance type had a bug for which AWS asked us to set nvme_timeout to a
> higher number in /etc/grub.conf. After setting the parameter, we ran nodetool
> drain and rebooted the node in east.
>
> The instance came up, but Cassandra didn't come up normally; we had to start
> Cassandra manually. Cassandra came up, but it showed other instances as down,
> even though we didn't reboot those other nodes. The same was observed on one
> other node. How could that happen? We don't see any errors in system.log, which
> is set to INFO. Without any intervention, gossip settled in about 10 minutes
> and the entire cluster became normal.
>
> We tried the same thing in west, and it happened again.
>
> I'm concerned about how to check what caused this, and if a reboot happens
> again, how to avoid it. If I just stop Cassandra instead of rebooting, I don't
> see this issue.
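For reference, a sketch of the drain/stop/reboot sequence the thread describes. This is not an official procedure; the service name `cassandra`, the GRUB file paths, and the `grub2-mkconfig` invocation are assumptions that vary by distribution:

```shell
# Sketch only: service name, GRUB paths, and regeneration command are
# assumptions; adjust for your AMI/distribution.

# 1. Per AWS guidance for NVMe-backed instance types, raise the NVMe I/O
#    timeout via a kernel boot parameter. Append to the kernel command line
#    in /etc/default/grub (or /etc/grub.conf on older AMIs):
#      nvme_core.io_timeout=4294967295
#    then regenerate the GRUB config, e.g.:
# sudo grub2-mkconfig -o /boot/grub2/grub.cfg

# 2. Before rebooting, drain the node so it flushes memtables and stops
#    accepting client connections, then stop Cassandra cleanly so the OS
#    shutdown doesn't kill it mid-write:
nodetool drain
sudo systemctl stop cassandra

# 3. Reboot, then start Cassandra explicitly and watch gossip converge:
sudo reboot
# ...after the instance is back up:
sudo systemctl start cassandra
nodetool status        # per-node up/down view from this node
nodetool gossipinfo    # raw gossip state, useful for spotting stale entries
```

Draining and stopping before the reboot matches your observation that a plain STOP avoids the problem: it removes the window where a node dies mid-shutdown and its peers carry stale gossip state until failure detection catches up.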