This is one of our largest and, unfortunately, heaviest multi-tenant clusters, and our last 2.1 production cluster. We are seeing not-enough-replicas errors (need 2, only found 1) after bringing down only 1 node. It is a 90-node cluster, 30 nodes per DC, with DCs in Europe, Asia, and the US, all on AWS.
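For context, the "need 2" is just quorum arithmetic: if the app is reading/writing at LOCAL_QUORUM against the RF=3 keyspace (my assumption from the error text), the coordinator needs floor(3/2)+1 = 2 live local replicas, so "only found 1" means it considered two of the three replicas for that range down, not just the one we stopped. A minimal sketch of catching and logging that detail with the 3.x Java driver (contact point, keyspace, and query are placeholders):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.exceptions.UnavailableException;

public class UnavailableLogger {
    public static void main(String[] args) {
        // Placeholder contact point; substitute a real node.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // Placeholder query against the RF=3 keyspace.
            SimpleStatement stmt = new SimpleStatement(
                    "SELECT * FROM app_ks.some_table WHERE id = 1");
            // LOCAL_QUORUM on an RF=3 DC requires floor(3/2)+1 = 2 live local replicas.
            stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            try {
                session.execute(stmt);
            } catch (UnavailableException e) {
                // "need 2, only found 1" => coordinator saw 2 of the 3 local replicas as down.
                System.err.printf("Unavailable at %s: required=%d alive=%d%n",
                        e.getConsistencyLevel(),
                        e.getRequiredReplicas(),
                        e.getAliveReplicas());
            }
        }
    }
}
```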
Are there known bugs around erroneous gossip state in 2.1.9? I know system.peers and other issues can make gossip state detection a bit iffy, and AWS adds its own uncertainty. The Java driver is v3.7. The errors come primarily from one app, which happens to be the one without caching but with substantial query volume. Its keyspace is also RF=3, while many of the other apps are RF=5, which may be contributing as well.
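As a quick cross-check on the gossip-state theory, here is a rough sketch (contact point is a placeholder) that dumps what the 3.x driver itself thinks is up per DC, alongside the coordinator's system.peers view. If either disagrees with nodetool status while only one node is actually down, that would point at stale gossip/peers state rather than the app:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class HostStateDump {
    public static void main(String[] args) {
        // Placeholder contact point; substitute a real node.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // The driver's control-connection view of every host, with DC and up/down state.
            for (Host h : cluster.getMetadata().getAllHosts()) {
                System.out.printf("%s dc=%s rack=%s up=%b%n",
                        h.getAddress(), h.getDatacenter(), h.getRack(), h.isUp());
            }
            // The coordinator's view of its peers, as published via gossip.
            for (Row r : session.execute(
                    "SELECT peer, data_center, schema_version FROM system.peers")) {
                System.out.printf("peer=%s dc=%s schema=%s%n",
                        r.getInet("peer"),
                        r.getString("data_center"),
                        r.getUUID("schema_version"));
            }
        }
    }
}
```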