We were only able to scale out four nodes before failures started occurring, including multiple instances of nodes joining the cluster without streaming any data.
Sigh.

On Tue, Jun 11, 2019 at 3:11 PM Carl Mueller <carl.muel...@smartthings.com> wrote:

> We had a three-DC (asia-tokyo/europe/us) cassandra 2.2.13 cluster on AWS,
> with IPV6.
>
> We needed to scale out the asia datacenter, which was 5 nodes; europe and
> us were 25 nodes each.
>
> We were running into bootstrapping issues where the new node failed to
> bootstrap/stream. It failed with
>
> "java.lang.RuntimeException: A node required to move the data consistently
> is down"
>
> ...even though all nodes were up based on nodetool status prior to adding
> the new node.
>
> First we increased the phi_convict_threshold to 12, and that did not help.
>
> CASSANDRA-12281 appeared similar to the problems we had, but I don't
> think we hit that. Somewhere in there someone wrote:
>
> "For us, the workaround is either deleting the data (then bootstrap
> again), or increasing the ring_delay_ms. And the larger the cluster is,
> the longer ring_delay_ms is needed. Based on our tests, for a 40 nodes
> cluster, it requires ring_delay_ms to be >50seconds. For a 70 nodes
> cluster, >100seconds. Default is 30seconds."
>
> Given the WAN nature of our DCs, we set ring_delay_ms to 100 seconds and
> it finally worked.
>
> Side note:
>
> During the rolling restarts for setting phi_convict_threshold we observed
> quite a lot of status-map variance between nodes (we have a program that
> polls every node in a datacenter or cluster for its view of gossipinfo
> and node statuses). AWS appears to have variable networking, per the
> phi_convict_threshold advice; I'm not sure whether our difficulties were
> typical in that regard, and/or whether our IPV6 and/or globally
> distributed datacenters were exacerbating factors.
>
> We could not reproduce this in loadtest, although loadtest is only eu and
> us (but is IPV6).
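For anyone hitting the same thing: the two knobs live in different places.
phi_convict_threshold is a cassandra.yaml setting, while ring_delay_ms is a
JVM startup system property (so on 2.2.x it goes into JVM_OPTS via
cassandra-env.sh), roughly like:

    # cassandra.yaml -- failure-detector sensitivity (default is 8)
    phi_convict_threshold: 12

    # cassandra-env.sh -- value is in milliseconds (100 seconds here)
    JVM_OPTS="$JVM_OPTS -Dcassandra.ring_delay_ms=100000"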
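Since the quoted mail mentions our status poller: it boils down to roughly
the sketch below (not our actual tool; it assumes nodetool is on the path and
JMX is reachable from where it runs, and the HOSTS list is a hypothetical
inventory you'd replace with your own). It asks each node for its own view of
cluster membership and prints any peer the nodes disagree about.

    #!/usr/bin/env python
    # Ask each node for its view of membership via `nodetool status`,
    # then report any peer whose state differs between nodes.
    import subprocess

    HOSTS = ["node1.example.com", "node2.example.com"]  # hypothetical

    def status_view(host):
        # Return {peer_address: state} as seen by `host`; states are the
        # two-letter codes nodetool prints (UN, UJ, DN, ...).
        out = subprocess.check_output(["nodetool", "-h", host, "status"])
        view = {}
        for line in out.decode().splitlines():
            parts = line.split()
            if len(parts) >= 2 and len(parts[0]) == 2 and parts[0][0] in "UD":
                view[parts[1]] = parts[0]
        return view

    views = dict((h, status_view(h)) for h in HOSTS)
    peers = set()
    for v in views.values():
        peers.update(v)
    for peer in sorted(peers):
        seen = dict((h, views[h].get(peer, "??")) for h in HOSTS)
        if len(set(seen.values())) > 1:
            print("disagreement on %s: %s" % (peer, seen))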