I wanted to start a discussion about the default for num_tokens that we'd like 
for people starting in Cassandra 4.0.  This is for ticket CASSANDRA-13701 
<https://issues.apache.org/jira/browse/CASSANDRA-13701> (which has been 
duplicated a number of times, most recently by me).

TL;DR: based on availability, skew, and operational concerns, and because the 
new allocation algorithm can now be configured fairly simply, this is a 
proposal to make num_tokens=4 the new default, with 
allocate_tokens_for_local_replication_factor set to 3.  That gives people a 
good experience out of the box and is the most conservative choice.  It does 
assume that racks and DCs have been configured correctly.  We would, of course, 
go into some detail in NEWS.txt.

Joey Lynch and Josh Snyder did an extensive analysis of availability concerns 
with high num_tokens/virtual nodes in their paper 
<http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
  The availability problem worsens as clusters grow larger.  I won't quote the 
paper here, but given the goal of a conservative default and the accompanying 
new allocation algorithm, I think num_tokens=4 makes sense as the default.
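
To give a feel for the availability argument, here is a rough sketch in 
Python.  It is not the model from Joey and Josh's paper, just a toy under 
loud assumptions: tokens are picked uniformly at random, replicas are the next 
rf distinct nodes clockwise on the ring, and racks and DCs are ignored.  The 
function name average_neighbors is made up for this example.  It counts how 
many other nodes share at least one replica set with a given node; the more 
such neighbors a node has, the more pairs of simultaneous node failures can 
take some range below quorum.

```python
import random


def average_neighbors(num_nodes, num_tokens, rf=3, seed=0):
    """Average number of distinct peers that share a replica set with a node.

    Toy model (assumption): random tokens, rack-unaware placement where the
    replicas of a range are the next `rf` distinct nodes clockwise.
    """
    if num_nodes < rf:
        raise ValueError("need at least rf nodes")
    rng = random.Random(seed)
    # Build the ring: every node draws num_tokens random positions.
    ring = sorted(
        (rng.random(), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owners = [node for _, node in ring]
    total = len(owners)
    neighbors = [set() for _ in range(num_nodes)]
    for start in range(total):
        # Walk clockwise until rf distinct nodes have been collected.
        replicas = []
        i = start
        while len(replicas) < rf:
            node = owners[i % total]
            if node not in replicas:
                replicas.append(node)
            i += 1
        # Every pair of nodes in this replica set are neighbors.
        for a in replicas:
            for b in replicas:
                if a != b:
                    neighbors[a].add(b)
    return sum(len(s) for s in neighbors) / num_nodes


if __name__ == "__main__":
    for vnodes in (1, 4, 16, 256):
        print(vnodes, average_neighbors(60, vnodes))
```

In this toy model the neighbor count climbs toward "every other node in the 
cluster" as num_tokens grows, which is the intuition behind preferring a small 
default.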

The difficulty has always been that virtual nodes are beneficial for 
operations, but 256 is too high for the purposes of repair and, as Joey and 
Josh cover, for availability.  Going lower with the original allocation 
algorithm produces skewed ownership because it picks tokens at random.  Enter 
CASSANDRA-7032 <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the 
new token allocation algorithm.  CASSANDRA-15260 
<https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new algorithm 
operationally simpler.
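
The skew from random token selection at low num_tokens can be sketched with a 
small Python simulation.  This is only an illustration under assumptions: it 
measures primary-range ownership (not replica-aware ownership), uses a 2**64 
ring like the Murmur3 partitioner's token space, and does not implement the 
CASSANDRA-7032 algorithm at all.  The function name ownership_skew is made up 
for this example.

```python
import random

RING_SIZE = 2**64  # size of the token space (assumption: Murmur3-like ring)


def ownership_skew(num_nodes, num_tokens, seed=0):
    """Ratio of the largest primary ownership to the ideal even share.

    Every node draws num_tokens tokens uniformly at random; a token owns the
    range from the previous token (exclusive) up to itself.
    """
    rng = random.Random(seed)
    tokens = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # Start from the last token shifted back one full ring to handle wraparound.
    prev = tokens[-1][0] - RING_SIZE
    for token, node in tokens:
        owned[node] += token - prev
        prev = token
    ideal = RING_SIZE / num_nodes
    return max(owned) / ideal


if __name__ == "__main__":
    for vnodes in (4, 16, 256):
        runs = [ownership_skew(12, vnodes, seed=s) for s in range(10)]
        print(vnodes, round(sum(runs) / len(runs), 2))
```

A ratio of 1.0 would mean perfectly even ownership.  With random tokens the 
most loaded node sits well above that at num_tokens=4 and only approaches it 
at high token counts, which is why a low default depends on the new 
allocation algorithm.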

One other item of note: since Joey and Josh's analysis, improvements in 
streaming and in other areas can reduce the probability that more than one 
replica of a given token range is unavailable at the same time, but it would 
still be good to be conservative.

Please chime in with any concerns with having num_tokens=4 and 
allocate_tokens_for_local_replication_factor=3 and the accompanying rationale 
so we can improve the experience for all users.

Other resources:
https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
