I think lowering the number of tokens is a great idea! Like Jon, I have seen repair performance improve whenever I have reduced the number of tokens for clients.
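
For reference, here is a minimal sketch of how the proposal in this thread would look in cassandra.yaml. The values are the ones put forward below (num_tokens of 4 and allocate_tokens_for_local_replication_factor of 3); treat it as an illustration of the proposal, not as what ships in any release.

```yaml
# Sketch of the defaults proposed in this thread, not a shipped default.
num_tokens: 4
# Consulted by the new token allocation algorithm when a node bootstraps.
allocate_tokens_for_local_replication_factor: 3
```
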
I am concerned that the proposed default value for num_tokens is too low. If you set up a cluster using the proposed defaults, you will get a balanced cluster. However, if you decommission nodes you will start to see large imbalances, especially in small clusters (< 20 nodes). This is because the allocate_tokens_for_local_replication_factor setting is only applied during the bootstrap process, so the token ranges are not rebalanced when a node leaves.

I have recommended very low values for num_tokens to clients. I did so because it was very unlikely that they would reduce their cluster size, and I warned them of the caveats of using a small value for num_tokens.

The proposed num_tokens default is fine for devs and operators who know what they are doing. However, the general Cassandra community will be unaware of the potential issue with such a low value. We should consider a default for num_tokens between 16 and 32. This will at least reduce the severity of the imbalance when decommissioning a node, whilst still providing the benefits of a low number of tokens. In addition, we can add a comment to num_tokens saying that clusters of over 100 nodes (per datacenter) should consider reducing it to 4.

Cheers,
Anthony

On Fri, 31 Jan 2020 at 01:58, Jon Haddad <j...@jonhaddad.com> wrote:

> Larger clusters are where high token counts do the most damage. That's why
> it's such a problem. You start out with a small cluster using 256, as you
> grow into the hundreds it becomes more and more unstable.
>
> On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
> <onmstes...@zoho.com.invalid> wrote:
>
> > Shouldn't we consider the cluster size to configure num_tokens?
> >
> > For example is it OK to use num_tokens=4 for a cluster of more than 100
> > nodes?
> >
> > Another question that is not so much relevant to this:
> >
> > When we use the token assignment algorithm (the new/non-random one) for a
> > specific keyspace, why should we use initial token for all the seeds, isn't
> > one seed enough and then just set the keyspace for all other nodes?
> >
> > Also I do not understand why we should consider rack topology and number
> > of racks for configuration of num_tokens?
> >
> > Sent using https://www.zoho.com/mail/
> >
> > ---- On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna
> > <jeremy.hanna1...@gmail.com> wrote ----
> >
> > The new default wouldn't be retroactively set for 3.x, but the same
> > principles apply. The new algorithm is in 3.x as well as the
> > simplification of the configuration. So no reason not to use the same
> > configuration on 3.x.
> >
> > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek
> > > <mailto:dchen...@amazon.com.INVALID> wrote:
> > >
> > > Does the same guidance apply to 3.x clusters? I read through the JIRA
> > > ticket linked below, along with tickets that it links to, but it's not
> > > clear that the new allocation algorithm is available in 3.x or if there
> > > are other reasons that this would be problematic.
> > >
> > > Thanks,
> > >
> > > Derek
> > >
> > > On 1/29/20, 9:54 AM, "Jon Haddad" <mailto:j...@jonhaddad.com> wrote:
> > >
> > > I've put a lot of my previous clients on 4 tokens, all of which have
> > > resulted in a major improvement.
> > >
> > > I wouldn't use any more than 4 except under some pretty unusual
> > > circumstances.
> > >
> > > Jon
> > >
> > > On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead
> > > <mailto:b...@instaclustr.com> wrote:
> > >
> > >> +1 to reducing the number of tokens as low as possible for
> > >> availability issues. 4 lgtm
> > >>
> > >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi
> > >> <mailto:djo...@apache.org> wrote:
> > >>
> > >>> Thanks for restarting this discussion Jeremy. I personally think 4
> > >>> is a good number as a default. I think whatever we pick, we should
> > >>> have enough documentation for operators to make sense of the new
> > >>> defaults in 4.0.
> > >>>
> > >>> Dinesh
> > >>>
> > >>>> On Jan 28, 2020, at 9:25 PM, Jeremy Hanna
> > >>>> <mailto:jeremy.hanna1...@gmail.com> wrote:
> > >>>>
> > >>>> I wanted to start a discussion about the default for num_tokens
> > >>>> that we'd like for people starting in Cassandra 4.0. This is for
> > >>>> ticket CASSANDRA-13701
> > >>>> <https://issues.apache.org/jira/browse/CASSANDRA-13701>
> > >>>> (which has been duplicated a number of times, most recently by me).
> > >>>>
> > >>>> TLDR, based on availability concerns, skew concerns, operational
> > >>>> concerns, and based on the fact that the new allocation algorithm
> > >>>> can be configured fairly simply now, this is a proposal to go with
> > >>>> 4 as the new default and the
> > >>>> allocate_tokens_for_local_replication_factor set to 3. That gives a
> > >>>> good experience out of the box for people and is the most
> > >>>> conservative. It does assume that racks and DCs have been
> > >>>> configured correctly. We would, of course, go into some detail in
> > >>>> the NEWS.txt.
> > >>>>
> > >>>> Joey Lynch and Josh Snyder did an extensive analysis of
> > >>>> availability concerns with high num_tokens/virtual nodes in their
> > >>>> paper
> > >>>> <http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
> > >>>> This worsens as clusters grow larger. I won't quote the paper here
> > >>>> but in order to have a conservative default and with the
> > >>>> accompanying new allocation algorithm, I think it makes sense as a
> > >>>> default.
> > >>>>
> > >>>> The difficulties have always been that virtual nodes have been
> > >>>> beneficial for operations but that 256 is too high for the purposes
> > >>>> of repair and as Joey and Josh cover, for availability. Going lower
> > >>>> with the original allocation algorithm has produced skew in
> > >>>> allocation in its naive distribution. Enter CASSANDRA-7032
> > >>>> <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new
> > >>>> token allocation algorithm. CASSANDRA-15260
> > >>>> <https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the
> > >>>> new algorithm operationally simpler.
> > >>>>
> > >>>> One other item of note - since Joey and Josh's analysis, there have
> > >>>> been improvements in streaming and other considerations that can
> > >>>> reduce the probability of more than one node representing some
> > >>>> token range being unavailable, but it would still be good to be
> > >>>> conservative.
> > >>>>
> > >>>> Please chime in with any concerns with having num_tokens=4 and
> > >>>> allocate_tokens_for_local_replication_factor=3 and the accompanying
> > >>>> rationale so we can improve the experience for all users.
> > >>>>
> > >>>> Other resources:
> > >>>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> > >>>> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> > >>>> https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: mailto:dev-unsubscr...@cassandra.apache.org
> > >>> For additional commands, e-mail: mailto:dev-h...@cassandra.apache.org
> > >>
> > >> --
> > >>
> > >> Ben Bromhead
> > >>
> > >> Instaclustr | www.instaclustr.com | @instaclustr
> > >> <http://twitter.com/instaclustr> | (650) 284 9692