edit: 4 is bad at small cluster sizes and could scare off adoption

On Fri, Jan 31, 2020 at 12:15 PM Carl Mueller <carl.muel...@smartthings.com> wrote:
> "large/giant clusters and admins are the target audience for the value we
> select"
>
> There are reasons aside from massive scale to pick cassandra, but the
> primary reason cassandra is selected technically is to support vertically
> scaling to large clusters.
>
> Why pick a value that once you reach scale you need to switch token count?
> It's still a ticking time bomb, although 16 won't be what 256 is.
>
> Hmmmm. But 4 is bad and could scare off adoption.
>
> Ultimately a well-written article on operations and how to transition from
> 16 --> 4 and at what point that is a good idea (aka not when your cluster
> is too big) should be a critical part of this.
>
> On Fri, Jan 31, 2020 at 11:45 AM Michael Shuler <mich...@pbandjelly.org>
> wrote:
>
>> On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:
>> > one corollary of the way the algorithm works (or more
>> > precisely might not work) with multiple seeds or simultaneous
>> > multi-node bootstraps or decommissions, is that a lot of dtests
>> > start failing due to deterministic token conflicts. I wasn't
>> > able to fix that by changing solely ccm and the dtests
>> I appreciate all the detailed discussion. For a little historic context,
>> since I brought up this topic in the contributors zoom meeting, unstable
>> dtests was precisely the reason we moved the dtest configurations to
>> 'num_tokens: 32'. That value has been used in CI dtest since something
>> like 2014, when we found that this helped stabilize a large segment of
>> flaky dtest failures. No real science there, other than "this hurts less."
>>
>> I have no real opinion on the suggestions of using 4 or 16, other than I
>> believe most "default config using" new users are starting with smaller
>> numbers of nodes. The small-but-growing users and veteran large cluster
>> admins should be gaining more operational knowledge and be able to
>> adjust their own config choices according to their needs (and good
>> comment suggestions in the yaml). Whatever default config value is
>> chosen for num_tokens, I think it should suit the new users with smaller
>> clusters. The suggestion Mick makes that 16 makes a better choice for
>> small numbers of nodes, well, that would seem to be the better choice
>> for those users we are trying to help the most with the default.
>>
>> I fully agree that science, maths, and support/ops experience should
>> guide the choice, but I don't believe that large/giant clusters and
>> admins are the target audience for the value we select.
>>
>> --
>> Kind regards,
>> Michael