I think Mick and Anthony make some valid operational and skew points for smaller/starting clusters with num_tokens of 4. There’s an arbitrary line between small and large clusters, but I think most would agree that most clusters are on the small to medium side. (A small nuance is that, afaict, the probabilities have to do with quorum on a full token range, i.e. they depend on the size of a datacenter, not the full cluster.)
As I read this discussion I’m personally more inclined to go with 16 for now. It’s true that if we could fix the skew and topology gotchas for those starting things up, 4 would be ideal from an availability perspective. However, we’re still in the brainstorming stage for how to address those challenges. I think we should create tickets for those issues and go with 16 for 4.0. This is about an out-of-the-box experience. It balances availability, operations (such as skew, general bootstrap friendliness, and streaming/repair), and cluster sizing. Balancing all of those, I think for now I’m more comfortable with 16 as the default, with docs on considerations and tickets to unblock 4 as the default for all users.

On Feb 1, 2020, at 6:30 AM, Jeff Jirsa <jji...@gmail.com> wrote:

> On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch <joe.e.ly...@gmail.com> wrote:
>
>> I think that we might be bikeshedding this number a bit because it is easy
>> to debate and there is not yet one right answer.
>
> https://www.youtube.com/watch?v=v465T5u9UKo