> > Discussions here and on slack have brought up a number of important
> > concerns.
Sounds like we're letting the perfect be the enemy of the good. Is anyone
arguing that 256 is a better default than 16? Or is the fear that going to 16
now would make a default change in, say, 5.0 more painful?

On Tue, Feb 18, 2020 at 3:12 AM Ben Slater <ben.sla...@instaclustr.com> wrote:

> In case it helps move the decision along, we moved to 16 vnodes as the
> default in Nov 2018 and haven't looked back (many clusters from 3 to 100s
> of nodes later). The testing we did in making that decision is summarised
> here: https://www.instaclustr.com/cassandra-vnodes-how-many-should-i-use/
>
> Cheers
> Ben
>
> On Tue, 18 Feb 2020 at 18:44, Mick Semb Wever <m...@thelastpickle.com>
> wrote:
>
> > -1
> >
> > Discussions here and on slack have brought up a number of important
> > concerns. I think those concerns need to be summarised here before any
> > informal vote.
> >
> > It was my understanding that some of those concerns may even be blockers
> > to a move to 16. That is, we have to presume the worst-case scenario
> > where all tokens get randomly generated.
> >
> > Can we ask for some analysis and data against the risks different
> > num_tokens choices present.
> > We shouldn't rush into a new default, and such background information
> > and data is operator value added. Maybe I missed any info/experiments
> > that have happened?
> >
> > On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna,
> > <jeremy.hanna1...@gmail.com> wrote:
> >
> > > I just wanted to close the loop on this if possible. After some
> > > discussion in slack about various topics, I would like to see if
> > > people are okay with num_tokens=8 by default (as it's not much
> > > different operationally than 16). Joey brought up a few small changes
> > > that I can put on the ticket. It also requires some documentation for
> > > things like decommission order and skew.
> > >
> > > Are people okay with this change moving forward like this? If so,
> > > I'll comment on the ticket and we can move forward.
> > >
> > > Thanks,
> > >
> > > Jeremy
> > >
> > > On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad <j...@jonhaddad.com> wrote:
> > >
> > > > I think it's a good idea to take a step back and get a high-level
> > > > view of the problem we're trying to solve.
> > > >
> > > > First, high token counts result in decreased availability, as each
> > > > node has data overlap with more nodes in the cluster. Specifically,
> > > > a node can share data with (RF-1) * 2 * num_tokens other nodes. So
> > > > a 256-token cluster at RF=3 is almost always going to share data
> > > > with every other node in the cluster that isn't in the same rack,
> > > > unless you're doing something wild like using more than a thousand
> > > > nodes in a cluster. We advertise
> > > >
> > > > With 16 tokens that is vastly improved, but you still have up to 64
> > > > nodes each node needs to query against, so you're again hitting
> > > > every node unless you go above ~96 nodes in the cluster (assuming
> > > > 3 racks / AZs). I wouldn't use 16 here, and I doubt any of you
> > > > would either.
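[Editor's aside: Jon's overlap arithmetic above can be sanity-checked with a few lines. This is only the back-of-envelope upper bound he states, not a simulation of Cassandra's actual replica placement; the function name is made up for illustration.]

```python
def max_overlap(rf: int, num_tokens: int) -> int:
    """Upper bound on how many other nodes a node can share data with:
    each of its num_tokens ranges can involve up to 2 * (RF - 1)
    distinct neighbours on the ring (before and after each token)."""
    return 2 * (rf - 1) * num_tokens

for tokens in (4, 16, 256):
    print(tokens, max_overlap(rf=3, num_tokens=tokens))
# 4 -> 16, 16 -> 64, 256 -> 1024
```

This reproduces the figures in the thread: 64 potential neighbours at 16 tokens (hence "hitting every node" below ~96 nodes), and far more than any realistic cluster size at 256.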
> > > > I've advocated for 4 tokens because you'd have overlap with only
> > > > 16 nodes, which works well for small clusters as well as large.
> > > > Assuming I was creating a new cluster for myself (in a hypothetical
> > > > brand new application I'm building), I would put this in production.
> > > > I have worked with several teams where I helped them put 4-token
> > > > clusters in prod and it has worked very well. We didn't see any wild
> > > > imbalance issues.
> > > >
> > > > As Mick's pointed out, our current method of using random token
> > > > assignment for the default is problematic for 4 tokens. I fully
> > > > agree with this, and I think if we were to try to use 4 tokens,
> > > > we'd want to address this in tandem. We can discuss how to better
> > > > allocate tokens by default (something more predictable than
> > > > random), but I'd like to avoid the specifics of that for the sake
> > > > of this email.
> > > >
> > > > To Alex's point, repairs are problematic with lower token counts
> > > > due to overstreaming. I think this is a pretty serious issue and
> > > > we'd have to address it before going all the way down to 4. This,
> > > > in my opinion, is a more complex problem to solve, and I think
> > > > trying to fix it here could make shipping 4.0 take even longer,
> > > > something none of us want.
> > > >
> > > > For the sake of shipping 4.0 without adding extra overhead and
> > > > time, I'm OK with moving to 16 tokens, and in the process adding
> > > > extensive documentation outlining what we recommend for production
> > > > use. I think we should also try to figure out something better than
> > > > random as the default to fix the data imbalance issues. I've got a
> > > > few ideas here I've been noodling on.
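[Editor's aside: the skew that random token assignment causes at low token counts can be illustrated with a toy ring model. This is not Cassandra's allocator (which since 3.0 can place tokens deliberately via `allocate_tokens_for_keyspace`); `ownership_spread` is a made-up name, replication is ignored, and tokens are drawn uniformly at random.]

```python
import random

def ownership_spread(num_nodes: int, tokens_per_node: int, seed: int = 0) -> float:
    """Ratio of the most-loaded to least-loaded node's ring ownership
    when every token is chosen uniformly at random."""
    rng = random.Random(seed)
    ring = sorted((rng.random(), n) for n in range(num_nodes)
                  for _ in range(tokens_per_node))
    owned = [0.0] * num_nodes
    for i, (tok, node) in enumerate(ring):
        # A node owns the range from the previous token up to its token
        # (wrapping around the ring for the first entry).
        prev = ring[i - 1][0] if i else ring[-1][0] - 1.0
        owned[node] += tok - prev
    return max(owned) / min(owned)

for tokens in (4, 16, 256):
    print(tokens, round(ownership_spread(48, tokens), 2))
```

With few random tokens the max/min ownership ratio is large (wild imbalance); more tokens average the randomness out, which is why fewer tokens only work well with smarter-than-random allocation.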
> > > > As long as folks are fine with potentially changing the default
> > > > again in C* 5.0 (after another discussion / debate), 16 is enough
> > > > of an improvement that I'm OK with the change, and willing to
> > > > author the docs to help people set up their first cluster. For
> > > > folks that go into production with the defaults, we're at least not
> > > > setting them up for total failure once their clusters get large,
> > > > like we are now.
> > > >
> > > > In future versions, we'll probably want to address the issue of
> > > > data imbalance by building something in that shifts individual
> > > > tokens around. I don't think we should try to do this in 4.0
> > > > either.
> > > >
> > > > Jon
> > > >
> > > > On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna
> > > > <jeremy.hanna1...@gmail.com> wrote:
> > > >
> > > > > I think Mick and Anthony make some valid operational and skew
> > > > > points for smaller/starting clusters with 4 num_tokens. There's
> > > > > an arbitrary line between small and large clusters, but I think
> > > > > most would agree that most clusters are on the small to medium
> > > > > side. (A small nuance is, afaict, the probabilities have to do
> > > > > with quorum on a full token range, i.e. it has to do with the
> > > > > size of a datacenter, not the full cluster.)
> > > > >
> > > > > As I read this discussion I'm personally more inclined to go with
> > > > > 16 for now. It's true that if we could fix the skew and topology
> > > > > gotchas for those starting things up, 4 would be ideal from an
> > > > > availability perspective. However, we're still in the
> > > > > brainstorming stage for how to address those challenges. I think
> > > > > we should create tickets for those issues and go with 16 for 4.0.
> > > > >
> > > > > This is about an out-of-the-box experience.
> > > > > It balances availability, operations (such as skew and general
> > > > > bootstrap friendliness and streaming/repair), and cluster sizing.
> > > > > Balancing all of those, I think for now I'm more comfortable with
> > > > > 16 as the default, with docs on considerations and tickets to
> > > > > unblock 4 as the default for all users.
> > > > >
> > > > > > On Feb 1, 2020, at 6:30 AM, Jeff Jirsa <jji...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch
> > > > > > > <joe.e.ly...@gmail.com> wrote:
> > > > > > >
> > > > > > > I think that we might be bikeshedding this number a bit
> > > > > > > because it is easy to debate and there is not yet one right
> > > > > > > answer.
> > > > > >
> > > > > > https://www.youtube.com/watch?v=v465T5u9UKo
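[Editor's aside: the availability trade-off debated in this thread can be made concrete with a simplified model. The sketch below uses SimpleStrategy-style placement (next RF distinct nodes clockwise) on a randomly generated token ring, ignores racks/NetworkTopologyStrategy, and all names are illustrative. It computes the probability that losing two random nodes loses quorum for some token range at RF=3.]

```python
import random
from itertools import combinations
from math import comb

def replica_groups(num_nodes, tokens_per_node, rf=3, seed=0):
    """Distinct RF-node replica groups on a random-token ring."""
    rng = random.Random(seed)
    ring = sorted((rng.random(), n) for n in range(num_nodes)
                  for _ in range(tokens_per_node))
    groups = set()
    for i in range(len(ring)):
        owners, j = [], i
        while len(owners) < rf:          # next rf distinct nodes clockwise
            node = ring[j % len(ring)][1]
            if node not in owners:
                owners.append(node)
            j += 1
        groups.add(frozenset(owners))
    return groups

def p_pair_breaks_quorum(num_nodes, tokens_per_node, rf=3, seed=0):
    """Probability that two uniformly chosen failed nodes appear together
    in some replica group, i.e. some range loses quorum at RF=3."""
    groups = replica_groups(num_nodes, tokens_per_node, rf, seed)
    pairs = {p for g in groups for p in combinations(sorted(g), 2)}
    return len(pairs) / comb(num_nodes, 2)

print(round(p_pair_breaks_quorum(48, 4), 2),
      round(p_pair_breaks_quorum(48, 256), 2))
```

In this model a 48-node, 256-token cluster has essentially every pair of nodes co-replicating some range (any two failures break quorum somewhere), while at 4 tokens only a minority of pairs do, which is the availability argument Jon makes above for low token counts.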