We have a exact idea of the amount of data we'll be storing, and the kinds of machines we'll be storing them on. The simple math of (total data we'll be storing 6 months from now) / (total capacity of a single node) * (number of duplicates of each datum we'd like to store for redundancy) gives us a number a concrete number of machines we'll need, whether we're using Riak or something else...
However, that's a little off track from the question I'm trying to answer. Which is: Why is it when I start two nodes with a large-ish ring_creation_size -- as soon as I join the second node to the first, CPU and memory usage of both nodes goes through the roof? No data stored yet. This even happens with a ring_creation_size I wouldn't consider huge, like 4096. So that is the small configuration I'm starting with and that is the limit I'm hitting. I realize that the feasibility of building out a single 1000 node cluster is a larger question. But I can tell you we'll get there; having to shard across 10 100-node clusters is an option. Regardless, I'd like to have an understanding of what the resource overhead of each vnode is... On Thursday, April 14, 2011 at 6:38 AM, Sean Cribbs wrote: Good points, Dave. > > Also, it's worth mentioning that we've seen that many customers and > open-source users think they will need many more nodes than they actually do. > Many are able to start with 5 nodes and are happy for quite a while. The only > way to tell what you actually need is to start with a baseline configuration > and simulate some percentage above your current load. Once you've figured out > what size that initial cluster is, start with (number of nodes) * 50 as the > ring_creation_size (rounded to the nearest power of 2 of course). This gives > you a growth factor of about 5 before you need to consider changing. > > As well, there's some ops "common sense" that says the lifetime of any single > architecture is 18 months or less. That doesn't necessarily mean that you'll > be building a new cluster with a larger ring size in 18 months, but just that > your needs will be different at that time and are hard to predict. Plan for > now, worry about the 1000 node cluster when you actually need it. > > Sean Cribbs <s...@basho.com> > Developer Advocate > Basho Technologies, Inc. > http://basho.com/ > > > On Apr 14, 2011, at 9:09 AM, Dave Barnes wrote: > > Sorry I feel compelled to chime in. > > > > Maybe you could assess your physical node limits and start with a small > > configuration, then increase it and increase it until you hit a limit. > > > > Work small to large. > > > > Once you find the pain point, lets us know what resource ran out. > > > > You will learn a lot along the way on how your servers behave and we'll > > discover a lot when you share the results. > > > > Thanks for digging in, > > > > Dave > > > > On Wed, Apr 13, 2011 at 5:11 PM, Greg Nelson <gro...@dropcam.com> wrote: > > > Ok, how about in this case I described? It runs out of memory with a > > > single pair of nodes... > > > > > > (Or did you mean there's a connection between each pair of vnodes?) > > > On Wednesday, April 13, 2011 at 1:56 PM, Jon Meredith wrote: > > > > Hi Greg et al, > > > > > > > > As you say largest known is not largest possible. Internally within > > > > Basho, the largest cluster we've experimented with so far had 50 nodes. > > > > > > > > Going beyond that it's speculation from me about pain points. > > > > > > > > 1) It is true that you need enough file descriptors to start up all > > > > partitions when a node restarts - Riak checks if there is any handoff > > > > data pending for each partition. We have work scheduled to address that > > > > in the medium term. The plan is to only spin up partitions the node > > > > owns and any that have been started as fallbacks that handoff has not > > > > completed for. Until that work is done you will need a high ulimit with > > > > large ring sizes. > > > > > > > > 2) It is also true that Erlang runs a fully connected network, so there > > > > will be connections between each node pair in the cluster. We haven't > > > > determined the point at which it becomes a problem. > > > > > > > > So it looks like you'll be pushing the known limits. Basho will do our > > > > very best to help overcome any obstacles as you encounter them. > > > > > > > > Jon Meredith > > > > Basho Technologies. > > > > > > > > On Wed, Apr 13, 2011 at 1:41 PM, Greg Nelson <gro...@dropcam.com> wrote: > > > > > The largest known riak cluster != the largest possible riak cluster. > > > > > ;-) > > > > > > > > > > The inter node communication of the cluster depends on the data set > > > > > and usage pattern, doesn't it? Or is there some constant overhead > > > > > that tops out at a few hundred nodes? I should point out that we'll > > > > > have big data, but not a huge number of keys. > > > > > > > > > > The number of vnodes in the cluster should be equal to the > > > > > ring_creation_size under normal circumstances, shouldn't it? So when > > > > > I have a one node cluster, that node is running ring_creation_size > > > > > vnodes... File descriptors probably isn't a problem -- these machines > > > > > won't be doing anything else, and the limits are set to 65536. > > > > > > > > > > Thinking about the internode communication you mentioned, that's > > > > > probably where the resource hog is.. socket buffers, etc. > > > > > > > > > > Anyway, I'd also love to hear more from basho. :) > > > > > On Wednesday, April 13, 2011 at 12:33 PM, sicul...@gmail.com wrote: > > > > > > Ill just chime in and say that this is not practical for a few > > > > > > reasons. The largest known riak cluster has like 50 or 60 nodes. > > > > > > Afaik, inter node communication of erlang clusters top out at a few > > > > > > hundred nodes. I'm also under the impression that each physical > > > > > > node has to have enough file descriptors to accommodate every > > > > > > virtual node in the cluster. > > > > > > > > > > > > I'd love to hear more from basho. > > > > > > > > > > > > -alexander > > > > > > > > > > > > > > > > > > Sent from my Verizon Wireless BlackBerry > > > > > > > > > > > > -----Original Message----- > > > > > > From: Greg Nelson <gro...@dropcam.com> > > > > > > Sender: riak-users-boun...@lists.basho.com > > > > > > Date: Wed, 13 Apr 2011 12:13:34 > > > > > > To: <riak-users@lists.basho.com> > > > > > > Subject: Large ring_creation_size > > > > > > > > > > > > _______________________________________________ > > > > > > riak-users mailing list > > > > > > riak-users@lists.basho.com > > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > > > > > > > > > > _______________________________________________ > > > > > riak-users mailing list > > > > > riak-users@lists.basho.com > > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > riak-users mailing list > > > riak-users@lists.basho.com > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > _______________________________________________ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > >
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com