> One question on nodetool ring, the "owns" refers to how many of the possible
> keys each node owns, not the actual node size correct?

Yes

> So you could technically have a load of 15gb, 60gb, and 15gb on a three node
> cluster, but if you have the tokens set correctly each would own 33.33%.

Yes

Possible reasons for the load to be unbalanced even though the tokens are
evenly distributed include:

* not running cleanup after moving tokens
* some very large rows, would only apply when RF > cluster size

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 18/08/2011, at 12:01 AM, David McNelis wrote:

> Well, I think what happened was that we had three tokens generated, 0, 567x,
> and 1134x... but the way that we read the comments in the yaml file, we just
> set the second two nodes with the initial token and left the token for the
> seed node blank. Then we started the seed node, started the other two, and
> then the seed node took on the 613x token.
>
> One question on nodetool ring, the "owns" refers to how many of the possible
> keys each node owns, not the actual node size correct? So you could
> technically have a load of 15gb, 60gb, and 15gb on a three node cluster, but
> if you have the tokens set correctly each would own 33.33%.
>
> Thanks.
>
> On Tue, Aug 16, 2011 at 3:33 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Yes, that looks about right.
>
> Totally baffled how the wiki script could spit out those tokens for a
> 3-node cluster.
>
> On Tue, Aug 16, 2011 at 2:04 PM, David McNelis
> <dmcne...@agentisenergy.com> wrote:
> > Currently we have the initial_token for the seed node blank, and then the
> > three tokens we ended up with are:
> > 56713727820156410577229101238628035242
> > 61396109050359754194262152792166260437
> > 113427455640312821154458202477256070485
> > I would assume that we'd want to take the node that
> > is 61396109050359754194262152792166260437 and move it to 0, yes?
> > In theory that should largely balance out our data... or am I missing
> > something there?
> > On Tue, Aug 16, 2011 at 1:54 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>
> >> what tokens did you end up using?
> >>
> >> are you sure it's actually due to different amounts of rows? have you
> >> run cleanup and compact to make sure it's not unused data / obsolete
> >> replicas taking up the space?
> >>
> >> On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
> >> <dmcne...@agentisenergy.com> wrote:
> >> > We are currently running a three node cluster where we assigned the
> >> > initial tokens using the Python script that is in the Wiki, and we're
> >> > currently using the Random Partitioner, RF=1, Cassandra 0.8 from the
> >> > Riptano RPM... however we're seeing one node take on over 60% of the
> >> > data as we load data.
> >> > Our keys are sequential, and can range from 0 to 2^64, though in
> >> > practice we're between 1 and 2,000,000,000, with the current max
> >> > around 50,000. In order to balance out the load would we be best
> >> > served changing our tokens to make the top and bottom 1/3rd of the
> >> > node go to the previous and next nodes respectively, then running
> >> > nodetool move?
> >> > Even if we do that, it would seem that we'd likely continue to run
> >> > into this sort of issue as we add additional data... would we be
> >> > better served with a different Partitioner strategy? Or will we need
> >> > to very actively manage our tokens to avoid getting into an
> >> > unbalanced situation?
> >> >
> >> > --
> >> > David McNelis
> >> > Lead Software Engineer
> >> > Agentis Energy
> >> > www.agentisenergy.com
> >> > o: 630.359.6395
> >> > c: 219.384.5143
> >> > A Smart Grid technology company focused on helping consumers of energy
> >> > control an often under-managed resource.
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of DataStax, the source for professional Cassandra support
> >> http://www.datastax.com
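
For anyone reading this thread later, the token arithmetic discussed above can be checked with a short sketch. It is not the wiki script itself. It assumes the RandomPartitioner token space is 2**127 wide and that each node owns the range from the previous token (exclusive) up to its own token (inclusive), wrapping around the ring. The names `RING_SIZE`, `balanced_tokens`, and `ownership` are made up for this illustration.

```python
# Illustrative sketch only; helper names are hypothetical, not Cassandra APIs.
RING_SIZE = 2 ** 127  # assumed width of the RandomPartitioner token space


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Map each token to the fraction of the ring it owns.

    Assumes a node owns (previous_token, own_token], wrapping at the ring end.
    """
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return {ordered[0]: 1.0}
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index 0 wraps around to the last token
        shares[token] = ((token - previous) % RING_SIZE) / RING_SIZE
    return shares


if __name__ == "__main__":
    # The three tokens reported earlier in the thread.
    observed = [
        56713727820156410577229101238628035242,
        61396109050359754194262152792166260437,
        113427455640312821154458202477256070485,
    ]
    for label, tokens in (("observed", observed), ("balanced", balanced_tokens(3))):
        print(label)
        for token, share in sorted(ownership(tokens).items()):
            print(f"  {token}: {share:.2%}")
```

Under those assumptions, the observed tokens give the node at 567x about two thirds of the ring, the node at 613x under 3%, and the node at 1134x about 31%. That is consistent with one node holding over 60% of the data. Moving 613x to 0 gives each node one third.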