> One question on nodetool ring, the "owns" refers to how many of the possible 
> keys each node owns, not the actual node size correct?
yes

> So you could technically have a load of 15gb, 60gb, and 15gb on a three node 
> cluster, but if you have the tokens set correctly each would own 33.33%.
Yes
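
To make the "owns" column concrete, here is a minimal sketch of how the percentages follow from the tokens alone. It assumes the RandomPartitioner token space is 0 to 2**127 and that each node owns the range from the previous token (exclusive) up to its own token (inclusive); the ownership() helper is a hypothetical name for illustration, not part of nodetool.

```python
# Assumption: RandomPartitioner tokens live on a ring of size 2**127.
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Return {token: fraction of the ring owned} for the given node tokens."""
    ordered = sorted(tokens)
    if len(ordered) == 1:
        # A single node owns the whole ring.
        return {ordered[0]: 1.0}
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        # The modulo handles the wrap-around range that crosses zero.
        owned[token] = ((token - previous) % RING_SIZE) / RING_SIZE
    return owned


# The three tokens reported earlier in this thread.
tokens = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]

for token, share in ownership(tokens).items():
    print(f"{token}: {share:.2%}")
```

With those tokens the first node's range wraps around zero and covers roughly two thirds of the ring, which is consistent with one node taking on over 60% of the data. None of this looks at on-disk size, which is why "owns" and "load" can disagree.
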
Possible reasons for the load to be unbalanced even though the tokens are 
evenly distributed include:
* not running cleanup after moving tokens
* some very large rows (this would only apply when RF < cluster size)
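
For reference, evenly distributed tokens can be computed along these lines. This is only a sketch of that kind of calculation, not the wiki script itself; it assumes the RandomPartitioner token space of 2**127, and balanced_tokens() is a hypothetical helper name.

```python
# Assumption: RandomPartitioner tokens live on a ring of size 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count tokens spaced evenly around the ring, starting at 0."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


# One initial_token value per node for a three node cluster.
for token in balanced_tokens(3):
    print(token)
```

For three nodes this gives 0 plus the 567x and 1134x values quoted below, so each token should be set explicitly on one node, the seed node included.
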
 
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 18/08/2011, at 12:01 AM, David McNelis wrote:

> Well, I think what happened was that we had three tokens generated, 0, 567x, 
> and 1134x... but the way that we read the comments in the yaml file, we just 
> set the second two nodes with the initial token and left the token for the 
> seed node blank.  Then we started the seed node, started the other two, and 
> then the seed node took on the 613x token.
> 
> One question on nodetool ring, the "owns" refers to how many of the possible 
> keys each node owns, not the actual node size correct?  So you could 
> technically have a load of 15gb, 60gb, and 15gb on a three node cluster, but 
> if you have the tokens set correctly each would own 33.33%.
> 
> Thanks.
> 
> On Tue, Aug 16, 2011 at 3:33 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Yes, that looks about right.
> 
> Totally baffled how the wiki script could spit out those tokens for a
> 3-node cluster.
> 
> On Tue, Aug 16, 2011 at 2:04 PM, David McNelis
> <dmcne...@agentisenergy.com> wrote:
> > Currently we have the initial_token for the seed node blank, and then the
> > three tokens we ended up with are:
> > 56713727820156410577229101238628035242
> > 61396109050359754194262152792166260437
> > 113427455640312821154458202477256070485
> > I would assume that we'd want to take the node that
> > is 61396109050359754194262152792166260437 and move it to 0, yes?
> > In theory that should largely balance out our data... or am I missing
> > something there?
> > On Tue, Aug 16, 2011 at 1:54 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>
> >> what tokens did you end up using?
> >>
> >> are you sure it's actually due to different amounts of rows?  have you
> >> run cleanup and compact to make sure it's not unused data / obsolete
> >> replicas taking up the space?
> >>
> >> On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
> >> <dmcne...@agentisenergy.com> wrote:
> >> > We are currently running a three node cluster where we assigned the
> >> > initial tokens using the Python script that is in the Wiki, and we're
> >> > currently using the Random Partitioner, RF=1, Cassandra 0.8 from the
> >> > Riptano RPM... however, we're seeing one node take on over 60% of the
> >> > data as we load data.
> >> > Our keys are sequential, and can range from 0 to 2^64, though in
> >> > practice we're between 1 and 2,000,000,000, with the current max around
> >> > 50,000. In order to balance out the load, would we be best served
> >> > changing our tokens to make the top and bottom 1/3rd of the node go to
> >> > the previous and next nodes respectively, then running nodetool move?
> >> > Even if we do that, it would seem that we'd likely continue to run into
> >> > this sort of issue as we add additional data... would we be better
> >> > served with a different Partitioner strategy?  Or will we need to very
> >> > actively manage our tokens to avoid getting into an unbalanced situation?
> >> >
> >> > --
> >> > David McNelis
> >> > Lead Software Engineer
> >> > Agentis Energy
> >> > www.agentisenergy.com
> >> > o: 630.359.6395
> >> > c: 219.384.5143
> >> > A Smart Grid technology company focused on helping consumers of energy
> >> > control an often under-managed resource.
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of DataStax, the source for professional Cassandra support
> >> http://www.datastax.com
