Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread aaron morton
> One question on nodetool ring, the "owns" refers to how many of the possible keys each node owns, not the actual node size correct?

Yes.

> So you could technically have a load of 15gb, 60gb, and 15gb on a three node cluster, but if you have the tokens set correctly each would own 33.33%.

Yes.
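The "owns" percentage can be reproduced directly from the tokens themselves: each node owns the arc of the ring from its predecessor's token (wrapping around) up to its own. A minimal sketch for the RandomPartitioner (token space 0..2**127), using the three tokens reported elsewhere in this thread:

```python
RING = 2 ** 127  # RandomPartitioner token space

def ownership(tokens):
    """Map each token to the fraction of the ring it owns: the arc
    from its predecessor's token (wrapping around) up to itself."""
    ts = sorted(tokens)
    return {t: ((t - ts[i - 1]) % RING) / RING for i, t in enumerate(ts)}

tokens = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]
for t, frac in ownership(tokens).items():
    print(f"{str(t)[:12]}...  owns {frac:.2%}")
```

With these tokens the first node owns roughly two thirds of the ring and the second only a few percent, which is consistent with the skewed load (one node over 60%) described in this thread.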

Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread David McNelis
Well, I think what happened was that we had three tokens generated, 0, 567x, and 1134x... but the way that we read the comments in the yaml file, we set the other two nodes with their initial tokens and left the token for the seed node blank. Then we started the seed node, started the other …

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread Jonathan Ellis
Yes, that looks about right. Totally baffled how the wiki script could spit out those tokens for a 3-node cluster.

On Tue, Aug 16, 2011 at 2:04 PM, David McNelis wrote:
> Currently we have the initial_token for the seed node blank, and then the three tokens we ended up with are: 56713727820…

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread David McNelis
Currently we have the initial_token for the seed node blank, and then the three tokens we ended up with are:

56713727820156410577229101238628035242
61396109050359754194262152792166260437
113427455640312821154458202477256070485

I would assume that we'd want to take the node that is 613961090503597…
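For reference, evenly spaced RandomPartitioner tokens for an N-node cluster sit at i * 2**127 / N. Comparing that balanced set against the three tokens above singles out the odd one out (a sketch, not the wiki script itself):

```python
RING = 2 ** 127  # RandomPartitioner token space

actual = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]
# Evenly spaced tokens for 3 nodes: 0, ring/3, 2*ring/3
ideal = [i * RING // 3 for i in range(3)]

# Tokens present in the cluster but not in the balanced set:
misplaced = sorted(set(actual) - set(ideal))
missing = sorted(set(ideal) - set(actual))
print("move node at", misplaced, "to token", missing)
```

This agrees with the conclusion above: the node at 61396109050359754194262152792166260437 is the one to move, and the vacant balanced position is token 0.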

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread Jonathan Ellis
What tokens did you end up using? Are you sure it's actually due to different amounts of rows? Have you run cleanup and compact to make sure it's not unused data / obsolete replicas taking up the space?

On Tue, Aug 16, 2011 at 1:41 PM, David McNelis wrote:
> We are currently running a three nod…

Partitioning, tokens, and sequential keys

2011-08-16 Thread David McNelis
We are currently running a three node cluster where we assigned the initial tokens using the Python script that is in the Wiki. We're currently using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM; however, we're seeing one node take on over 60% of the data as we load data.
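The even-spacing recipe the wiki script implements for the RandomPartitioner can be sketched as follows (a reconstruction from memory, not the wiki script verbatim):

```python
def generate_tokens(node_count):
    """Divide the RandomPartitioner ring (0 .. 2**127) into equal
    arcs; node i gets the token at the start of its arc."""
    return [i * (2 ** 127) // node_count for i in range(node_count)]

print(generate_tokens(3))
```

Each of these tokens should be set explicitly as initial_token in that node's cassandra.yaml; leaving one node's initial_token blank (as described later in this thread) lets that node pick its own token, which is one way a cluster can end up with a skewed set like the one reported here.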