Re: balancing load

2011-01-18 Thread Karl Hiramoto
On 17/01/2011 19:27, Edward Capriolo wrote: cfstats is reporting you have an 8GB row! I think you could be writing all your data to a few keys. You're right, my n00b fault: I was writing everything to one key. The problem was I had Offer['id'][$UID] = value; it made it easy before to do a "c
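
To illustrate why writing everything under one row key piles the data onto a single node, here is a minimal sketch. It is not Cassandra's actual partitioner code: token_for is a simplified stand-in (an MD5 digest reduced into a 0..2**127 ring), the 5-node ring with evenly spaced tokens is an assumption, and the "offer-<uid>" key scheme is hypothetical, used only to contrast with the single 'id' key.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space, matching the 2 ** 127 used elsewhere in this thread


def token_for(key):
    # Simplified stand-in for a hash partitioner: MD5 digest reduced into the ring.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def owner(key, node_tokens):
    # A row belongs to the first node whose token is >= the row's token,
    # wrapping around to node 0 past the last token.
    idx = bisect.bisect_left(node_tokens, token_for(key))
    return idx % len(node_tokens)


def rows_per_node(keys, node_count):
    # Evenly spaced tokens, one per node (sorted ascending).
    node_tokens = [(RING_SIZE // node_count) * i for i in range(node_count)]
    counts = [0] * node_count
    for key in keys:
        counts[owner(key, node_tokens)] += 1
    return counts


# 10000 writes that all use the row key 'id' land on one node.
single = rows_per_node(["id"] * 10000, 5)
# 10000 writes with one (hypothetical) row key per UID spread across the ring.
spread = rows_per_node(["offer-%d" % uid for uid in range(10000)], 5)
print(single)
print(spread)
```

However many columns are added to the row, the row key alone decides which node holds it, so balanced tokens cannot help when there are only a handful of row keys.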

Re: balancing load

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 1:20 PM, Karl Hiramoto wrote: > On 01/17/11 15:54, Edward Capriolo wrote: >> Just to head off the next possible problem. If you run 'nodetool cleanup' >> on each node and some of your nodes still have more data than others, >> then it probably means you are writing the majorit

Re: balancing load

2011-01-17 Thread Karl Hiramoto
On 01/17/11 15:54, Edward Capriolo wrote: > Just to head off the next possible problem. If you run 'nodetool cleanup' > on each node and some of your nodes still have more data than others, > then it probably means you are writing the majority of data to a few > keys. ( you probably do not want to do

Re: balancing load

2011-01-17 Thread Peter Schuller
> @Peter Isn't cleanup a special case of compaction? I.e., it works as a > major compaction + removes data not belonging to the node? Yes, sorry. Brain lapse. Ignore me. -- / Peter Schuller

Re: balancing load

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 10:51 AM, Peter Schuller wrote: >> Just to head off the next possible problem. If you run 'nodetool cleanup' >> on each node and some of your nodes still have more data than others, >> then it probably means you are writing the majority of data to a few >> keys. ( you probably

Re: balancing load

2011-01-17 Thread Peter Schuller
> Just to head off the next possible problem. If you run 'nodetool cleanup' > on each node and some of your nodes still have more data than others, > then it probably means you are writing the majority of data to a few > keys. ( you probably do not want to do that ) It may also be that a compact is n

Re: balancing load

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 2:44 AM, aaron morton wrote: > The nodes will not automatically delete stale data; to do that you need to > run nodetool cleanup. > > See step 3 in the Range Changes > Bootstrap > http://wiki.apache.org/cassandra/Operations#Range_changes > > If you are feeling paranoid be

Re: balancing load

2011-01-16 Thread aaron morton
The nodes will not automatically delete stale data; to do that you need to run nodetool cleanup. See step 3 in the Range Changes > Bootstrap http://wiki.apache.org/cassandra/Operations#Range_changes If you are feeling paranoid beforehand, you could run nodetool repair on each node in turn to

RE: balancing load

2011-01-16 Thread raoyixuan (Shandy)
You can run nodetool cleanup to clean up the data on the old nodes. -Original Message- From: Karl Hiramoto [mailto:k...@hiramoto.org] Sent: Monday, January 17, 2011 3:34 PM To: user@cassandra.apache.org Subject: Re: balancing load Thanks for the help. I used "nodetool move"

Re: balancing load

2011-01-16 Thread Karl Hiramoto
Thanks for the help. I used "nodetool move", so now each node owns 20% of the space, but it seems that the data load is still mostly on 2 nodes. nodetool --host slave4 ring Address Status State Load Owns Token

Re: balancing load

2011-01-16 Thread Peter Schuller
> So for full cluster balance required invoke nodetool move sequential over > all tokens? For a new cluster, the recommended method is to pre-calculate the tokens and bring nodes up with appropriate tokens. For existing clusters, it depends. E.g. if you're doubling the amount of nodes you can jus

Re: balancing load

2011-01-16 Thread ruslan usifov
2011/1/16 Edward Capriolo > On Sun, Jan 16, 2011 at 11:45 AM, Karl Hiramoto wrote: > > Hi, > > > > I have a keyspace with Replication Factor: 2 > > and it seems though that most of my data goes to one node. > > > > > > What am I missing to have Cassandra balance more evenly? > > > > ./nodetool

Re: balancing load

2011-01-16 Thread Edward Capriolo
On Sun, Jan 16, 2011 at 11:45 AM, Karl Hiramoto wrote: > Hi, > > I have a keyspace with  Replication Factor: 2 > and it seems though that most of my data goes to one node. > > > What am I missing to have Cassandra balance more evenly? > > ./nodetool  -h host1 ring > Address         Status State  

Re: balancing load

2011-01-16 Thread Mark Zitnik
Hi, if you are starting the cluster all at once and not adding nodes to an existing cluster, try calculating the tokens. Here is a Python script to calculate them:

    def tokens(nodes):
        for x in xrange(nodes):
            print 2 ** 127 / nodes * x

Also read the operations section in the Cassandra wiki http://wik
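
The script in the message above is written for Python 2 (xrange, print statement). A rough Python 3 rendering of the same arithmetic might look like the sketch below. The function name balanced_tokens and the choice to return a list rather than print are mine, not from the thread, and the 2 ** 127 ring size is taken from the original script.

```python
def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens over a 0..2**127 ring."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Integer division first, then multiply, as in the original script.
    step = 2 ** 127 // node_count
    return [step * i for i in range(node_count)]


# Print one token per line for a 5-node cluster, like the original script did.
for token in balanced_tokens(5):
    print(token)
```

Each node would then be brought up with (or moved to) one of these tokens, which is what gives every node an equal share of the ring.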