Apart from the heavy load, will running a major compaction (nodetool compact) have any other effects?
Also, will cleanup help if my replication factor equals the number of nodes?
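
To make the second question concrete, here is a minimal sketch of how I understand SimpleStrategy placement. It is a simplified model and an assumption on my part, not Cassandra's actual code: the node owning a range holds the first replica and the next RF-1 nodes around the ring hold the others. The function names are made up for illustration.

```python
# Simplified model of SimpleStrategy replica placement (an assumption for
# illustration, not Cassandra's implementation): the node that owns a token
# range stores the first replica, and the next RF-1 nodes clockwise on the
# ring store the remaining replicas.

def ranges_held(num_nodes, rf):
    """Map each node index to the set of primary-range indices it stores."""
    held = {node: set() for node in range(num_nodes)}
    for primary in range(num_nodes):
        for step in range(rf):
            held[(primary + step) % num_nodes].add(primary)
    return held


def fraction_held(num_nodes, rf):
    """Fraction of all ranges stored on each node, assuming equal-sized ranges."""
    return {
        node: len(ranges) / num_nodes
        for node, ranges in ranges_held(num_nodes, rf).items()
    }


print(fraction_held(2, 2))  # 2 nodes, RF = 2
print(fraction_held(3, 2))  # 3 nodes, RF = 2
```

If this model is right, then with RF equal to the number of nodes every node is a replica for every range, so I would expect cleanup to find nothing it is allowed to remove.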
On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss wrote:

> major compaction in production is fine, however it is a heavy operation on
> the node and will take I/O and some CPU.
> the only time i have seen this kind of imbalance is after changing the
> tokens in the ring, e.g. with "nodetool move".  cassandra does not
> automatically delete data that a node no longer owns, just in case you want
> to move the tokens again or otherwise "undo".
> try "nodetool cleanup"
On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ wrote:
>> Hi,
>> Same thing here:
>> 2 nodes, RF = 2. RCL = 1, WCL = 1.
>> Like Tamar, I have never run a major compaction, and I run repair once a
>> week on each node.
>>    eu-west     1b          Up     Normal  133.02 GB   50.00%  0
>>    eu-west     1b          Up     Normal  98.12 GB    50.00%  85070591730234615865843651857942052864
>> What could explain the result above?
>> By the way, I copied the data and imported it into a one-node dev cluster.
>> There I ran a major compaction and the size of my data was
>> significantly reduced (to about 32 GB instead of 133 GB).
>> How is that possible?
>> Do you think that if I run a major compaction on both nodes it will balance
>> the load evenly?
>> Should I run a major compaction in production?
2012/10/10 Tamar Fraenkel
>>> Hi!
>>> I am re-posting this, now that I have more data and still *unbalanced
>>> ring*:
>>> 3 nodes,
>>> Address         DC          Rack        Status State   Load        Owns    Token
>>>                                                                            113427455640312821154458202477256070485
>>> x.x.x.x         us-east     1c          Up     Normal  24.02 GB    33.33%  0
>>> y.y.y.y         us-east     1c          Up     Normal  33.45 GB    33.33%  56713727820156410577229101238628035242
>>> z.z.z.z         us-east     1c          Up     Normal  29.85 GB    33.33%  113427455640312821154458202477256070485
>>> repair runs weekly.
>>> I don't run nodetool compact as I read that this may cause the minor
>>> regular compactions not to run and then I will have to run compact
>>> manually. Is that right?
>>> Any idea whether this indicates a problem, and if so, how to solve it?
>>> Thanks,
On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel wrote:
>>>> Thanks, I will wait and see as data accumulates.
>>>> Thanks,
On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen wrote:
>>>>> Cassandra is built to store tons and tons of data. In my opinion,
>>>>> roughly 6 MB per node is not enough data for the cluster to become
>>>>> fully balanced.
2012/3/27 Tamar Fraenkel
>>>>>> This morning I have
>>>>>>  nodetool ring -h localhost
>>>>>> Address         DC          Rack        Status State   Load        Owns    Token
>>>>>>                                                                            113427455640312821154458202477256070485
>>>>>>                 us-east     1c          Up     Normal  5.78 MB     33.33%  0
>>>>>>                 us-east     1c          Up     Normal  7.23 MB     33.33%  56713727820156410577229101238628035242
>>>>>>                 us-east     1c          Up     Normal  5.02 MB     33.33%  113427455640312821154458202477256070485
>>>>>> Version is 1.0.8.
On Tue, Mar 27, 2012 at 4:05 AM, Maki Watanabe wrote:
>>>>>>> What version are you using?
>>>>>>> Anyway try nodetool repair & compact.
>>>>>>> maki
2012/3/26 Tamar Fraenkel
>>>>>>>> Hi!
>>>>>>>> I created a ring on Amazon using the DataStax image and started
>>>>>>>> filling the db.
>>>>>>>> The cluster seems unbalanced.
>>>>>>>> nodetool ring returns:
>>>>>>>> Address         DC          Rack        Status State   Load        Owns    Token
>>>>>>>>                                                                            113427455640312821154458202477256070485
>>>>>>>>                 us-east     1c          Up     Normal  514.29 KB   33.33%  0
>>>>>>>>                 us-east     1c          Up     Normal  1.5 MB      33.33%  56713727820156410577229101238628035242
>>>>>>>>                 us-east     1c          Up     Normal  1.5 MB      33.33%  113427455640312821154458202477256070485
>>>>>>>> [default@tok] describe;
>>>>>>>> Keyspace: tok:
>>>>>>>>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>>>>>>>   Durable Writes: true
>>>>>>>>     Options: [replication_factor:2]
>>>>>>>> [default@tok] describe cluster;
>>>>>>>> Cluster Information:
>>>>>>>>    Snitch: org.apache.cassandra.locator.Ec2Snitch
>>>>>>>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>>>>>>    Schema versions:
>>>>>>>>         4687d620-7664-11e1-0000-1bcb936807ff: [,
>>>>>>>> Any idea what the cause is?
>>>>>>>> I am running similar code on a local ring and it is balanced.
>>>>>>>> How can I fix this?
>>>>>>>> Thanks,
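
For reference, the tokens reported by nodetool ring in this thread can be checked against an even split of the RandomPartitioner token space. The sketch below assumes the usual formula for evenly spaced tokens, i * 2**127 / N; the helper name is made up for illustration. It only shows whether token ownership is even, which matches the 33.33% and 50.00% in the Owns column, and says nothing about why the Load column differs between nodes.

```python
# RandomPartitioner tokens fall in the range 0 .. 2**127. A ring with evenly
# spaced tokens places node i at i * 2**127 // num_nodes.
TOKEN_SPACE = 2 ** 127


def balanced_tokens(num_nodes):
    """Evenly spaced initial tokens for a ring of num_nodes nodes."""
    return [i * TOKEN_SPACE // num_nodes for i in range(num_nodes)]


# Tokens copied from the nodetool ring output quoted in this thread.
three_node_ring = [
    0,
    56713727820156410577229101238628035242,
    113427455640312821154458202477256070485,
]
two_node_ring = [0, 85070591730234615865843651857942052864]

# Compare the reported tokens with the evenly spaced ones.
print(balanced_tokens(3) == three_node_ring)
print(balanced_tokens(2) == two_node_ring)
```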
