To run, or not to run? It all depends on the use case. There are no problems 
running major compactions in one case (we do it nightly); in another case there 
could be problems. You just need to understand how everything works.


Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com<mailto:viktor.jevdoki...@adform.com>
Phone: +370 5 212 3063, Mobile: +370 650 19588, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Follow us on Twitter: @adforminsider<http://twitter.com/#!/adforminsider>
What is Adform: watch this short video<http://vimeo.com/adform/display>




From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Thursday, October 11, 2012 09:17
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

Tamar, be careful. Datastax doesn't recommend major compactions in production 
environments.

If I got it right, performing a major compaction will merge all your SSTables 
into a single big one, substantially improving your read performance, at least 
for a while. The problem is that it effectively disables minor compactions too 
(because of the size difference between this big SSTable and the new ones, if I 
remember well). So your read performance will degrade until your other SSTables 
reach the size of the big one you created, or until you run another major 
compaction, turning it into a regular maintenance task like repair.
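
For reference, a minimal sketch of how one might trigger a major compaction on a single node and watch its effect. The keyspace/column family names and host below are placeholders (with no arguments, nodetool compact runs on all keyspaces):

  # before and after: check SSTable count and live size for the column family
  nodetool -h localhost cfstats | grep -A 20 "Column Family: <cf_name>"

  # force a major compaction of one column family in one keyspace
  nodetool -h localhost compact <keyspace> <cf_name>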

But, knowing that, I still don't know whether we both (Tamar and I) should run 
it anyway. (In my case it would greatly decrease the size of my data, 133 GB -> 
35 GB, and maybe balance the cluster evenly...)

Alain

2012/10/10 B. Todd Burruss <bto...@gmail.com<mailto:bto...@gmail.com>>
it should not have any other impact except increased usage of system resources.

and i suppose cleanup would not have an effect (over normal compaction) if all 
nodes contain the same data

On Wed, Oct 10, 2012 at 12:12 PM, Tamar Fraenkel 
<ta...@tok-media.com<mailto:ta...@tok-media.com>> wrote:
Hi!
Apart from the heavy load of the compaction, will it have other effects?
Also, will cleanup help if I have replication factor = number of nodes?
Thanks

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com<mailto:ta...@tok-media.com>
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss 
<bto...@gmail.com<mailto:bto...@gmail.com>> wrote:
major compaction in production is fine, however it is a heavy operation on the 
node and will take I/O and some CPU.

the only time i have seen this happen is when i have changed the tokens in the 
ring, like "nodetool move".  cassandra does not auto-delete data that it no 
longer owns, just in case you want to move the tokens again or otherwise "undo".

try "nodetool cleanup"

On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ 
<arodr...@gmail.com<mailto:arodr...@gmail.com>> wrote:
Hi,

Same thing here:

2 nodes, RF = 2. RCL = 1, WCL = 1.
Like Tamar, I have never run a major compaction, and repair runs once a week on each node.

10.59.21.241    eu-west     1b          Up     Normal  133.02 GB       50.00%  0
10.58.83.109    eu-west     1b          Up     Normal  98.12 GB        50.00%  85070591730234615865843651857942052864

What phenomenon could explain the result above?

By the way, I have copied the data and imported it into a one-node dev cluster. 
There I ran a major compaction and the size of my data was significantly 
reduced (to about 32 GB instead of 133 GB).

How is that possible?
Do you think that if I run a major compaction on both nodes it will balance the 
load evenly?
Should I run major compactions in production?

2012/10/10 Tamar Fraenkel <ta...@tok-media.com<mailto:ta...@tok-media.com>>
Hi!
I am re-posting this, now that I have more data and still unbalanced ring:

3 nodes,
RF=3, RCL=WCL=QUORUM


Address         DC          Rack        Status State   Load            Owns    Token
                                                                                113427455640312821154458202477256070485
x.x.x.x         us-east     1c          Up     Normal  24.02 GB        33.33%   0
y.y.y.y         us-east     1c          Up     Normal  33.45 GB        33.33%   56713727820156410577229101238628035242
z.z.z.z         us-east     1c          Up     Normal  29.85 GB        33.33%   113427455640312821154458202477256070485

repair runs weekly.
I don't run nodetool compact, as I read that it may stop the regular minor 
compactions from being effective, and then I would have to keep running compact 
manually. Is that right?

Any idea if this means something wrong, and if so, how to solve?


Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com<mailto:ta...@tok-media.com>
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel 
<ta...@tok-media.com<mailto:ta...@tok-media.com>> wrote:
Thanks, I will wait and see as data accumulates.
Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com<mailto:ta...@tok-media.com>
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen 
<ro...@us2.nl<mailto:ro...@us2.nl>> wrote:
Cassandra is built to store tons and tons of data. In my opinion, roughly ~6 MB 
per node is not enough data for it to become a fully balanced cluster.

2012/3/27 Tamar Fraenkel <ta...@tok-media.com<mailto:ta...@tok-media.com>>
This morning I have
 nodetool ring -h localhost
Address         DC          Rack        Status State   Load            Owns    Token
                                                                                113427455640312821154458202477256070485
10.34.158.33    us-east     1c          Up     Normal  5.78 MB         33.33%   0
10.38.175.131   us-east     1c          Up     Normal  7.23 MB         33.33%   56713727820156410577229101238628035242
10.116.83.10    us-east     1c          Up     Normal  5.02 MB         33.33%   113427455640312821154458202477256070485

Version is 1.0.8.


Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com<mailto:ta...@tok-media.com>
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Tue, Mar 27, 2012 at 4:05 AM, Maki Watanabe 
<watanabe.m...@gmail.com<mailto:watanabe.m...@gmail.com>> wrote:
What version are you using?
Anyway try nodetool repair & compact.
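
For example, a minimal sketch against the "tok" keyspace from this thread (the host is a placeholder; run on each node in turn):

  # repair first, then force a major compaction of the keyspace
  nodetool -h localhost repair tok
  nodetool -h localhost compact tok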

maki

2012/3/26 Tamar Fraenkel <ta...@tok-media.com<mailto:ta...@tok-media.com>>
Hi!
I created Amazon ring using datastax image and started filling the db.
The cluster seems un-balanced.

nodetool ring returns:
Address         DC          Rack        Status State   Load            Owns    Token
                                                                                113427455640312821154458202477256070485
10.34.158.33    us-east     1c          Up     Normal  514.29 KB       33.33%   0
10.38.175.131   us-east     1c          Up     Normal  1.5 MB          33.33%   56713727820156410577229101238628035242
10.116.83.10    us-east     1c          Up     Normal  1.5 MB          33.33%   113427455640312821154458202477256070485

[default@tok] describe;
Keyspace: tok:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:2]

[default@tok] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2Snitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        4687d620-7664-11e1-0000-1bcb936807ff: [10.38.175.131, 10.34.158.33, 
10.116.83.10]


Any idea what the cause is?
I am running similar code on a local ring and it is balanced.

How can I fix this?

Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media

ta...@tok-media.com<mailto:ta...@tok-media.com>
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956








--
With kind regards,

Robin Verlangen
www.robinverlangen.nl<http://www.robinverlangen.nl>







