On 20.07.2012 11:02, aaron morton wrote:
I would check for stored hints in /var/lib/cassandra/data/system
Hmm, where can I find this kind of info?
I can see the HintsColumnFamily CF inside the system keyspace, but it's empty...
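Is looking for hint SSTables on disk what you mean? I imagine the check
would be something like this (a rough sketch, assuming the default data
directory and 1.1's directory-per-CF layout; adjust paths to cassandra.yaml):

import glob
import os

# default data_file_directories location; adjust if cassandra.yaml differs
hints_dir = "/var/lib/cassandra/data/system/HintsColumnFamily"

# in 1.1 each hint SSTable shows up as <keyspace>-<cf>-*-Data.db
data_files = glob.glob(os.path.join(hints_dir, "*-Data.db"))
if data_files:
    for path in data_files:
        print("%s: %d bytes" % (path, os.path.getsize(path)))
else:
    print("no hint SSTables on disk")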
Putting nodes in different racks can make placement tricky so…
Are you running a multi DC setup ?
No, we have all nodes in one data center (all entries in the
cassandra-topology.properties file contain the same DC name).
Are you using the NTS ?
No, we're using:
Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
What is the RF setting ?
Options: [replication_factor:2]
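(As I understand it, with SimpleStrategy the second replica simply goes to
the next node clockwise in the ring and racks are ignored; a toy sketch of
that placement with made-up tokens, in case I'm misreading how it works:)

# toy sketch of SimpleStrategy with RF=2: a row lands on the token owner
# plus the next node clockwise; racks/DCs are not considered at all
ring = [            # (token, node) - made-up values, not our real ring
    (0, "node01-01"),
    (50, "node02-07"),
    (100, "node01-07"),
    (150, "node02-01"),
]

def replicas(row_token, rf=2):
    # first node whose token is >= the row token, wrapping around the ring
    idx = next((i for i, (t, _) in enumerate(ring) if row_token <= t), 0)
    return [ring[(idx + k) % len(ring)][1] for k in range(rf)]

print(replicas(60))  # ['node01-07', 'node02-01'] - consecutive neighbours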
What setting do you have for the Snitch ?
endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
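For completeness, the DC/rack layout comes from cassandra-topology.properties
in the usual PropertyFileSnitch format; roughly like this, with placeholder
IPs and DC name rather than our real values:

# <node IP>=<data center>:<rack>
192.168.1.101=DC1:rack1
192.168.1.102=DC1:rack2
...
# optional fallback for nodes not listed above
default=DC1:rack1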
What are the full node assignments?
Address DC Rack Status State Load Effective-Ownership Token
node01-01 iponly rack1 Up Normal 255.82 GB 6.67% 0
node02-07 iponly rack2 Up Normal 255.05 GB 6.67%
5671372782015641057722910123862803524
node01-07 iponly rack1 Up Normal 254.65 GB 6.67%
11342745564031282115445820247725607048
node02-01 iponly rack2 Up Normal 261.2 GB 6.67%
17014118346046923173168730371588410572
node01-08 iponly rack1 Up Normal 254.08 GB 6.67%
22685491128062564230891640495451214097
node02-08 iponly rack2 Up Normal 252.04 GB 6.67%
28356863910078205288614550619314017621
node01-02 iponly rack1 Up Normal 256.44 GB 6.67%
34028236692093846346337460743176821144
node02-09 iponly rack2 Up Normal 255.09 GB 6.67%
39699609474109487404060370867039624669
node01-09 iponly rack1 Up Normal 254.28 GB 6.67%
45370982256125128461783280990902428194
node02-02 iponly rack2 Up Normal 259.54 GB 6.67%
51042355038140769519506191114765231716
node01-010 iponly rack1 Up Normal 258.43 GB 6.67%
56713727820156410577229101238628035242
node02-010 iponly rack2 Up Normal 258.09 GB 6.67%
62385100602172051634952011362490838766
node01-03 iponly rack1 Up Normal 253.35 GB 6.67%
68056473384187692692674921486353642288
node02-11 iponly rack2 Up Normal 256.47 GB 6.67%
73727846166203333750397831610216445815
node01-11 iponly rack1 Up Normal 255.49 GB 6.67%
79399218948218974808120741734079249339
node02-03 iponly rack2 Up Normal 252.7 GB 6.67%
85070591730234615865843651857942052860
node01-012 iponly rack1 Up Normal 258.64 GB 6.67%
90741964512250256923566561981804856388
node02-012 iponly rack2 Up Normal 257.88 GB 6.67%
96413337294265897981289472105667659912
node01-05 iponly rack1 Up Normal 254.9 GB 6.67%
102084710076281539039012382229530463432
node02-013 iponly rack2 Up Normal 251.29 GB 6.67%
107756082858297180096735292353393266961
node01-013 iponly rack1 Up Normal 253.99 GB 6.67%
113427455640312821154458202477256070485
node02-05 iponly rack2 Up Normal 252.96 GB 6.67%
119098828422328462212181112601118874004
node01-14 iponly rack1 Up Normal 255.13 GB 6.67%
124770201204344103269904022724981677533
node02-14 iponly rack2 Up Normal 404.65 GB 6.67%
130441573986359744327626932848844481058
node01-06 iponly rack1 Up Normal 257.56 GB 6.67%
136112946768375385385349842972707284576
node02-015 iponly rack2 Up Normal 297.18 GB 6.67%
141784319550391026443072753096570088106
node01-15 iponly rack1 Up Normal 254.95 GB 6.67%
147455692332406667500795663220432891630
node02-06 iponly rack2 Up Normal 255.97 GB 6.67%
153127065114422308558518573344295695148
node01-16 iponly rack1 Up Normal 256.17 GB 6.67%
158798437896437949616241483468158498679
node02-16 iponly rack2 Up Normal 257.17 GB 6.67%
164469810678453590673964393592021302203
Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 19/07/2012, at 6:00 PM, Mariusz Dymarek wrote:
Hi again,
we have now moved all nodes to their correct positions in the ring, but we
can see a higher load on 2 nodes than on the other nodes:
...
node01-05 rack1 Up Normal 244.65 GB 6,67%
102084710076281539039012382229530463432
node02-13 rack2 Up Normal 240.26 GB 6,67%
107756082858297180096735292353393266961
node01-13 rack1 Up Normal 243.75 GB 6,67%
113427455640312821154458202477256070485
node02-05 rack2 Up Normal 249.31 GB 6,67%
119098828422328462212181112601118874004
node01-14 rack1 Up Normal 244.95 GB 6,67%
124770201204344103269904022724981677533
node02-14 rack2 Up Normal 392.7 GB 6,67%
130441573986359744327626932848844481058
node01-06 rack1 Up Normal 249.3 GB 6,67%
136112946768375385385349842972707284576
node02-15 rack2 Up Normal 286.82 GB 6,67%
141784319550391026443072753096570088106
node01-15 rack1 Up Normal 245.21 GB 6,67%
147455692332406667500795663220432891630
node02-06 rack2 Up Normal 244.9 GB 6,67%
153127065114422308558518573344295695148
...
Nodes:
* node02-15 => 286.82 GB
* node02-14 => 392.7 GB
The average load on all other nodes is around 245 GB; the nodetool cleanup
command was invoked on the problematic nodes after the move operation...
Why has this happened?
And how can we balance the cluster?
On 06.07.2012 20:15, aaron morton wrote:
If you have the time, yes, I would wait for the bootstrap to finish. It
will make your life easier.
good luck.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 6/07/2012, at 7:12 PM, Mariusz Dymarek wrote:
Hi,
we're in the middle of extending our cluster from 10 to 30 nodes; we're
running Cassandra 1.1.1...
We've generated initial tokens for the new nodes:
"0": 0, # existing: node01-01
"1": 5671372782015641057722910123862803524, # new: node02-07
"2": 11342745564031282115445820247725607048, # new: node01-07
"3": 17014118346046923173168730371588410572, # existing: node02-01
"4": 22685491128062564230891640495451214097, # new: node01-08
"5": 28356863910078205288614550619314017621, # new: node02-08
"6": 34028236692093846346337460743176821145, # existing: node01-02
"7": 39699609474109487404060370867039624669, # new: node02-09
"8": 45370982256125128461783280990902428194, # new: node01-09
"9": 51042355038140769519506191114765231718, # existing: node02-02
"10": 56713727820156410577229101238628035242, # new: node01-10
"11": 62385100602172051634952011362490838766, # new: node02-10
"12": 68056473384187692692674921486353642291, # existing: node01-03
"13": 73727846166203333750397831610216445815, # new: node02-11
"14": 79399218948218974808120741734079249339, # new: node01-11
"15": 85070591730234615865843651857942052864, # existing: node02-03
"16": 90741964512250256923566561981804856388, # new: node01-12
"17": 96413337294265897981289472105667659912, # new: node02-12
"18": 102084710076281539039012382229530463436, # existing: node01-05
"19": 107756082858297180096735292353393266961, # new: node02-13
"20": 113427455640312821154458202477256070485, # new: node01-13
"21": 119098828422328462212181112601118874009, # existing: node02-05
"22": 124770201204344103269904022724981677533, # new: node01-14
"23": 130441573986359744327626932848844481058, # new: node02-14
"24": 136112946768375385385349842972707284582, # existing: node01-06
"25": 141784319550391026443072753096570088106, # new: node02-15
"26": 147455692332406667500795663220432891630, # new: node01-15
"27": 153127065114422308558518573344295695155, # existing: node02-06
"28": 158798437896437949616241483468158498679, # new: node01-16
"29": 164469810678453590673964393592021302203 # new: node02-16
Then we started to bootstrap the new nodes, but due to a copy-and-paste
mistake:
* node01-14 was started with
130441573986359744327626932848844481058 as its initial token (the
initial_token that should belong to node02-14); it should have had
124770201204344103269904022724981677533 as its initial_token
* node02-14 was started with
136112946768375385385349842972707284582 as its initial token, so it has
the token generated for the existing node node01-06...
However, we used a different program to generate the previous
initial_tokens, and the actual token of node01-06 in the ring is
136112946768375385385349842972707284576.
Summing up, we currently have this situation in the ring:
node02-05 rack2 Up Normal 596.31 GB 6.67%
119098828422328462212181112601118874004
node01-14 rack1 Up Joining 242.92 KB 0.00%
130441573986359744327626932848844481058
node01-06 rack1 Up Normal 585.5 GB 13.33%
136112946768375385385349842972707284576
node02-14 rack2 Up Joining 113.17 KB 0.00%
136112946768375385385349842972707284582
node02-15 rack2 Up Joining 178.05 KB 0.00%
141784319550391026443072753096570088106
node01-15 rack1 Up Joining 191.7 GB 0.00%
147455692332406667500795663220432891630
node02-06 rack2 Up Normal 597.69 GB 20.00%
153127065114422308558518573344295695148
We would like to get back to our original configuration.
Is it safe to wait for all new nodes to finish bootstrapping and after
that invoke:
* nodetool -h node01-14 move 124770201204344103269904022724981677533
* nodetool -h node02-14 move 130441573986359744327626932848844481058
We should probably run nodetool cleanup on several nodes after that...
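For what it's worth, a quick way to sanity-check the ring against the
intended tokens afterwards could be a small script along these lines (just
a sketch; it assumes nodetool is on the PATH and is run from one of the
cluster nodes):

# compare the live ring against the tokens we want these nodes to own
import subprocess

planned = {
    "node01-14": 124770201204344103269904022724981677533,
    "node02-14": 130441573986359744327626932848844481058,
}

ring = subprocess.check_output(["nodetool", "ring"]).decode()
for line in ring.splitlines():
    cols = line.split()
    if cols and cols[0] in planned:
        actual = int(cols[-1])  # the token is the last column in 1.1 output
        ok = "OK" if actual == planned[cols[0]] else "MISMATCH (%d)" % actual
        print("%s: %s" % (cols[0], ok))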
Regards
Dymarek Mariusz