Hi, I'm facing some problems and would be grateful for help with any of them.

*Environment:* 2 seeds and 2 other nodes, all installed on m1.large EC2 instances. Each seed starts with about 1.7 GB of data. Cassandra runs with the default configuration.
- Is it normal to take about 9 minutes to add a new node? Below is the log generated by the script I use to add a new node:

  [06/07/2013 20:07:53] Remove all data stored in the Cassandra node
  [06/07/2013 20:07:54] [OK] All data successfully removed
  [06/07/2013 20:07:54] Setting seeds on cassandra.yml
  [06/07/2013 20:07:54] [OK] seeds successfully set
  [06/07/2013 20:07:54] Setting listen_address on cassandra.yml
  [06/07/2013 20:07:54] [OK] listen_address successfully set
  [06/07/2013 20:07:54] Setting initial_token on cassandra.yml
  [06/07/2013 20:07:54] [OK] initial_token successfully set
  *[06/07/2013 20:07:54] Starting cassandra...*
  *[06/07/2013 20:16:36] [OK] Cassandra started*
  [06/07/2013 20:16:37] Changing token of i-5cfc082f
  [06/07/2013 20:18:00] [OK] Token of i-5cfc082f successfully set to 56713727820156410577229101238628035242
  [06/07/2013 20:18:00] Cleaning up i-5cfc082f
  [06/07/2013 20:20:13] Clean up of i-5cfc082f successfully finished
  [06/07/2013 20:20:13] Machine added

- Is there a way to reduce the time it takes to start Cassandra?
- Sometimes the cleanup operation takes many minutes (about 10). Is this normal, given that the amount of data is small (1.7 GB at most per seed)?
- I start with two seeds whose tokens are 0 and 85070591730234615865843651857942052864. When I add a new machine, do I need to execute move and cleanup on both seeds? Currently I run cleanup on seed 0, move + cleanup on the other seed, and neither move nor cleanup on the newly added node. Is this OK?
- What happens if I do not run cleanup on any existing node when adding or removing a node? If I send a scan, for instance, is data that was not cleaned up still served when the scanned range is still physically on the node but would have been removed by cleanup? Or would the data be gathered from another node, i.e. the one that properly owns the range specified in the scan query?
- After decommissioning a node, is it advisable to run cleanup on the remaining nodes? Are the consequences of not running it the same as those of not running it when adding a node?

Thank you very much in advance.

Regards,
*Rodrigo Felix de Almeida*
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP
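
P.S. For reference, the tokens mentioned above are consistent with evenly spaced tokens over a ring of size 2^127. The sketch below shows that calculation; it assumes the cluster uses RandomPartitioner with one token per node, and `balanced_tokens` is a hypothetical helper name, not part of Cassandra.

```python
# Assumption: RandomPartitioner, whose token range is [0, 2**127).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced tokens for a ring with node_count nodes.

    Hypothetical helper for illustration only; it is not a Cassandra API.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer division keeps the tokens exact for these large values.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for count in (2, 3):
        print(count, "nodes:")
        for token in balanced_tokens(count):
            print("  ", token)
```

With two nodes this gives the two seed tokens listed above, and with three nodes the second token equals the one assigned to i-5cfc082f in the log.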