Thank you very much for your response. My comments on your email follow inline, marked with "=>".
Att.

*Rodrigo Felix de Almeida*
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP


On Mon, Jul 8, 2013 at 6:05 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Sat, Jul 6, 2013 at 1:50 PM, Rodrigo Felix <
> rodrigofelixdealme...@gmail.com> wrote:
>
>>
>> - Is it normal to take about 9 minutes to add a new node? The log
>> generated by a script to add a new node follows.
>>
> Sure.

=> OK

>>
>> - Is there a way to reduce the time to start cassandra?
>>
> Not usually.

=> OK

>>
>> - Sometimes the cleanup operation takes many minutes (about 10). Is this
>> normal, given that the amount of data is small (1.7 GB at most per seed)?
>>
> Compaction is throttled, and cleanup is a type of compaction. Bootstrap
> is also throttled via the streaming throttle.

=> OK

>>
>> - Considering that I have two seeds in the beginning, their tokens
>> are 0 and 85070591730234615865843651857942052864. When I add a new
>> machine, do I need to execute move and cleanup on both seeds? Nowadays,
>> I'm running cleanup on seed 0, move + cleanup on the other seed and
>> neither move nor cleanup on the just added node. Is this OK?
>>
> Only nodes which have "lost" ranges need to run cleanup. In general you
> should add new nodes "between" other nodes such that "move" is not required
> at all.

=> Adding a new node between other nodes would avoid running move, but the
ring would be unbalanced, right? Would this leave one node overloaded (it
would own 1/2 of the range while the other two nodes own 1/4 each, assuming
3 nodes)? I'm referring to
http://wiki.apache.org/cassandra/Operations#Load_balancing

>>
>> - What if I do not run cleanup on any existing node when adding or
>> removing a node? Is the data that was not "cleaned up" still available if
>> I send a scan, for instance, and the scan range is still in the node but
>> it wouldn't be there if I had run cleanup? Would the data be gathered
>> from another node, i.e. the one that properly has the range specified in
>> the scan query?
>>
> If data for range [x] is on node [a] but node [a] is no longer considered
> an endpoint for range [x], it will never receive a request to serve range
> [x].

=> OK

>>
>> - After decommissioning a node, is it advisable to run cleanup on the
>> remaining nodes? Are the consequences of not running it the same as when
>> adding a node?
>>
> Cleanup is only for the node which lost a range. In decommission case, no
> live nodes lost a range, only some nodes gained one.

=> OK

>
> =Rob
>
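
To make the balance question concrete, here is a minimal Python sketch of the arithmetic. It assumes RandomPartitioner with a token space of 2**127 and a single token per node, which is consistent with the two seed tokens quoted in this thread but is not confirmed anywhere in it. The helpers `ownership` and `balanced_tokens` are hypothetical names for illustration only; they are not part of Cassandra or nodetool.

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose tokens live in a ring of size 2**127.
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Return {token: fraction of the ring owned}.

    A node owns the range (previous token, its own token], wrapping
    around the ring, so its share is the distance back to the previous token.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # index -1 wraps to the highest token
        # A single node owns the whole ring, hence the "or RING_SIZE".
        span = (tok - prev) % RING_SIZE or RING_SIZE
        shares[tok] = Fraction(span, RING_SIZE)
    return shares


def balanced_tokens(node_count):
    """Evenly spaced tokens: i * 2**127 / node_count."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    seeds = [0, 85070591730234615865843651857942052864]

    # Case 1: new node placed "between" existing nodes, no move needed.
    # The new token bisects the range that wraps around to token 0.
    between = seeds + [3 * 2 ** 125]
    for tok, share in sorted(ownership(between).items()):
        print(f"no move   token={tok} share={share}")

    # Case 2: tokens recomputed for 3 nodes, which requires moving a seed.
    for tok, share in sorted(ownership(balanced_tokens(3)).items()):
        print(f"with move token={tok} share={float(share):.4f}")
```

Under these assumptions, inserting the third node without any move leaves shares of 1/4, 1/2 and 1/4, whereas recomputing the tokens gives each node roughly 1/3 at the cost of moving one seed. In the first case only the node at token 0 gives up part of its range, so it is the only one that would need cleanup.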