Then the next step is to check StorageService.getRangeToEndpointMap via JMX.
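For reference, here is a rough sketch of pulling that map programmatically, pointed at the same JMX host/port the nodetool commands below use (cass99:9004); jconsole against that host/port works too. The MBean is looked up by type rather than a hard-coded domain since the domain has moved between releases, the keyspace name "Keyspace1" is only a placeholder, and the exact getRangeToEndpointMap signature may differ by version:

// Minimal JMX client sketch for dumping the range -> endpoint mapping.
// Assumptions: cass99:9004 is the node's JMX endpoint (same as nodetool above),
// "Keyspace1" is a placeholder keyspace name, and the operation signature may
// differ between Cassandra versions (some take a keyspace, some take no args).
import java.util.Map;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RangeToEndpointCheck {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://cass99:9004/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();

            // Find the StorageService MBean without hard-coding the domain,
            // since its registered name has changed across releases.
            Set<ObjectName> names =
                    mbs.queryNames(new ObjectName("*:type=StorageService"), null);
            if (names.isEmpty()) {
                throw new IllegalStateException("no StorageService MBean found");
            }
            ObjectName storageService = names.iterator().next();

            // Invoke getRangeToEndpointMap; drop the argument if your version's
            // operation is parameterless.
            Map<?, ?> rangeToEndpoints = (Map<?, ?>) mbs.invoke(
                    storageService,
                    "getRangeToEndpointMap",
                    new Object[] { "Keyspace1" },
                    new String[] { "java.lang.String" });

            rangeToEndpoints.forEach((range, endpoints) ->
                    System.out.println(range + " -> " + endpoints));
        }
    }
}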
On Tue, Jun 1, 2010 at 11:56 AM, Ran Tavory <ran...@gmail.com> wrote:
> I'm using RackAwareStrategy. But it still doesn't make sense I think...
> let's see what I missed...
> According to http://wiki.apache.org/cassandra/Operations
>
> RackAwareStrategy: replica 2 is placed in the first node along the ring that
> belongs in another data center than the first; the remaining N-2 replicas,
> if any, are placed on the first nodes along the ring in the same rack as the
> first
>
> 192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |<--|
> 192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |   ^
> 192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v   |
> 192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |   ^
> 192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v   |
> 192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |-->|
>
> Alright, so I made a mistake and didn't use the alternate-datacenter
> suggestion on the page, so the first node of every DC is overloaded with
> replicas. However, the current situation still doesn't make sense to me.
> .252.124 will be overloaded b/c it has the first token in the .252 DC.
> .254.57 will also be overloaded since it has the first token in the .254 DC.
> But for which node does .252.99 serve as a replica? It's not the first in
> the DC and it's just one single token more than its predecessor (which is
> in the same DC).
>
> On Tue, Jun 1, 2010 at 4:00 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> I'm saying that .99 is getting a copy of all the data for which .124
>> is the primary. (If you are using RackUnawareStrategy. If you are
>> using RackAware it is some other node.)
>>
>> On Tue, Jun 1, 2010 at 1:25 AM, Ran Tavory <ran...@gmail.com> wrote:
>> > ok, let me try and translate your answer ;)
>> > Are you saying that the data that was left on the node is
>> > non-primary replicas of rows from the time before the move?
>> > So this implies that when a node moves in the ring, it will affect
>> > distribution of:
>> > - new keys
>> > - old keys' primary node
>> > -- but will not affect distribution of old keys' non-primary replicas.
>> > If so, I still don't understand something... I would expect even the
>> > non-primary replicas of keys to be moved, since if they aren't, how
>> > would they be found? I mean upon reads the serving node should not care
>> > about whether the row is new or old, it should have a consistent and
>> > global mapping of tokens. So I guess this ruins my theory...
>> > What did you mean then? Is this deletion of non-primary replicated data?
>> > How does the replication factor affect the load on the moved host then?
>> >
>> > On Tue, Jun 1, 2010 at 1:19 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> >>
>> >> well, there you are then.
>> >>
>> >> On Mon, May 31, 2010 at 2:34 PM, Ran Tavory <ran...@gmail.com> wrote:
>> >> > yes, replication factor = 2
>> >> >
>> >> > On Mon, May 31, 2010 at 10:07 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> >> >>
>> >> >> you have replication factor > 1 ?
>> >> >>
>> >> >> On Mon, May 31, 2010 at 7:23 AM, Ran Tavory <ran...@gmail.com> wrote:
>> >> >> > I hope I understand nodetool cleanup correctly - it should clean up
>> >> >> > all data that does not (currently) belong to this node. If so, I
>> >> >> > think it might not be working correctly.
>> >> >> > Look at nodes 192.168.252.124 and 192.168.252.99 below:
>> >> >> >
>> >> >> > 192.168.252.99   Up  279.35 MB  3544607988759775661076818827414252202    |<--|
>> >> >> > 192.168.252.124  Up  167.23 MB  56713727820156410577229101238628035242   |   ^
>> >> >> > 192.168.252.125  Up  82.91 MB   85070591730234615865843651857942052863   v   |
>> >> >> > 192.168.254.57   Up  366.6 MB   113427455640312821154458202477256070485  |   ^
>> >> >> > 192.168.254.58   Up  88.44 MB   141784319550391026443072753096570088106  v   |
>> >> >> > 192.168.254.59   Up  88.45 MB   170141183460469231731687303715884105727  |-->|
>> >> >> >
>> >> >> > I wanted 124 to take all the load from 99, so I issued a move command:
>> >> >> >
>> >> >> > $ nodetool -h cass99 -p 9004 move 56713727820156410577229101238628035243
>> >> >> >
>> >> >> > This command tells 99 to take the space between
>> >> >> > (56713727820156410577229101238628035242, 56713727820156410577229101238628035243],
>> >> >> > which is basically just one item in the token space, almost nothing...
>> >> >> > I wanted it to be very slim (just playing around).
>> >> >> > So, next I get this:
>> >> >> >
>> >> >> > 192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |<--|
>> >> >> > 192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |   ^
>> >> >> > 192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v   |
>> >> >> > 192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |   ^
>> >> >> > 192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v   |
>> >> >> > 192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |-->|
>> >> >> >
>> >> >> > The tokens are correct, but it seems that 99 still has a lot of data. Why?
>> >> >> > OK, that might be b/c it didn't delete its moved data.
>> >> >> > So next I issued a nodetool cleanup, which should have taken care of that.
>> >> >> > Only that it didn't; node 99 still has 352 MB of data. Why?
>> >> >> > So, you know what, I waited for 1h. Still no good, data wasn't cleaned up.
>> >> >> > I restarted the server. Still, data wasn't cleaned up... I issued a cleanup
>> >> >> > again... still no good... what's up with this node?

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
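For reference, the RackAwareStrategy rule quoted in the thread boils down to roughly the following simplified sketch. This is not Cassandra's actual implementation: the ring is assumed to be pre-sorted starting at the token owner, and the DC/rack labels are invented for illustration.

// Simplified sketch of the RackAwareStrategy placement rule quoted above:
// replica 1 on the token owner, replica 2 on the first node along the ring
// in a different data center, remaining replicas on the first nodes along
// the ring in the same rack as the first. The DC/rack data here is made up.
import java.util.ArrayList;
import java.util.List;

public class RackAwarePlacementSketch {
    record Node(String address, String dc, String rack) {}

    // ring: nodes sorted by token, starting at the node that owns the key's token
    static List<Node> getReplicas(List<Node> ring, int replicationFactor) {
        List<Node> replicas = new ArrayList<>();
        Node primary = ring.get(0);
        replicas.add(primary);                              // replica 1: the token owner

        // replica 2: first node along the ring in another data center
        for (int i = 1; i < ring.size() && replicas.size() < 2; i++) {
            if (!ring.get(i).dc().equals(primary.dc())) {
                replicas.add(ring.get(i));
            }
        }

        // remaining replicas: first nodes along the ring in the same rack as the primary
        for (int i = 1; i < ring.size() && replicas.size() < replicationFactor; i++) {
            Node n = ring.get(i);
            if (!replicas.contains(n) && n.rack().equals(primary.rack())) {
                replicas.add(n);
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        // Two data centers mirroring the IPs in the thread (DC/rack labels are invented).
        List<Node> ring = List.of(
                new Node("192.168.252.124", "dc252", "rack1"),
                new Node("192.168.252.99",  "dc252", "rack1"),
                new Node("192.168.252.125", "dc252", "rack1"),
                new Node("192.168.254.57",  "dc254", "rack1"),
                new Node("192.168.254.58",  "dc254", "rack1"),
                new Node("192.168.254.59",  "dc254", "rack1"));

        // With RF=2, a range owned by .124 gets its second replica on .254.57,
        // the first node along the ring in the other data center.
        System.out.println(getReplicas(ring, 2));
    }
}

With RF=2 and this ring, every range whose primary is in the .252 data center gets its second replica on 192.168.254.57, and every .254 primary replicates to 192.168.252.124, which is why the first node of each data center carries the extra load noted in the thread.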