Some extra information would help debug this: the logs from the 3 nodes
which have no data, and the output of "nodetool ring".

Before seeing those I can only guess, but my guess would be that in the
logs on those 3 nodes you will see this: "Calculating new tokens" and this:
"Split previous range (blah, blah] into <long list of tokens>"
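If it helps, here is a minimal sketch of a way to scan a log for those two messages. The function name is made up for illustration, the marker strings are just the phrases quoted above (the exact wording in your logs may differ), and the log file location is whatever your install uses.

```python
# Phrases quoted in the message above; the exact log wording is an assumption
# and may differ between Cassandra versions.
MARKERS = ("Calculating new tokens", "Split previous range")


def find_vnode_migration_lines(lines):
    """Return every line that contains one of the marker phrases.

    `lines` can be any iterable of strings, e.g. an open log file.
    """
    hits = []
    for line in lines:
        if any(marker in line for marker in MARKERS):
            hits.append(line.rstrip("\n"))
    return hits


# Example use (the path is hypothetical, substitute your own log location):
#
#     with open("/path/to/system.log", errors="replace") as log:
#         for hit in find_vnode_migration_lines(log):
#             print(hit)
```

Any output from one of the 3 empty nodes would support the guess above; no output means the cause is probably something else.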

If that is the case then it means you accidentally started those three
nodes with the default configuration (single-token), subsequently
changed num_tokens, and then joined them into the cluster. What happens
when you do this is that the node thinks it used to be responsible for a
single range and is being migrated to vnodes, so it splits its single range
(now a very small part of the keyspace) into 256 smaller ranges, and ends
up with just a tiny portion of the ring assigned to it.
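To illustrate why the split leaves the node with so little, here is a rough simulation. It is not Cassandra's actual code: the ring is simplified to [0, 2^64), tokens are picked uniformly at random, and all function names are made up for this sketch. It compares a node whose single range is split into 256 pieces with a node that picks 256 fresh tokens across the whole ring.

```python
import random

RING_SIZE = 2**64  # simplified token space [0, RING_SIZE); an assumption


def split_range(start, end, parts):
    """Split the range (start, end] into `parts` contiguous subranges.

    Returns the end token of each subrange; the last one equals `end`.
    """
    width = (end - start) % RING_SIZE
    return [(start + width * i // parts) % RING_SIZE for i in range(1, parts + 1)]


def owned_fraction(tokens, all_tokens):
    """Fraction of the ring owned by `tokens`.

    Each token owns the range (previous token on the ring, token].
    """
    ring = sorted(set(all_tokens))
    mine = set(tokens)
    total = 0
    for i, tok in enumerate(ring):
        if tok in mine:
            prev = ring[i - 1]  # index -1 wraps around to the last token
            total += (tok - prev) % RING_SIZE
    return total / RING_SIZE


def simulate(seed=1):
    rng = random.Random(seed)
    # 7 nodes with 256 tokens and 2 nodes with 512 tokens, as in the thread.
    existing = [rng.randrange(RING_SIZE) for _ in range(7 * 256 + 2 * 512)]

    # Node started single-token: it owns (predecessor, single].
    single = rng.randrange(RING_SIZE)
    ring = sorted(existing)
    predecessor = max((t for t in ring if t < single), default=ring[-1])
    single_frac = owned_fraction([single], existing + [single])

    # "Migrating" to vnodes only subdivides that one small range.
    split_tokens = split_range(predecessor, single, 256)
    split_frac = owned_fraction(split_tokens, existing + split_tokens)

    # Node bootstrapped with num_tokens=256 from the start.
    fresh = [rng.randrange(RING_SIZE) for _ in range(256)]
    fresh_frac = owned_fraction(fresh, existing + fresh)

    return single_frac, split_frac, fresh_frac
```

Calling simulate() returns three fractions: what the node owned with its single token, what it owns after the split (the same amount, just cut into 256 pieces), and what a cleanly bootstrapped 256-token node would own (on the order of 256 out of 3072 tokens' worth of the ring).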

To fix this you'll need to decommission those 3 nodes, remove all data from
them, then bootstrap them in again with the correct configuration from the
start.

Sam



On 26 April 2013 06:07, David McNelis <dmcne...@gmail.com> wrote:

> So, I had 7 nodes that I set up using vnodes, 256 tokens each, no problem.
>
> I added two 512 token nodes, no problem, things seemed to balance.
>
> The next 3 nodes I added, all at 256 tokens, have a cumulative load of
> 116MB, whereas the other nodes are at ~100GB and ~200GB (256 and 512
> tokens respectively).
>
> Anyone else seen this in 1.2.4?
>
> The nodes seem to join the cluster ok, and I have num_tokens set and have
> tried both an empty initial_token and a commented out initial token, with
> no change.
>
> I see nothing streaming with netstats either, though these nodes were
> added days apart.  At first I thought I must have a hot key or something,
> but that doesn't seem to be the case, since the node I thought that one was
> on has evened out over the past couple of days with no new nodes added.
>
> I really *DON'T* want to deal with another shuffle....but what options do
> I have, since vnodes "make it unneeded to balance the cluster"?  (which, at
> the moment, seems like a load of bullshit).
>



-- 
Sam Overton
Acunu | http://www.acunu.com | @acunu
