Greetings, 

I won't bore everyone with details here: the short story is I ran "riak-admin 
cluster leave/plan/commit" to remove a node and got a lot of grief from our 
five-node ring. 

The ring was pretty well destabilized. Repeated runs of "riak-admin 
ring-status" showed one or more nodes going down, then coming back up. 

I have finally isolated a wildly misbehaving node (not the one I was trying to 
make "leave", by the way). 

None of the existing metrics I was graphing highlighted a problem, so I went 
through "/stats" (yet again), looking at the undocumented metrics to see what 
looked interesting. 

I noticed that riak_kv_vnodeq_total was showing up with a non-zero value, so I 
set up a graph which plots the difference between the previous-and-current 
value (like I do for the other "*_total" metrics). 
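For anyone who wants to reproduce the graph, the calculation is roughly the following. This is only a minimal sketch: the host and port in the URL are assumptions about a default install, the function names are made up for illustration, and it treats every "*_total" metric as a running counter, which is exactly the part I am unsure about for riak_kv_vnodeq_total.

```python
import json
import time
import urllib.request


def fetch_stats(url):
    """Fetch the JSON document served at a node's /stats endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def total_deltas(previous, current, suffix="_total"):
    """Return current-minus-previous for each numeric metric ending in suffix.

    Metrics missing from the previous sample, or with non-numeric values,
    are skipped rather than guessed at.
    """
    deltas = {}
    for name, value in current.items():
        if not name.endswith(suffix):
            continue
        old = previous.get(name)
        if isinstance(value, (int, float)) and isinstance(old, (int, float)):
            deltas[name] = value - old
    return deltas


if __name__ == "__main__":
    # Assumed address; point this at the node being graphed.
    url = "http://127.0.0.1:8098/stats"
    prev = fetch_stats(url)
    while True:
        time.sleep(60)  # sampling interval, chosen arbitrarily here
        curr = fetch_stats(url)
        print(total_deltas(prev, curr).get("riak_kv_vnodeq_total"))
        prev = curr
```

Running one copy per node and comparing the printed values side by side is enough to see which node stands out.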

The results were *very* interesting! The other four nodes showed occasional 
values of 1, 2, even 3 once or twice. Our troublesome node showed 152, 8000, 
704... !! 

Does anyone know what riak_kv_vnodeq_total indicates? 

Thanks! 


-- 
Dave Brady 

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
