Glad to report I fixed this problem.
1. I added the load_ring_state=false flag
2. I was able to arrange a time when I could take down the whole cluster and bring it back up.
After that the "phantom" node disappeared.
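For reference, this is how the flag can be passed, assuming the stock conf/cassandra-env.sh startup script (the property behind the flag is cassandra.load_ring_state, which just stops the node from loading its saved ring state at startup):

    # conf/cassandra-env.sh -- assumes the stock startup script
    JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"

On its own the flag wasn't enough, since the running nodes gossip the stale endpoint straight back; that's why the full cluster stop was needed as well.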
On Fri, May 27, 2011 at 12:48 AM, wrote:
Hi Aaron - Thanks a lot for the great feedback. I'll try your suggestion of
removing it as an endpoint with JMX.
On , aaron morton wrote:
Off the top of my head the simple way to stop invalid endpoint state being
passed around is a full cluster stop. Obviously that's not an option. The
problem is that if one node has the IP it will share it around with the others.
Out of interest, take a look at the o.a.c.db.FailureDetector MBean
getA
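If you'd rather script that check than click through jconsole, a minimal JMX sketch is below. The ObjectName (org.apache.cassandra.net:type=FailureDetector), the AllEndpointStates attribute, and the port are assumptions here, not confirmed against your version - verify them with jconsole first.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch: dump one node's failure-detector view of endpoint state, so you can
// see whether the decommissioned IP is still lingering in gossip.
public class DumpEndpointStates {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        String port = args.length > 1 ? args[1] : "7199"; // assumed JMX port; older versions differ

        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url, null);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Assumed MBean name/attribute - check with jconsole for your release.
            ObjectName fd = new ObjectName("org.apache.cassandra.net:type=FailureDetector");
            String states = (String) mbs.getAttribute(fd, "AllEndpointStates");
            System.out.println(states); // look for the decommissioned IP in the output
        } finally {
            connector.close();
        }
    }
}
```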
@Aaron -
Unfortunately I'm still seeing messages like " is down", removing from gossip,
although not with the same frequency.
And repair/move jobs don't seem to try to stream data to the removed node
anymore.
Anyone know how to totally purge any stored gossip/endpoint data on nodes that
we
Cool. I was going to suggest that, but as you already had the move running I
thought it may be a little drastic.
Did it show any progress? If the IP address is not responding there should
have been some sort of error.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
Seems like it had something to do with stale endpoint information. I did a
rolling restart of the whole cluster and that seemed to trigger the nodes
to remove the node that was decommissioned.
On , aaron morton wrote:
Is it showing progress? It may just be a problem with the information printed
out.
Can you check from the other nodes in the cluster to see if they are receiving
the stream?
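Depending on the version, something like the following run against the receiving node should show whether anything is actually arriving (the subcommand name varies by release, so treat these as suggestions):

    nodetool -h <receiving-node> netstats
    # or, on older releases:
    nodetool -h <receiving-node> streams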
cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com