There is some confusion in the ring about nodes leaving. Run nodetool ring 
on every node and see if they agree. Check the logs for any information 
about which node is sending the wrong message. 

Without knowing much more, you could try a rolling restart, but you may need a 
full restart if the ring state differs between nodes; see 
http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes
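
The "see if they agree" check can be sketched roughly as below. This is only an 
illustration, not Cassandra code. It assumes you have already collected each 
node's view of the ring by hand (for example from its nodetool ring output) 
into a set of strings. The class name RingViewCheck, the method 
nodesThatDisagree and the node/token values are all made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Cassandra: compares ring views that were
// collected manually from each node.
class RingViewCheck {

    /**
     * Returns the nodes whose ring view differs from the view reported by the
     * first node in the map. An empty result means every node agrees.
     */
    static Set<String> nodesThatDisagree(Map<String, Set<String>> viewsByNode) {
        Set<String> disagreeing = new TreeSet<>();
        Set<String> reference = null;
        for (Map.Entry<String, Set<String>> entry : viewsByNode.entrySet()) {
            if (reference == null) {
                reference = entry.getValue();      // first node's view is the baseline
            } else if (!reference.equals(entry.getValue())) {
                disagreeing.add(entry.getKey());   // this node sees a different ring
            }
        }
        return disagreeing;
    }

    public static void main(String[] args) {
        // Placeholder data: each entry is "address@token" as seen by that node.
        Map<String, Set<String>> views = new LinkedHashMap<>();
        views.put("node-a", new TreeSet<>(Arrays.asList("node-a@0", "node-b@100", "node-c@200")));
        views.put("node-b", new TreeSet<>(Arrays.asList("node-a@0", "node-b@100", "node-c@200")));
        // node-c still lists an extra entry, e.g. a token that was meant to be removed.
        views.put("node-c", new TreeSet<>(Arrays.asList("node-a@0", "node-b@100", "node-c@200", "old@300")));

        System.out.println("Nodes whose ring view differs: " + nodesThatDisagree(views));
    }
}
```

An empty result means every node reports the same ring. Any node that is listed 
has a view that differs from the first node's, which is the situation the 
troubleshooting page above covers.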

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 21/08/2011, at 5:38 AM, Anand Somani wrote:

> 0.7.4 / 3-node cluster / RF=3 / QUORUM read/write
> 
> I re-introduced a corrupted node, following the process listed on the 
> operations wiki for handling failures (thanks to folks on the mailing list 
> for helping me).
> I am still running a cleanup on one node at this point, but I noticed that 
> this same exception appears 10 to 12 times a minute on an existing node 
> (not the new one). I think it started around the removetoken.
> 
> How do I solve this? Should I just restart this node? Are there any other 
> cleanups/resets I need to do?
> 
> Thanks
> 
> 
> On Thu, Apr 28, 2011 at 2:26 AM, aaron morton <aa...@thelastpickle.com> wrote:
> I *think* that code is used when one node tells others via gossip it is 
> removing a token that is not its own. The node that receives this 
> information in gossip does some work and then replies to the first node 
> with a REPLICATION_FINISHED message. I assume the first node is the one 
> the error is happening on.
> 
> Have you been doing any moves, removes or additions of tokens/nodes?
> 
> Thanks
> Aaron
> 
> On 28 Apr 2011, at 08:39, Alexis Lê-Quôc wrote:
> 
> > Hi,
> >
> > I've been getting the following lately, every few seconds.
> >
> > 2011-04-27T20:21:18.299885+00:00 10.202.61.193 [MiscStage: 97] Error
> > in ThreadPoolExecutor
> > 2011-04-27T20:21:18.299885+00:00 10.202.61.193 java.lang.AssertionError
> > 2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193   at
> > org.apache.cassandra.service.StorageService.confirmReplication(StorageService.java:1872)
> > 2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193   at
> > org.apache.cassandra.streaming.ReplicationFinishedVerbHandler.doVerb(ReplicationFinishedVerbHandler.java:38)
> > 2011-04-27T20:21:18.300047+00:00 10.202.61.193 10.202.61.193   at
> > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
> > 2011-04-27T20:21:18.300047+00:00 10.202.61.193 10.202.61.193   at
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > 2011-04-27T20:21:18.300055+00:00 10.202.61.193 10.202.61.193   at
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > 2011-04-27T20:21:18.300055+00:00 10.202.61.193 10.202.61.193   at
> > java.lang.Thread.run(Thread.java:636)
> > 2011-04-27T20:21:18.300555+00:00 10.202.61.193 [MiscStage: 97] Fatal
> > exception in thread Thread[MiscStage:97,5,main]
> >
> > I see it coming from
> > 32 public class ReplicationFinishedVerbHandler implements IVerbHandler
> > 33 {
> > 34     private static Logger logger = LoggerFactory.getLogger(ReplicationFinishedVerbHandler.class);
> > 35
> > 36     public void doVerb(Message msg, String id)
> > 37     {
> > 38         StorageService.instance.confirmReplication(msg.getFrom());
> > 39         Message response = msg.getInternalReply(ArrayUtils.EMPTY_BYTE_ARRAY);
> > 40         if (logger.isDebugEnabled())
> > 41             logger.debug("Replying to " + id + "@" + msg.getFrom());
> > 42         MessagingService.instance().sendReply(response, id, msg.getFrom());
> > 43     }
> > 44 }
> >
> > Before I dig deeper in the code, has anybody dealt with this before?
> >
> > Thanks,
> >
> > --
> > Alexis Lê-Quôc
> 
> 
