Do you see anything related to "merkle" tree in your log?

Also run nodetool compactionstats. While the merkle tree is being
calculated, you will see a validation entry listed there.
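
If the log is large, a small filter can help with the first check. This is only a rough sketch: the keywords ("merkle", "antientropy") are assumptions taken from this thread, not an official list of repair log markers, and the function name repair_lines is made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed keywords, taken from this thread; adjust to what your log contains.
KEYWORDS = ("merkle", "antientropy")


def repair_lines(lines: Iterable[str], keywords=KEYWORDS) -> List[str]:
    """Return the lines that mention any keyword, ignoring case."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    # Strip the trailing newline so the result prints cleanly.
    return [line.rstrip("\n") for line in lines if pattern.search(line)]
```

To use it, open your system.log (its location depends on your install) and pass the file object to repair_lines, then print the returned lines. If nothing comes back for the period in question, the node has probably not logged any repair activity.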


-Wei
----- Original Message -----
From: "Dane Miller" <d...@optimalsocial.com>
To: user@cassandra.apache.org
Sent: Wednesday, March 13, 2013 10:54:50 AM
Subject: repair hangs

Hi,

On one of my nodes, nodetool repair -pr has been running for 48 hours
and appears to be hung, with no output and no AntiEntropy messages in
system.log for 40+ hours.  Load, CPU, etc. are all near zero.  There
are no other repair jobs running in my cluster.

What's the recommended way to deal with a hung repair job?  Is it the
symptom of a larger problem?  More info follows...


On the node where the repair is running/hung, "nodetool tpstats" shows
1 Active and 1 Pending AntiEntropySessions.

"nodetool netstats" reports "Not sending any streams" and "Not receiving
any streams".

I created this cluster by copying and restoring snapshots from another
cluster.  The new cluster has the same number of nodes and same tokens
as the original.  However, the rack assignment is different: the new
cluster uses a single rack, the original cluster uses multiple racks.
The replication strategy is SimpleStrategy for all keyspaces.


Details:
6 node cluster
Cassandra 1.2.2
RandomPartitioner, EC2Snitch
Ubuntu 12.04 x86_64
EC2 m1.large


Thanks,
Dane
