Re: getting status of long running repair

2012-05-09 Thread Vijay
Are you using Broadcast Address? If yes, then you might be affected by https://issues.apache.org/jira/browse/CASSANDRA-3503 >>> Nodes are all up while repairing is running. I should have been clearer: are you seeing the following messages in the logs (UP/DOWN) during the period of the repair... INFO [

Re: getting status of long running repair

2012-05-09 Thread Bill Au
I am running 1.0.8. Two data centers with 8 machines in each DC. Nodes are all up while repairing is running. No dropped Mutations/Messages. I do see HintedHandoff messages. Bill On Tue, May 8, 2012 at 11:15 PM, Vijay wrote: > What is the version you are using? is it Multi DC setup? Are you

Re: getting status of long running repair

2012-05-08 Thread Vijay
What is the version you are using? Is it a multi-DC setup? Are you seeing a lot of dropped Mutations/Messages? Are the nodes going up and down all the time while the repair is running? Regards, On Tue, May 8, 2012 at 2:05 PM, Bill Au wrote: > There are no error messages in my log. > > I ended u

Re: getting status of long running repair

2012-05-08 Thread Bill Au
There are no error messages in my log. I ended up restarting all the nodes in my cluster. After that I was able to run repair successfully on one of the nodes. It took about 40 minutes. Feeling lucky, I ran repair on another node and it is stuck again. tpstats shows 1 active and 1 pending AntiEntro

Re: getting status of long running repair

2012-05-08 Thread aaron morton
When you look in the logs please let me know if you see this error… https://issues.apache.org/jira/browse/CASSANDRA-4223 I look at nodetool compactionstats (for the Merkle tree phase), nodetool netstats for the streaming, and this to check for streaming progress: while true; do date; diff <(nod
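
A minimal sketch of the monitoring approach described in the message above: print a timestamped snapshot of `nodetool compactionstats` (Merkle tree phase) and `nodetool netstats` (streaming phase) at a fixed interval. This is not the truncated `diff` one-liner from the message, which is cut off in the archive. The names `NODETOOL`, `HOST`, `repair_snapshot`, and `watch_repair` are illustrative assumptions, not from the thread.

```shell
#!/usr/bin/env bash
# Sketch only: NODETOOL and HOST are hypothetical variables for illustration.
NODETOOL="${NODETOOL:-nodetool}"
HOST="${HOST:-localhost}"

# Print one timestamped snapshot of the two views mentioned in the thread.
repair_snapshot() {
    date
    # Merkle tree (validation) phase: reported under compactionstats.
    "$NODETOOL" -h "$HOST" compactionstats
    # Streaming phase: reported under netstats.
    "$NODETOOL" -h "$HOST" netstats
}

# Repeat the snapshot every N seconds (default 60) until interrupted.
watch_repair() {
    local interval="${1:-60}"
    while true; do
        repair_snapshot
        sleep "$interval"
    done
}
```

For example, `watch_repair 30` would print a snapshot every 30 seconds. If neither view changes over several snapshots, the repair is probably not making progress.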

Re: getting status of long running repair

2012-05-07 Thread Ben Coverston
Check the log files for warnings or errors. They may indicate why your repair failed. On Mon, May 7, 2012 at 10:09 AM, Bill Au wrote: > I restarted the nodes and then restarted the repair. It is still hanging > like before. Do I keep repeating until the repair actually finishes? > > Bill > > > O

Re: getting status of long running repair

2012-05-07 Thread Bill Au
I restarted the nodes and then restarted the repair. It is still hanging like before. Do I keep repeating until the repair actually finishes? Bill On Fri, May 4, 2012 at 2:18 PM, Rob Coli wrote: > On Fri, May 4, 2012 at 10:30 AM, Bill Au wrote: > > I know repair may take a long time to run. I

Re: getting status of long running repair

2012-05-04 Thread Rob Coli
On Fri, May 4, 2012 at 10:30 AM, Bill Au wrote: > I know repair may take a long time to run.  I am running repair on a node > with about 15 GB of data and it is taking more than 24 hours.  Is that > normal?  Is there any way to get status of the repair?  tpstats does show 2 > active and 2 pending

getting status of long running repair

2012-05-04 Thread Bill Au
I know repair may take a long time to run. I am running repair on a node with about 15 GB of data and it is taking more than 24 hours. Is that normal? Is there any way to get status of the repair? tpstats does show 2 active and 2 pending AntiEntropySessions. But netstats and compactionstats sh
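
The AntiEntropySessions counts mentioned above can be pulled out of `nodetool tpstats` for repeated checking. This is a rough sketch. It assumes the pool name is the first column of the tpstats output, followed by the Active and Pending counts, and the helper name `repair_sessions` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: reads tpstats output on stdin and prints "active pending"
# for the AntiEntropySessions pool. Assumes columns are: name, Active, Pending, ...
repair_sessions() {
    awk '$1 == "AntiEntropySessions" { print $2, $3 }'
}

# Usage (not executed here):
#   nodetool -h localhost tpstats | repair_sessions
```

Counts that stay non-zero for hours while netstats and compactionstats show no activity match the stuck state described in this thread.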