> I also don't understand if all these nodes are replicas of each other why is
> that the first node has almost double the data.

Have you performed any token moves? Old data is not deleted unless you run
nodetool cleanup.
Another possibility is things like a lot of hints. Admittedly it would have to
be a *lot* of hints.
The third is that compaction has fallen behind.

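To put a number on "almost double", here is a minimal Python sketch that compares each node's load with the cluster median. The load figures are copied from the ring output quoted in this thread; the node labels, the function names and the 1.5x threshold are my own assumptions for illustration, not anything Cassandra defines.

```python
from statistics import median

# Load per node in GB, copied from the ring output quoted in this thread.
# The "DC/rack" labels are just a convenient key for this sketch.
LOADS_GB = {
    "DC1/RAC13": 95.98,
    "DC2/RAC5": 50.79,
    "DC1/RAC18": 50.83,
    "DC2/RAC7": 50.74,
    "DC1/RAC19": 61.72,
    "DC2/RAC9": 50.83,
}


def load_ratios(loads):
    """Return each node's load divided by the cluster median, rounded to 2 places."""
    mid = median(loads.values())
    return {node: round(load / mid, 2) for node, load in loads.items()}


def outliers(loads, threshold=1.5):
    """Return nodes whose load exceeds `threshold` times the median.

    The default threshold is an arbitrary choice for this example.
    """
    return [node for node, ratio in load_ratios(loads).items() if ratio > threshold]


if __name__ == "__main__":
    for node, ratio in load_ratios(LOADS_GB).items():
        print(f"{node}: {ratio}x median")
    print("outliers:", outliers(LOADS_GB))
```

With these figures only the first node stands out, at roughly 1.9 times the median, which is the node to look at first for leftover data, hints or a compaction backlog.
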
> This week it's even worse, the nodetool repair has been running for the last
> 15 hours just on the first node and when I run nodetool compactionstats I
> constantly see this -
>
> pending tasks: 3

First check the logs for errors.

Repair will first calculate the differences; you can see this as a validation
compaction in nodetool compactionstats.
Then it will stream the data, which you can watch with nodetool netstats.

Try to work out which part is taking the most time. 15 hours for 50 GB sounds
like a long time (btw do you have compaction on?)

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/05/2012, at 3:14 AM, Raj N wrote:

> Hi experts,
>
> I have a 6 node cluster spread across 2 DCs.
>
> DC    Rack    Status  State    Load       Owns     Token
>                                                    113427455640312814857969558651062452225
> DC1   RAC13   Up      Normal   95.98 GB   33.33%   0
> DC2   RAC5    Up      Normal   50.79 GB   0.00%    1
> DC1   RAC18   Up      Normal   50.83 GB   33.33%   56713727820156407428984779325531226112
> DC2   RAC7    Up      Normal   50.74 GB   0.00%    56713727820156407428984779325531226113
> DC1   RAC19   Up      Normal   61.72 GB   33.33%   113427455640312814857969558651062452224
> DC2   RAC9    Up      Normal   50.83 GB   0.00%    113427455640312814857969558651062452225
>
> They are all replicas of each other. All reads and writes are done at
> LOCAL_QUORUM. We are on Cassandra 0.8.4. I see that our weekend nodetool
> repair runs for more than 12 hours, especially on the first one, which has
> 96 GB data. Is this usual? We are using 500 GB SAS drives with ext4 file
> system. This gets worse every week. This week it's even worse, the nodetool
> repair has been running for the last 15 hours just on the first node and
> when I run nodetool compactionstats I constantly see this -
>
> pending tasks: 3
>
> and nothing else. Looks like it's just stuck. There's nothing substantial in
> the logs as well. I also don't understand if all these nodes are replicas of
> each other why is that the first node has almost double the data. Any help
> will be really appreciated.
>
> Thanks
> -Raj
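
For working out which repair phase is taking the time, a small poller along these lines might help. It is only a sketch: it shells out to nodetool compactionstats and nodetool netstats, pulls out the "pending tasks: N" line shown in this thread, and checks for the word "validation" in the output. The helper names, the host, the polling interval and the assumption that a validation compaction shows up with that word in the text are all mine, so check them against what your version actually prints.

```python
import re
import subprocess
import time

# Matches the "pending tasks: 3" line quoted in this thread.
PENDING_RE = re.compile(r"pending tasks:\s*(\d+)")


def parse_pending_tasks(output):
    """Return the pending task count from compactionstats text, or None if absent."""
    match = PENDING_RE.search(output)
    return int(match.group(1)) if match else None


def has_validation(output):
    """Return True if any line mentions 'validation' (assumed marker, case-insensitive)."""
    return any("validation" in line.lower() for line in output.splitlines())


def run_nodetool(command, host="localhost"):
    """Run `nodetool -h <host> <command>` and return its stdout as text."""
    result = subprocess.run(
        ["nodetool", "-h", host, command],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def poll(host="localhost", interval_s=60, samples=10):
    """Print a timestamped snapshot of compaction and streaming state."""
    for _ in range(samples):
        stats = run_nodetool("compactionstats", host)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} pending={parse_pending_tasks(stats)} "
              f"validation={has_validation(stats)}")
        # Streaming progress is printed raw because its layout varies by version.
        print(run_nodetool("netstats", host))
        time.sleep(interval_s)


if __name__ == "__main__":
    poll()
```

If the pending count never moves and no validation line ever appears, the time is probably not going into the difference calculation, and the netstats output is the next place to look.
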