Thanks for the reply. The snapshots are only 400MB. Also, the disk usage
reported on the node (df -h) should already include the snapshot size.
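
For anyone who wants to double-check the snapshot figure on their own nodes, here is a rough sketch of how it could be totalled. It assumes the usual <keyspace>/<table>/snapshots/<name>/ layout under the data directory; the function name snapshot_bytes and the /var/lib/cassandra/data path in the usage comment are only illustrative and depend on the install. Snapshots are hard links to SSTables, so this total can overstate the extra space actually consumed on disk.

```python
import os


def snapshot_bytes(data_dir):
    """Sum the sizes of all files that sit below a 'snapshots' directory.

    Assumes the <keyspace>/<table>/snapshots/<name>/ layout; files that are
    hard links to live SSTables are still counted at their full size.
    """
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        # Only count files whose path (relative to data_dir) passes
        # through a directory literally named "snapshots".
        parts = os.path.relpath(root, data_dir).split(os.sep)
        if "snapshots" not in parts:
            continue
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # A file may disappear while we walk (e.g. cleared snapshot).
                pass
    return total


# Example usage (path is an assumption, adjust to your data_file_directories):
# print(snapshot_bytes("/var/lib/cassandra/data") / 1024 ** 2, "MiB")
```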

I did notice that the spike in nodetool status load seems to coincide with
the hourly operation of "IndexSummaryManager.java:256 - Redistributing
index summaries". Any correlation here?
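
To test whether the timing really lines up, something like the sketch below could pull the redistribution timestamps out of the Cassandra log and compare them with the times at which a Load spike was observed. It assumes log lines carry a "YYYY-MM-DD HH:MM:SS,mmm" timestamp as in the repair error quoted below; the helper names and the 5-minute window are arbitrary choices for illustration, not anything Cassandra provides.

```python
import re
from datetime import datetime

# Timestamp as it appears in the log lines, e.g. "2016-04-12 15:46:29,902".
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")


def redistribution_times(log_lines):
    """Return the timestamps of all 'Redistributing index summaries' lines."""
    times = []
    for line in log_lines:
        if "Redistributing index summaries" not in line:
            continue
        match = STAMP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def near_any(when, events, window_seconds=300):
    """True if 'when' falls within window_seconds of any event time."""
    return any(abs((when - e).total_seconds()) <= window_seconds for e in events)
```

Feeding it the lines of the system log and then calling near_any() with each time a spike was seen in nodetool status would show how many spikes fall close to a redistribution run. A match in time would only suggest a correlation, not a cause.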

Also, last night's periodic "nodetool repair -pr" run succeeded on only 2
of the 6 nodes.

On Fri, Apr 15, 2016 at 12:28 AM, Jan Kesten <j.kes...@enercast.de> wrote:

> Hi,
>
> you should check the "snapshot" directories on your nodes - it is very
> likely there are some old ones from failed operations taking up some space.
>
>
> On 15.04.2016 at 01:28, kavya wrote:
>
>> Hi,
>>
>> We are running a 6-node Cassandra 2.2.4 cluster and we are seeing a spike
>> in the Load reported by 'nodetool status' that does not correspond to the
>> actual disk usage. The Load reported by nodetool was as high as 3 times
>> the actual disk usage on certain nodes.
>> We also noticed that the periodic repair, run with 'nodetool repair -pr',
>> failed with the error below:
>>
>> ERROR [RepairJobTask:2] 2016-04-12 15:46:29,902 RepairRunnable.java:243 -
>> Repair session 64b54d50-0100-11e6-b46e-a511fd37b526 for range
>> (-3814318684016904396,-3810689996127667017] failed with error [….]
>> Validation failed in /<ip>
>> org.apache.cassandra.exceptions.RepairException: [….] Validation failed
>> in <ip>
>>     at
>> org.apache.cassandra.repair.ValidationTask.treeReceived(ValidationTask.java:64)
>> ~[apache-cassandra-2.2.4.jar:2.2.4]
>>     at
>> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183)
>> ~[apache-cassandra-2.2.4.jar:2.2.4]
>>     at
>> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:410)
>> ~[apache-cassandra-2.2.4.jar:2.2.4]
>>     at
>> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:163)
>> ~[apache-cassandra-2.2.4.jar:2.2.4]
>>     at 
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
>> ~[apache-cassandra-2.2.4.jar:2.2.4]
>>     at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> [na:1.8.0_40]
>>     at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> [na:1.8.0_40]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
>>
>> We restarted all nodes in the cluster and ran a full repair, which
>> completed successfully without any validation errors. However, we still
>> see the Load spike on the same nodes after a while. Please advise.
>>
>> Thanks!
>>
>>
>
