Hi,
We're running Cassandra 2.0.10 (2 DCs, 3 nodes each, RF=3 in each DC). During
one of the weekly repairs, we received the following error:
ERROR [ValidationExecutor:1280] 2015-07-12 22:18:10,992 Validator.java (line 242) Failed creating a merkle tree for [repair #d2178ba0-2902-11e5-bd95-f14c61d86b85 on dmds/curve_dates, (-1942303675502999131,-1890400428284965630]], / (see log for details)
ERROR [ValidationExecutor:1280] 2015-07-12 22:18:10,992 CassandraDaemon.java (line 199) Exception in thread Thread[ValidationExecutor:1280,1,main]
FSWriteError in /apps/data/cassandra/dmds/data/dmds/curve_dates/snapshots/d2178ba0-2902-11e5-bd95-f14c61d86b85
    at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:122)
    at org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:384)
    at org.apache.cassandra.db.Directories.clearSnapshot(Directories.java:488)
    at org.apache.cassandra.db.ColumnFamilyStore.clearSnapshot(ColumnFamilyStore.java:1877)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:811)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:398)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.file.DirectoryNotEmptyException: /apps/data/cassandra/dmds/data/dmds/curve_dates/snapshots/d2178ba0-2902-11e5-bd95-f14c61d86b85
    at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:242)
    at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
    at java.nio.file.Files.delete(Files.java:1126)
    at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:118)
    ... 10 more
ERROR [ValidationExecutor:1280] 2015-07-12 22:18:10,993 StorageService.java (line 364) Stopping gossiper
WARN [ValidationExecutor:1280] 2015-07-12 22:18:10,993 StorageService.java (line 278) Stopping gossip by operator request
INFO [ValidationExecutor:1280] 2015-07-12 22:18:10,993 Gossiper.java (line 1279) Announcing shutdown
Has anybody seen this error? The drives are local. Once this happened, the
other node performing the repair maxed out its CPU and the cluster became
unresponsive.
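
For what it's worth, the innermost exception in the trace is the one that
java.nio.file.Files.delete raises when asked to remove a directory that still
has entries in it. Below is a minimal standalone sketch that reproduces only
that JDK behaviour on a local filesystem. It is not Cassandra code, and the
class, method, and file names are made up for illustration. It does not show
why the snapshot directory was non-empty in our case.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteNonEmptyDir {

    // Hypothetical helper: creates a temp directory containing one file,
    // then tries to delete the directory itself. Returns true if the JDK
    // rejected the delete with DirectoryNotEmptyException.
    static boolean deleteFailsWhenNotEmpty() throws IOException {
        Path dir = Files.createTempDirectory("snapshot-demo");
        Path leftover = Files.createFile(dir.resolve("leftover.db"));
        try {
            Files.delete(dir);   // directory still holds leftover.db
            return false;        // delete unexpectedly succeeded
        } catch (DirectoryNotEmptyException e) {
            return true;         // same exception type as in the trace
        } finally {
            // Clean up: remove the file first, then the now-empty directory.
            Files.deleteIfExists(leftover);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("DirectoryNotEmptyException thrown: "
                + deleteFailsWhenNotEmpty());
    }
}
```

So anything that leaves a file in the snapshot directory between the moment
its contents are removed and the moment the directory itself is deleted would
be enough to produce this error, though I don't know what that would be here.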
Thanks,
dm