We are running a 6-node AWS EC2 (m1.large) cluster of Cassandra 1.2.9 across three availability zones, using Ec2Snitch and NetworkTopologyStrategy.
One of our nodes was apparently sharing a physical box with another customer who was hogging the I/O, so we needed to bring the node up on a new EC2 instance. We decommissioned the offending node, killed the instance and brought a new instance into the cluster. Everything went fine up to that point.

After the new node came up, I ran nodetool repair -pr on each node in the cluster, one node at a time. When it got to the repair on the new node, the gossip service shut down. This happened three times. A copy of the stack trace we received is at the bottom of this email.

It says it couldn't create a backups directory. I have no idea why this would be: the /data-1 partition is 400 GB in size and currently 1% utilized.

Does anyone have any idea what could be causing this?

My /etc/security/limits.conf file currently has:

    # resource settings added based on
    # http://www.datastax.com/docs/1.2/install/recommended_settings
    * soft nofile 65536
    * hard nofile 65536
    root soft nofile 65536
    root hard nofile 65536
    * soft memlock unlimited
    * hard memlock unlimited
    root soft memlock unlimited
    root hard memlock unlimited
    * soft as unlimited
    * hard as unlimited
    root soft as unlimited
    root hard as unlimited

ERROR 2013-12-02 21:02:25,711 [Thread-3050] CassandraDaemon Exception in thread Thread[Thread-3050,5,main]
FSWriteError in /data-1/cassandra/data/SinglewireSupport/Binaries/backups
    at org.apache.cassandra.db.Directories.getOrCreate(Directories.java:483)
    at org.apache.cassandra.db.Directories.getBackupsDirectory(Directories.java:242)
    at org.apache.cassandra.db.DataTracker.maybeIncrementallyBackup(DataTracker.java:165)
    at org.apache.cassandra.db.DataTracker.addSSTables(DataTracker.java:237)
    at org.apache.cassandra.db.ColumnFamilyStore.addSSTables(ColumnFamilyStore.java:911)
    at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:186)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:138)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
    at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
Caused by: java.io.IOException: Unable to create directory /data-1/cassandra/data/SinglewireSupport/Binaries/backups

--
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
------------------
608.661.1184
john.pye...@singlewire.com