Hi there,

Strangely, the 'xxx-tmp-xxx.db' file kept growing until Cassandra threw
exceptions with 'No space left on device'. I am using CQL 3 to create a
table that stores records of roughly 200K to 500K each. I have 6 hard
disks per node, and Cassandra is configured with 6 data directories
(ext4 file systems, CentOS 6.5):

data_file_directories:
>     - /data1/cass
>     - /data2/cass
>     - /data3/cass
>     - /data4/cass
>     - /data5/cass
>     - /data6/cass
>

Each directory is on its own disk. But this is what I found when the
error occurred:

[root@node5 images]# ll -hl
> total 3.6T
> drwxr-xr-x 4 root root 4.0K Jan 20 09:44 snapshots
> -rw-r--r-- 1 root root 456M Apr 30 13:42 mydb-images-tmp-jb-91068-CompressionInfo.db
> -rw-r--r-- 1 root root 3.5T Apr 30 13:42 mydb-images-tmp-jb-91068-Data.db
> -rw-r--r-- 1 root root    0 Apr 30 13:42 mydb-images-tmp-jb-91068-Filter.db
> -rw-r--r-- 1 root root 2.0G Apr 30 13:42 mydb-images-tmp-jb-91068-Index.db
>

[root@node5 images]# df -hl
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        49G  7.5G   39G  17% /
tmpfs           7.8G     0  7.8G   0% /dev/shm
/dev/sda3       3.6T  1.3T  2.1T  38% /data1
/dev/sdb1       3.6T  1.4T  2.1T  39% /data2
/dev/sdc1       3.6T  466G  3.0T  14% /data3
/dev/sdd1       3.6T  1.3T  2.2T  38% /data4
/dev/sde1       3.6T  1.3T  2.2T  38% /data5
/dev/sdf1       3.6T  3.6T     0 100% /data6

mydb-images-tmp-jb-91068-Data.db occupied almost all of the disk space (a 4T
hard disk with 3.6T actually usable), and the error looks like this:

INFO [FlushWriter:4174] 2014-05-04 05:15:15,744 Memtable.java (line 403)
> Completed flushing
> /data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16942-Data.db
> (42 bytes) for commitlog position ReplayPosition(segmentId=1398900356204,
> position=25024609)
>  INFO [CompactionExecutor:3689] 2014-05-04 05:15:15,745
> CompactionTask.java (line 115) Compacting
> [SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16940-Data.db'),
> SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16942-Data.db'),
> SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16941-Data.db'),
> SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16939-Data.db')]
> ERROR [CompactionExecutor:1245] 2014-05-04 05:15:15,745
> CassandraDaemon.java (line 198) Exception in thread
> Thread[CompactionExecutor:1245,1,main]
> FSWriteError in /data2/cass/mydb/images/mydb-images-tmp-jb-92181-Filter.db
>         at
> org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:475)
>         at
> org.apache.cassandra.io.util.FileUtils.closeQuietly(FileUtils.java:212)
>         at
> org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:301)
>         at
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:209)
>         at
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>         at
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>         at
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: No space left on device
>         at java.io.FileOutputStream.write(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:295)
>         at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
>         at
> org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilterSerializer.java:34)
>         at
> org.apache.cassandra.utils.Murmur3BloomFilter$Murmur3BloomFilterSerializer.serialize(Murmur3BloomFilter.java:44)
>         at
> org.apache.cassandra.utils.FilterFactory.serialize(FilterFactory.java:41)
>         at
> org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:468)
>         ... 13 more
> ERROR [CompactionExecutor:1245] 2014-05-04 05:15:15,800
> StorageService.java (line 367) Stopping gossiper
>  WARN [CompactionExecutor:1245] 2014-05-04 05:15:15,800
> StorageService.java (line 281) Stopping gossip by operator request
>  INFO [CompactionExecutor:1245] 2014-05-04 05:15:15,800 Gossiper.java
> (line 1271) Announcing shutdown
>


I have changed my table to "LeveledCompactionStrategy" to reduce the disk
space needed during compaction, with:

ALTER TABLE images WITH compaction = { 'class' :
> 'LeveledCompactionStrategy', 'sstable_size_in_mb' : '192' };
>

But the problem still exists: the file keeps growing, and after about 2
or 3 days Cassandra fails with the 'No space left on device' error. If I
restart the node or run 'cleanup', it returns to normal.
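
For anyone who wants to watch the growth before the disk fills up, here is a
rough Python sketch. It is only an illustration, not a Cassandra tool: the
function names are made up, the directory list is taken from the
data_file_directories setting above, and it assumes that in-progress
SSTables can be recognised by '-tmp-' in the file name, as in the listing
above.

```python
import os
import shutil

# Assumption: same layout as the data_file_directories shown above.
DATA_DIRS = ["/data%d/cass" % i for i in range(1, 7)]


def find_tmp_sstables(data_dirs):
    """Return (path, size_in_bytes) for each '-tmp-' .db file, largest first."""
    found = []
    for root_dir in data_dirs:
        # os.walk silently yields nothing for a missing directory.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                if "-tmp-" in name and name.endswith(".db"):
                    path = os.path.join(dirpath, name)
                    try:
                        found.append((path, os.path.getsize(path)))
                    except OSError:
                        # The file can vanish when a compaction finishes.
                        continue
    return sorted(found, key=lambda item: item[1], reverse=True)


def report(data_dirs):
    """Print disk usage per data directory, then every temporary SSTable file."""
    for d in data_dirs:
        if os.path.isdir(d):
            usage = shutil.disk_usage(d)
            print("%s: %.1f%% used" % (d, 100.0 * usage.used / usage.total))
    for path, size in find_tmp_sstables(data_dirs):
        print("%10.1f GiB  %s" % (size / 2.0 ** 30, path))


if __name__ == "__main__":
    report(DATA_DIRS)
```

Running it from cron every few minutes and keeping the output would show
which temporary file is growing and on which disk.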

I don't know whether this is caused by my configuration or is just a bug,
so could anyone please help me solve this issue?

Thanks
