> > 86GB in commitlog and 42GB in data
>
> Whoa, that seems really wrong, particularly given your data spans 13 months.
> Have you changed any of the default cassandra.yaml setting? What is the
> maximum memtable_flush_after across all your CFs? Any warnings/errors in the
> Cassandra log?

It seems wrong to me too. It got so bad that /var/lib/cassandra looked like
this:
$ du -hs ./*
122G ./commitlog
55G ./data
17M ./saved_caches
I restarted cassandra, and it took a while to chew through all the commitlog
files, then disk utilization was like so:
$ du -hs ./*
1.1M ./commitlog
56G ./data
17M ./saved_caches
This is with only a couple of months of data, not 13 months.
Upon going through the cassandra logs, I saw a ton of "too many open files"
warnings:
WARN [Thread-4] 2011-08-30 12:07:27,601 CustomTThreadPoolServer.java (line 112) Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:68)
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:39)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:102)
        at org.apache.cassandra.thrift.CassandraDaemon$ThriftServer.run(CassandraDaemon.java:198)
Caused by: java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
        at java.net.ServerSocket.implAccept(ServerSocket.java:462)
        at java.net.ServerSocket.accept(ServerSocket.java:430)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:113)
I guess I should raise the limit on open files with ulimit. Does anyone have
a suggestion for how high? I was thinking ulimit -n 10000, but first I'm going
to try to reproduce the "too many open files" condition and then have a look
at lsof to see just how many files are really open.
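
For what it's worth, here is a rough sketch of how I might check the descriptor count against the limit without lsof. It assumes Linux with /proc mounted, and that it runs as root or as the same user as the Cassandra process; the function names are made up for illustration and are not part of Cassandra.

```python
import os


def parse_open_file_limits(limits_text):
    """Return (soft, hard) as strings from the 'Max open files' row
    of the text of /proc/<pid>/limits. Values may be 'unlimited'."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row looks like: Max open files  1024  4096  files
            soft, hard = line.split()[3:5]
            return soft, hard
    raise ValueError("no 'Max open files' row found")


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd, one per open descriptor
    (regular files, sockets, and pipes all count toward the limit)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid):
    """Summarize descriptor usage versus the limits for one process."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_open_file_limits(f.read())
    used = count_open_fds(pid)
    return f"pid {pid}: {used} open fds (soft limit {soft}, hard limit {hard})"
```

Calling report() with the Cassandra JVM's pid while the warnings are being logged should show whether the count is really up against the soft limit, which would give a better basis for picking a ulimit -n value than a guess.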
On a side note, why does cassandra seem to log to /var/log/cassandra.log no
matter what's in log4j.properties? I ended up having to link that to /dev/null
to keep from filling up my root partition with cassandra logs that I already
have elsewhere on another filesystem.
-Derek
