On 7/14/2010 7:16 AM, Peter Schuller wrote:
More than one fd can be open on a given file, and many of the open fd's
are on files that have been deleted.  The stale fd's are all on Data.db
files in the data directory, which I keep separate from the commit log
directory.

I haven't had a chance to look at the file-handling code, and I am no
Java expert, but could this be due to Java's lazy resource cleanup?
If you are considering writing your own file-handling classes for
O_DIRECT or posix_fadvise or whatever, I wonder whether an explicit
close(2) might help.
The fact that there are open fds to deleted files is interesting... I
wonder if people have reported weird disk space usage in the past
(since such deleted files would not show up with 'du -sh' but eat
space on the device until closed).
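
For anyone who wants to check a running node for this, here is a rough sketch of the same check lsof does, assuming Linux with /proc mounted. The function name deleted_open_files is made up for illustration and is not part of Cassandra or any library. It lists the descriptors of a process that still point at regular files whose link count has dropped to zero, i.e. files that have been unlinked but not closed.

```python
import os
import stat


def deleted_open_files(pid):
    """Return (fd, target) pairs for regular files that process `pid`
    holds open although they no longer have a directory entry.

    Hypothetical helper; Linux-only because it reads /proc/<pid>/fd.
    """
    fd_dir = f"/proc/{pid}/fd"
    found = []
    try:
        entries = os.listdir(fd_dir)
    except OSError:
        # No /proc, no such process, or not allowed to inspect it.
        return found
    for name in entries:
        link = os.path.join(fd_dir, name)
        try:
            info = os.stat(link)        # follows the link to the open file
            target = os.readlink(link)  # typically ends in " (deleted)"
        except OSError:
            continue  # descriptor was closed while we were scanning
        # A regular file with zero links has been unlinked but is still open.
        if stat.S_ISREG(info.st_mode) and info.st_nlink == 0:
            found.append((int(name), target))
    return found
```

Calling deleted_open_files(pid) with the pid of the Cassandra JVM (as the same user or as root) should give roughly what "lsof -p <pid> | grep deleted" shows. The space held by each entry is only returned to the filesystem once the owning process closes the descriptor or exits.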

My general understanding is that Cassandra does specifically rely on
the GC to know when unused sstables can be removed. However, the fact
that the files are deleted means, I think, that this is not the
problem, and the question is rather why open file descriptors/streams
to these deleted sstables are leaking. But I'm speaking now without
knowing when/where streams are closed.
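
To make the distinction concrete, here is a small sketch of the underlying POSIX behaviour, not of anything Cassandra does. The function name unlink_while_open and the "-Data.db" suffix are just for illustration. It unlinks a file while a descriptor is still open and shows that the name disappears but the data stays reachable (and keeps its blocks) until close().

```python
import os
import tempfile


def unlink_while_open(payload=b"x" * 4096):
    """Unlink a temp file while it is still open and report what the
    open descriptor still sees. Assumes POSIX unlink semantics.

    Returns (name_gone, link_count, size, still_readable).
    """
    fd, path = tempfile.mkstemp(suffix="-Data.db")
    try:
        os.write(fd, payload)
        os.unlink(path)                      # remove the directory entry
        name_gone = not os.path.exists(path)
        info = os.fstat(fd)                  # the inode is still alive
        os.lseek(fd, 0, os.SEEK_SET)
        still_readable = os.read(fd, len(payload)) == payload
        return name_gone, info.st_nlink, info.st_size, still_readable
    finally:
        # Only this close lets the filesystem reclaim the blocks.
        os.close(fd)
```

On a typical Linux filesystem this returns (True, 0, 4096, True). So a deleted sstable with a leaked stream looks exactly like this: gone from the directory, still on disk, and only reclaimed when whoever holds the descriptor closes it explicitly or is finalized.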

Are the deleted files indeed sstables, or was that a bad assumption on my part?

As a Cassandra newbie, I'm not sure how to tell, but they all point
to *Data.db files, and all are under the DataFileDirectory (as spec'ed
in storage-conf.xml), which is a separate directory from the
CommitLogDirectory.  I did not see any *Index.db or *Filter.db
files, but I may have missed them.
