On 17.11.2011 17:42, Dan Hendry wrote:
What do you mean by ' better file offset caching'? Presumably you mean
'better page cache hit rate'?
Filesystem metadata used to locate blocks is cached better for smaller files. Large files use indirect blocks, so more reads are needed to find the correct block after a seek. For example, if a large file uses 3 levels of indirection, you need 3 disk seeks to find the correct block. http://computer-forensics.sans.org/blog/2008/12/24/understanding-indirect-blocks-in-unix-file-systems/ Metadata caching in the OS is far worse than file caching: a single "find /" will effectively wipe out the metadata cache.
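To make the indirect-block arithmetic concrete, here is a rough sketch. It assumes a classic ext2/UFS-style inode with 12 direct pointers, 4 KiB blocks and 4-byte block pointers. Those numbers and the function name are my assumptions for illustration; extent-based filesystems such as ext4 map blocks differently.

```python
def indirect_levels(offset, block_size=4096, ptr_size=4, direct=12):
    """Return how many indirect blocks must be read to locate the data
    block holding byte `offset` (0 = reachable via a direct pointer).

    Layout is an assumed ext2/UFS-style inode: `direct` direct pointers,
    then one single-, one double- and one triple-indirect pointer.
    """
    ptrs_per_block = block_size // ptr_size
    idx = offset // block_size  # logical block number within the file

    if idx < direct:
        return 0
    idx -= direct

    # Each further level addresses ptrs_per_block times more blocks.
    capacity = ptrs_per_block
    for level in (1, 2, 3):
        if idx < capacity:
            return level
        idx -= capacity
        capacity *= ptrs_per_block

    raise ValueError("offset beyond triple-indirect range")


if __name__ == "__main__":
    for gib in (1, 5):
        print(gib, "GiB offset ->", indirect_levels(gib * 2**30), "levels")
```

With these assumed parameters, a read at the 1 GiB offset goes through 2 indirect blocks and a read at the 5 GiB offset goes through 3, each a potential disk seek if the metadata is not cached.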

If Cassandra could use raw storage, it would eliminate filesystem overhead and could be over 100% faster on reads, because fragmentation would be the exception. There would be no need for a design like FAT or UFS, whose designers expect files to be stored in non-contiguous areas on disk. Implementing something log-based, like http://logfs.sourceforge.net/, would be enough. Cleaning would rarely be needed because compaction would clean it up naturally.

Perhaps what you are actually seeing is row fragmentation across your
SSTables? Easy to check with nodetool cfhistograms (SSTables column).
About 1.5% of my reads hit 2 sstables and 3% hit 3 sstables. That is pretty low with min. compaction set to 5; I will probably set it to 6.

I would really like to see tests with user-defined sizes and file counts for tiered compaction, because it works best if you do not leave the largest file alone in its bucket. If your data in Cassandra is not growing, it can be tuned more finely. I haven't done experiments with it, but maybe a max sstable size defined per CF would be enough. Let's say I have 5 GB of data per CF: the ideal setting would be a max sstable size of slightly less than 1 GB. Cassandra would then not keep old data stuck in one 4 GB compacted sstable, waiting for other 4 GB sstables to be created before compaction removes the old data.
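A simplified model of the bucketing problem, not Cassandra's actual code: sstables are grouped with others of similar size (here within 0.5x to 1.5x of the bucket average, an assumed range), and a bucket is only compacted once it holds at least a minimum number of files. The function names and the sample sizes are made up for illustration.

```python
def bucket_by_size(sizes_mb, low=0.5, high=1.5):
    """Group sstable sizes into buckets of similar size.

    A size joins the first bucket whose current average it is within
    [low * avg, high * avg] of; otherwise it starts a new bucket.
    """
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compactable(buckets, min_threshold=4):
    """Return only the buckets that hold enough sstables to be compacted."""
    return [b for b in buckets if len(b) >= min_threshold]


if __name__ == "__main__":
    # Hypothetical CF: one big 4 GB sstable plus a few small ones.
    sizes = [4096, 250, 260, 240, 255, 1000]
    buckets = bucket_by_size(sizes)
    print("buckets:    ", buckets)
    print("compactable:", compactable(buckets, min_threshold=4))
```

In this model the 4096 MB sstable sits alone in its bucket, so whatever obsolete data it holds stays on disk until enough similarly sized sstables appear. Capping the sstable size below 1 GB would keep all files in buckets that fill up regularly.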

To answer your question, I know of no tools to split SSTables. If you want
to switch compaction strategies, levelled compaction (1.0.x) creates many
smaller sstables instead of fewer, bigger ones.
I don't use levelled compaction; it compacts too often. It might get better if the number and size of files at each level could be tuned. But I will try switching to levelled compaction and back again; it might do what I want.
