On 9/20/10 12:47 PM, Peter Schuller wrote:
This drawback is unfortunate for systems that use time-based row keys. In
such systems, row data will generally not be fragmented very much, if at
all, but reads suffer because the assumption is that all data is fragmented.
Even further, in a real-time system where reads occur quickly after
writes, if the data is in memory, the sstables are still checked.
Perhaps I am misunderstanding you, but why is this a problem (in the
particular case of time-based row keys), given the existence of the
bloom filters? They should eliminate the need to go down to the
sstables except for those that actually contain data for the row (in
almost all cases, subject to bloom filter false positives).
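As a rough sketch of the read path being described here (the names, such as
might_contain, are illustrative and not Cassandra's actual API): each sstable
carries a bloom filter, and a "no" answer from the filter is definitive, so
disk reads only happen for sstables that probably contain the row.

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: k hash positions over an m-bit field."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, key):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # "False" is definitive; "True" may be a false positive.
        return all(self.bits & (1 << pos) for pos in self._positions(key))

class SSTable:
    """Toy sstable: an in-memory dict standing in for on-disk data."""
    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)
        self.reads = 0  # counts simulated disk reads

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

def read_row(key, memtable, sstables):
    """Check the memtable, then only those sstables whose bloom
    filter says the key might be present."""
    fragments = []
    if key in memtable:
        fragments.append(memtable[key])
    for sst in sstables:
        if sst.bloom.might_contain(key):   # skip sstable on a definite "no"
            frag = sst.read(key)           # "disk" I/O only on a probable hit
            if frag is not None:
                fragments.append(frag)
    return fragments
```

So a lookup for a row that lives only in the memtable will, in almost all
cases, touch no sstable at all; the cost of the extra sstables is a bloom
filter probe each, not a disk read.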
Also, for the edge case where memtables have just been flushed, a
write-through row cache should help alleviate that. I forget offhand
whether the row cache is in fact write-through or not, though.
Hi
Actually, the points you make are things I had overlooked, and they
make me feel more comfortable about how Cassandra will perform for my
use cases. In my case, I'm interested to find out what the bloom
filter false-positive rate is; hopefully a stat is kept on this. As
long as ALL of the bloom filters are in memory, the hit from a false
positive should be minimal, since the index read will subsequently
reveal that the row is not in the corresponding SSTable.
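For a rough sense of what that false-positive rate looks like, the standard
bloom-filter estimate is p ≈ (1 - e^(-kn/m))^k for n keys, m bits, and k hash
functions. The parameters below (10 bits per key, 7 hashes) are just an
illustration, not Cassandra's actual configuration:

```python
import math

def bloom_fp_rate(n, m, k):
    """Standard estimate of bloom filter false-positive probability
    for n inserted keys, m bits of filter, and k hash functions."""
    if n == 0:
        return 0.0
    return (1.0 - math.exp(-k * n / m)) ** k

# Illustrative sizing: 10 bits per key, 7 hash functions.
rate = bloom_fp_rate(n=100_000, m=1_000_000, k=7)
print(f"{rate:.4f}")  # roughly 0.0082, i.e. under 1% of lookups
```

So at that sizing, well under one percent of lookups would pay for a
needless index read, which supports the point that the penalty is small
as long as the filters themselves stay in memory.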
Good point on the row cache. I had actually misread the comment in
the yaml, mistaking "do not use on ColumnFamilies with LARGE ROWS" for
"do not use on ColumnFamilies with a LARGE NUMBER OF ROWS". I don't
know if this will improve performance much, since I don't yet
understand whether it eliminates the need to check for the data in the
SSTables. If it doesn't, then what is the point of the row cache,
given that the data is also in an in-memory memtable?
That aside, splitting the memtable in two could make checking the
bloom filters unnecessary in most cases for me, but I'm not sure it's
worth the effort.