Re: improving read performance

2010-09-21 Thread Zhu Han
> Reasons to not use the row cache with large rows include:
>
> * In general it's a waste of memory better given to the OS page cache,
>   unless possibly you're continually reading entire rows rather than
>   subsets of rows.
>
> * For truly large rows you may have immediate issues with the size of

Re: improving read performance

2010-09-20 Thread Mohamed Ibrahim
Just in case someone uses the equations on that page, there is a small mathematical mistake. The exponent is missing a negative sign, so the error rate is: (1 - exp(-kn/m))^k. Mohamed On Mon, Sep 20, 2010 at 3:04 PM, Peter Schuller wrote: > > Actually, the points you make are things I have over
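
For anyone who wants to plug numbers into the corrected expression (1 - exp(-kn/m))^k, here is a minimal Python sketch. The thread does not define the symbols, so this assumes the usual bloom filter convention: k is the number of hash functions, n the number of keys inserted, and m the number of bits in the filter. The function name is made up for illustration and is not part of Cassandra.

```python
import math


def bloom_false_positive_rate(k: int, n: int, m: int) -> float:
    """Approximate bloom filter false-positive rate: (1 - exp(-k*n/m))^k.

    Assumed meaning of the symbols (not stated in the thread):
      k = number of hash functions
      n = number of keys inserted
      m = number of bits in the filter
    """
    if k <= 0 or n < 0 or m <= 0:
        raise ValueError("k and m must be positive, n must be non-negative")
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    # Hypothetical sizing: 1,000,000 keys, 10 bits per key, 7 hash functions.
    rate = bloom_false_positive_rate(k=7, n=1_000_000, m=10_000_000)
    print(f"estimated false-positive rate: {rate:.5f}")
```

With the negative sign dropped, exp(kn/m) is greater than 1 for any non-empty filter, so the expression no longer yields a value between 0 and 1, which is a quick way to spot the mistake.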

Re: improving read performance

2010-09-20 Thread Peter Schuller
> Actually, the points you make are things I have overlooked and actually make
> me feel more comfortable about how cassandra will perform for my use cases.
> I'm interested, in my case, to find out what the bloom filter
> false-positive rate is. Hopefully, a stat is kept on this.

Assuming lac

Re: improving read performance

2010-09-20 Thread Carl Bruecken
On 9/20/10 12:47 PM, Peter Schuller wrote: This drawback is unfortunate for systems that use time-based row keys. In such systems, row data will generally not be fragmented very much, if at all, but reads suffer because the assumption is that all data is fragmented. Even further, in a re

Re: improving read performance

2010-09-20 Thread Peter Schuller
> This drawback is unfortunate for systems that use time-based row keys. In
> such systems, row data will generally not be fragmented very much, if at
> all, but reads suffer because the assumption is that all data is fragmented.
> Even further, in a real-time system where reads occur quickly

improving read performance

2010-09-20 Thread Carl Bruecken
The Cassandra FAQ answers the question as to why reads are slower than writes as follows: http://wiki.apache.org/cassandra/FAQ#reads_slower_writes This drawback is unfortunate for systems that use time-based row keys. In such systems, row data will generally not be fragmented very much,