On 01/11/2011 12:23 PM, Peter Schuller wrote:
>> Is this the intended implementation?  Is there any reason not to
>> just write the entire row to disk to allow for faster startup?
> 
> Intentional (in the sense of "not a mistake"), but see:
> 
>    https://issues.apache.org/jira/browse/CASSANDRA-1625
> 
> The reason your start-up took a lot of time is that reading in the
> values associated with the keys is entirely seek-bound (except in
> certain edge cases). Eliminating the need for seek-bound I/O to
> populate the row cache was the purpose of filing 1625.
> 

My reading of CASSANDRA-1625 is that the current proposal is to make the
"row cache" a CF and order CFs by hotness.   This sounds totally rad,
but it is not a near-term change.


> In practice, you do have to consider the expected start-up time when
> sizing your row cache.

But now I need two knobs:  "max size of row cache" (for the best
steady-state hit rate) and "number of row cache items to read in on
startup" (so that the ROW-READ-STAGE does not need to drop packets and
the node can be restarted in a reasonable amount of time).
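
To make the two knobs concrete, here is a minimal sketch in Python. It is
not Cassandra code: RowCache, warm_limit, read_row and the 10 ms seek
figure are all hypothetical names and numbers chosen for illustration. It
only shows a cache whose steady-state size and start-up load are limited
separately, plus a rough seek-bound estimate of warm-up time.

```python
from collections import OrderedDict


class RowCache:
    """Toy LRU row cache with two separate limits (hypothetical, not Cassandra's)."""

    def __init__(self, max_size, warm_limit):
        self.max_size = max_size                      # knob 1: steady-state capacity
        self.warm_limit = min(warm_limit, max_size)   # knob 2: rows loaded at start-up
        self.rows = OrderedDict()

    def put(self, key, row):
        # Insert or refresh the row, then evict least recently used entries.
        self.rows[key] = row
        self.rows.move_to_end(key)
        while len(self.rows) > self.max_size:
            self.rows.popitem(last=False)

    def warm(self, saved_keys, read_row):
        """Load at most warm_limit of the saved keys.

        saved_keys is assumed to be ordered hottest first; each read_row
        call stands in for one seek-bound disk read.
        """
        loaded = 0
        for key in saved_keys[: self.warm_limit]:
            self.put(key, read_row(key))
            loaded += 1
        return loaded


def estimated_warm_seconds(keys, seek_ms=10.0):
    # Assumes one seek per row and an assumed average seek time in milliseconds.
    return keys * seek_ms / 1000.0


cache = RowCache(max_size=1000, warm_limit=3)
loaded = cache.warm(list(range(10)), lambda k: {"key": k})
```

Under those assumptions, the warm-up cost scales with warm_limit rather
than with max_size, so the cache can be sized for hit rate while the
restart time is bounded independently.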

Choosing between a long period of dropped packets while the row cache
populates and a one-hour restart time per node is not fun.
