Aaron,

> What version are you on?

1.1.5

> Do you know how many rows were loaded?

INFO [OptionalTasks:1] 2012-11-19 13:08:58,868 ColumnFamilyStore.java (line 451) completed loading (5175655 ms; 13259976 keys) row cache

> In both cases I do not believe the cache is stored in token (or key) order.

Am I getting this right: the row keys are read, and the rows are retrieved from the SSTables in whatever order their keys appear in the saved cache file?

Would something like iterating over the SSTables instead, and throwing the rows that need to be cached at the cache, be feasible? If the SSTables themselves are written sequentially at compaction time (which is how I remember they are written), SSTable-sized sequential reads with a filter (a bloom filter for the row cache? :-) ) must be faster than reading from all across the column family (I have HDDs and about 1k SSTables).

> row_cache_keys_to_save in yaml may help you find a happy halfway point.

If I can keep that high enough, then given my data retention requirements I can operate entirely out of memory, save for the absolute first get on a row.

thanks!
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84

On 19 Nov 2012, at 22:00, aaron morton <aa...@thelastpickle.com> wrote:

> I was just wondering if anyone else is experiencing very slow (~3.5 MB/sec) re-fill of the row cache at start up.

It was mentioned the other day.

What version are you on? Do you know how many rows were loaded? When complete it will log a message with the pattern "completed loading (%d ms; %d keys) row cache for %s.%s".

> How is the "saved row cache file" processed?

In version 1.1, after the SSTables have been opened, the keys in the saved row cache are read one at a time and the whole row is read into memory. This is a single-threaded operation.
In 1.2, reading the saved cache is still single-threaded, but reading the rows goes through the read thread pool, so it happens in parallel.

In both cases I do not believe the cache is stored in token (or key) order.

> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )

row_cache_keys_to_save in the yaml may help you find a happy halfway point.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/11/2012, at 3:17 AM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:

Hey list,

I was just wondering if anyone else is experiencing very slow (~3.5 MB/sec) re-fill of the row cache at start up. We operate with a large row cache (10-15GB currently) and we already measure startup times in hours :-)

How is the "saved row cache file" processed? Are the cached row keys simply iterated over and their respective rows read from the SSTables - possibly creating random reads with small enough SSTable files, if the keys were not stored in a manner optimised for a quick re-fill? Or is there a smarter algorithm at work (i.e. scan through one SSTable at a time and filter for the rows that should be in the row cache), making the operation purely disk-I/O bound?

(Admittedly, whatever is going on is still much more preferable to starting with a cold row cache.)

thanks!
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84
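The per-key refill path described in the thread (1.1 style: read each saved key, then fetch that whole row) can be sketched roughly as below. All names here (refill_row_cache, read_row, the dict standing in for SSTable lookups) are hypothetical illustrations, not Cassandra's actual internals:

```python
# Minimal sketch of the 1.1-style row cache refill discussed above.
# Names are hypothetical, not Cassandra's real classes or methods.

def refill_row_cache(saved_keys, read_row, cache):
    """Single-threaded refill: for each saved key, read the whole row.

    Keys are processed in the order they appear in the saved cache
    file; since that order is unrelated to on-disk (token) order,
    each read_row() call is effectively a random read somewhere in
    the column family's SSTables.
    """
    for key in saved_keys:
        row = read_row(key)      # roughly one random read per key
        if row is not None:
            cache[key] = row
    return cache

# Tiny in-memory stand-in for the on-disk row lookup:
table = {b"k1": "row1", b"k2": "row2"}
cache = refill_row_cache([b"k2", b"k1"], table.get, {})
```

With HDDs, the seek per key is what keeps the observed refill rate low regardless of sequential disk bandwidth; 1.2's change parallelises the reads but does not reorder them.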
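The alternative Andras proposes (scan each SSTable sequentially and filter against the set of saved keys) could look roughly like this sketch; again the names and the list-of-pairs SSTable representation are assumptions for illustration, not Cassandra code:

```python
# Sketch of the proposed sequential refill: instead of one random
# read per cached key, scan every SSTable front to back and keep
# only rows whose keys are in the saved-key set. Hypothetical names.

def refill_by_scanning(sstables, saved_keys, cache):
    """Sequential refill: iterate SSTables in on-disk order.

    saved_keys acts as the filter (an exact set here; a bloom filter
    would trade exactness for memory). Each SSTable is read once,
    sequentially, so the disk sees large streaming reads instead of
    roughly one seek per cached row.
    """
    wanted = set(saved_keys)
    for sstable in sstables:        # iterate oldest to newest
        for key, row in sstable:    # sequential scan of one table
            if key in wanted:
                cache[key] = row    # newer tables overwrite older
    return cache

# Two toy "SSTables", oldest first; key b"a" was rewritten later:
sstables = [[(b"a", "old-a"), (b"b", "row-b")],
            [(b"a", "new-a"), (b"c", "row-c")]]
cache = refill_by_scanning(sstables, [b"a", b"c"], {})
```

The trade-off is scanning every row on disk even when only a small fraction is cached, which is why it only wins when the cache covers enough of the data set and the media penalises seeks, as with the ~1k SSTables on HDDs mentioned above.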