Data from writes sits in Memtables until they are flushed to SSTables on disk. A flush is triggered by whichever threshold (ops, size, or time) is exceeded first; all three are configurable.
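
To make the "first threshold exceeded" rule concrete, here is a minimal toy sketch in Python. It is not Cassandra's code: the class name `Memtable`, the parameter names (`max_ops`, `max_bytes`, `max_age_seconds`), and the `flush` helper are all made up for illustration, and the size accounting is a rough character count.

```python
import time


class Memtable:
    """Toy in-memory write buffer (hypothetical, not Cassandra's implementation)."""

    def __init__(self, max_ops, max_bytes, max_age_seconds, clock=time.monotonic):
        # All three thresholds are configurable; names are illustrative only.
        self.max_ops = max_ops
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.created_at = clock()
        self.rows = {}   # row key -> {column: value}
        self.ops = 0
        self.size = 0    # rough size estimate, counted in characters

    def write(self, key, column, value):
        # Only writes put data into the memtable.
        self.rows.setdefault(key, {})[column] = value
        self.ops += 1
        self.size += len(key) + len(column) + len(value)

    def should_flush(self):
        # Flush as soon as ANY one of the thresholds is reached.
        return (
            self.ops >= self.max_ops
            or self.size >= self.max_bytes
            or self.clock() - self.created_at >= self.max_age_seconds
        )


def flush(memtable):
    """Return an immutable, key-sorted snapshot standing in for an SSTable."""
    return tuple(
        (key, tuple(sorted(columns.items())))
        for key, columns in sorted(memtable.rows.items(), key=lambda kv: kv[0])
    )
```

A caller would check `should_flush()` after each write, call `flush()` when it returns true, and then start a fresh `Memtable`.
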

There is a keycache and a rowcache. The keycache caches offsets into SSTables for the rows. The rowcache caches the entire row. There is also the OS page cache, which is heavily used.

When a read happens, the keycache is updated with the offsets for the SSTables the row was eventually found in. If the keycache then holds too many entries, some are ejected. Overall the keycache uses very little memory per entry and can cut your disk IO in half, so it's a pretty big win.

If you read an entire row, it goes in the rowcache. As with the keycache, this may cause older entries to be ejected. If you put lots of really large rows in the rowcache you can OOM your JVM. The rowcache is kept up to date with the Memtables as writes come in.

When a read comes in, C* collects the data from the SSTables and the Memtables and merges it together. Reads never create records in the Memtables; data only goes into Memtables from writes.

On Tue, Feb 22, 2011 at 3:32 AM, Viktor Jevdokimov <vjevdoki...@gmail.com> wrote:

> Hello,
>
> Write path is perfectly documented in architecture overview.
>
> I need Reads to be clarified:
>
> How memory is used
> 1. When data is in the Memtable
> 2. When data is in the SSTable
>
> How cache is used alongside with Memtable?
>
> Are records created in the Memtable from writes only or from reads also?
>
> What I need to know is, how Cassandra uses memory and Memtables for reads?
>
>
> Thank you,
> Viktor
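
As a rough illustration of the read path described in the reply above, here is a toy Python sketch. It is not Cassandra's code. Everything in it is assumed for illustration: the names `LRUCache` and `ToyStore`, the use of LRU as the ejection policy (the reply only says entries are ejected), list positions standing in for SSTable offsets, and "newer source wins" standing in for the real timestamp-based merge.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache; LRU ejection is an assumption made for this sketch."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # eject the oldest entry


class ToyStore:
    """Hypothetical model of memtable + SSTables + keycache + rowcache."""

    def __init__(self, key_cache_size, row_cache_size):
        self.memtable = {}   # row key -> {column: value}; filled by writes only
        self.sstables = []   # each SSTable: key-sorted list of (row key, columns)
        self.key_cache = LRUCache(key_cache_size)  # (sstable no., row key) -> offset
        self.row_cache = LRUCache(row_cache_size)  # row key -> merged row
        self.index_lookups = 0  # counts lookups the keycache could not answer

    def write(self, key, columns):
        self.memtable.setdefault(key, {}).update(columns)
        cached = self.row_cache.get(key)
        if cached is not None:
            cached.update(columns)  # keep the rowcache in step with the memtable

    def flush(self):
        self.sstables.append(sorted(self.memtable.items(), key=lambda kv: kv[0]))
        self.memtable = {}

    def _find_offset(self, sstable_no, key):
        offset = self.key_cache.get((sstable_no, key))
        if offset is not None:
            return offset  # keycache hit: no index lookup needed
        self.index_lookups += 1
        for offset, (row_key, _) in enumerate(self.sstables[sstable_no]):
            if row_key == key:
                self.key_cache.put((sstable_no, key), offset)
                return offset
        return None

    def read(self, key):
        cached = self.row_cache.get(key)
        if cached is not None:
            return dict(cached)
        merged = {}
        for sstable_no in range(len(self.sstables)):  # oldest to newest
            offset = self._find_offset(sstable_no, key)
            if offset is not None:
                merged.update(self.sstables[sstable_no][offset][1])
        merged.update(self.memtable.get(key, {}))  # memtable data is newest
        if merged:
            self.row_cache.put(key, dict(merged))
        return merged  # note: the memtable itself is never modified by a read
```

In this model a read only ever touches the caches; the memtable changes solely through `write`, which matches the answer to the question about reads creating Memtable records.
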