On Sun, May 1, 2011 at 2:16 PM, Jake Luciani <jak...@gmail.com> wrote:
>
>
> On Sun, May 1, 2011 at 2:58 PM, shimi <shim...@gmail.com> wrote:
>
>> On Sun, May 1, 2011 at 9:48 PM, Jake Luciani <jak...@gmail.com> wrote:
>>
>>> If you have N column families you need N * memtable size of RAM to
>>> support this. If that's not an option you can merge them into one as you
>>> suggest but then you will have much larger SSTables, slower compactions,
>>> etc.
>>
>>
>>
>>> I don't necessarily agree with Tyler that the OS cache will be less
>>> effective... But I do agree that if the sizes of sstables are too large for
>>> you then more hardware is the solution...
>>
>>
>> If you merge CFs which are hardly accessed with one which are accessed
>> frequently, when you read the SSTable you load data that is hardly accessed
>> to the OS cache.
>>
>
> Only the rows or portions of rows you read will be loaded into the OS
> cache. Just because different rows are in the same file doesn't mean the
> entire file is loaded into the OS cache. The bloom filter and index file
> will be loaded but those are not large files.
>

Right -- it does depend on the page size and the average amount of data
read. The effect will be more pronounced on CFs with small rows than on
those with wide rows.
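
To put rough numbers on both points, here is a back-of-the-envelope sketch.
It is not how Cassandra itself accounts for memory or I/O. The function names
are made up for illustration, the 4096-byte page size is an assumption (check
yours with `getconf PAGESIZE`), and OS readahead is ignored, so the real
amount of neighbouring data pulled into the cache may well be larger.

```python
PAGE_SIZE = 4096  # assumed OS page size in bytes; not taken from the thread


def memtable_ram_bytes(num_cfs, memtable_bytes):
    """Rough RAM needed if every column family fills its memtable at once
    (the N * memtable size rule of thumb quoted above)."""
    return num_cfs * memtable_bytes


def pages_touched(row_bytes, offset=0, page_size=PAGE_SIZE):
    """Number of OS pages spanned by a contiguous read of row_bytes
    starting at byte offset within the file."""
    first = offset // page_size
    last = (offset + row_bytes - 1) // page_size
    return last - first + 1


def cache_overhead(row_bytes, offset=0, page_size=PAGE_SIZE):
    """Fraction of the cached bytes that were not part of the requested row,
    i.e. neighbouring data that rode along in the same pages."""
    cached = pages_touched(row_bytes, offset, page_size) * page_size
    return 1 - row_bytes / cached


if __name__ == "__main__":
    # 10 CFs with a hypothetical 64 MB memtable each
    print(memtable_ram_bytes(10, 64 * 1024 * 1024))
    # small row vs. wide row
    print(cache_overhead(100))
    print(cache_overhead(1_000_000))
```

Under these assumptions a 100-byte row drags in a page that is almost
entirely other rows, while a 1 MB row wastes well under 1% of what it
caches. That is the sense in which merging a rarely read CF into a hot one
hurts more when rows are small.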