On Mon, Apr 12, 2010 at 4:27 PM, Time Less <timelessn...@gmail.com> wrote:
> With this formula, we can already begin to formulate more useful answers to
> the question. If I have 10B rows in my CF, and I can fit 10k rows per
> SStable, and the SStables are spread across 5 nodes, and I have 1 bloom
> filter false positive and 1 tombstone and ask the wrong node for the key,
> then:
>
> Mv = (((2B/10k)+1+1)*3)+1 == ((200,000)+2)*3+1 == 300,007 iops to read a
> key.
>
This is a nonsensical arrangement. Assuming each SSTable is the size of the default Memtable threshold (128MB), each row is (128MB / 10k) == 12.8KB, and 10B rows == 128TB of raw data. A typical RF of 3 takes us to 384TB. Leaving enough free space for compactions takes us to 768TB. That's not 5 nodes; it's more like 100+, almost 2 orders of magnitude off your estimate, and that is without addressing the shortcomings in the rest of it (which I leave to more capable folks on this list).

b
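
For anyone who wants to check the arithmetic above, here is a rough Python sketch. It is only a back-of-the-envelope helper: the function name `estimate_capacity` and its parameters are made up for illustration, it assumes decimal units (1TB = 1,000,000MB) to match the rounded figures in the message, and it treats the compaction allowance as a flat 2x multiplier, which is what the step from 384TB to 768TB implies.

```python
def estimate_capacity(total_rows, rows_per_sstable, sstable_mb,
                      replication_factor, compaction_headroom=2):
    """Rough disk estimate. Hypothetical helper, not part of any Cassandra API.

    Assumes decimal units (1 TB = 1,000,000 MB) and models compaction
    headroom as a simple multiplier on the replicated data size.
    """
    # Average row size implied by packing rows_per_sstable rows into one SSTable.
    row_kb = sstable_mb * 1000 / rows_per_sstable
    # Number of SSTables needed to hold every row once.
    sstables = total_rows // rows_per_sstable
    # Raw data size before replication.
    raw_tb = sstables * sstable_mb / 1_000_000
    # Every row is stored replication_factor times across the cluster.
    replicated_tb = raw_tb * replication_factor
    # Extra free space so compactions have room to rewrite data.
    provisioned_tb = replicated_tb * compaction_headroom
    return {
        "row_kb": row_kb,
        "sstables": sstables,
        "raw_tb": raw_tb,
        "replicated_tb": replicated_tb,
        "provisioned_tb": provisioned_tb,
    }


if __name__ == "__main__":
    # Figures from the thread: 10B rows, 10k rows per SSTable, 128MB SSTables, RF 3.
    est = estimate_capacity(10_000_000_000, 10_000, 128, 3)
    for name, value in est.items():
        print(f"{name}: {value}")
    # Disk each node would need under the 5-node and 100-node layouts.
    for nodes in (5, 100):
        print(f"{nodes} nodes: {est['provisioned_tb'] / nodes:.2f} TB per node")
```

Dividing the provisioned total by the node count shows the per-node disk each layout would require, which is the gap between the 5-node assumption and the 100+ node figure.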