Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I notice that the documentation on the read path is quite compressed on this page: * http://wiki.apache.org/cassandra/ArchitectureOverview What is the best documentation of the read path? I'm also curious about the granularity and policies around caching. Paul Prescod

Re: Worst case #iops to read a row

2010-04-13 Thread Jonathan Ellis
On Tue, Apr 13, 2010 at 1:55 PM, Paul Prescod wrote: > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? You can tune the 4 number via JMX (p

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 12:00 PM, Benjamin Black wrote: >> I am probably being totally naive, but is the answer to the question >> "worst iops on read" just: >> >>  3 reads per SSTable * 4 SStables * ReplicationFactor ? >> >> = 3 * 4 * 3 = 36? >> > > Why does RF enter this? A simplistic model for
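The back-of-envelope product quoted above can be checked with a few lines of Python. This is only a sketch of the poster's own simplistic formula; the function and parameter names are made up for illustration, and whether the replication factor belongs in the product at all is exactly what the reply questions.

```python
def naive_worst_case_iops(reads_per_sstable: int,
                          sstables: int,
                          replication_factor: int) -> int:
    """Multiply the three factors from the quoted message.

    Hypothetical helper: it reproduces the quoted arithmetic only,
    not how Cassandra actually schedules reads.
    """
    return reads_per_sstable * sstables * replication_factor


# Figures from the quoted message: 3 reads per SSTable, 4 SSTables, RF = 3.
estimate = naive_worst_case_iops(3, 4, 3)
print(estimate)  # 36
```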

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:55 AM, Paul Prescod wrote: > > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? > Hey, I am arguing the proposed

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:52 AM, Scott White wrote: > >... > > Agreed. Kind of sorry to see Scott White and Benjamin Black being in agreement but I guess that's the way yin and yang works. Opposition is illusory in any case. Paul Prescod

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:31 AM, Benjamin Black wrote: > ... > How frequently do you want to write SSTables?  How much memory do you > want Memtables to consume?  How long do you want to wait between > Memtable flushes?  There is an entire wiki page on  Memtable tuning: > http://wiki.apache.org/c

Re: Worst case #iops to read a row

2010-04-13 Thread Scott White
> Do you understand you are assuming there have been no compactions, > which would be extremely bad practice given this number of SSTables? > A major compaction, as would be best practice given this volume, would > result in 1 SSTable per CF per node. One. Similarly, you are > assuming the update

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:31 AM, Paul Prescod wrote: > I am just checking math, not model. > > On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > >> >> numRowsOnNode = 10B / 20 = 500M. > > 50 million > 10B / 20 is 500M. The rest of the analysis from our pseudonymous friend remains faulty.
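The disputed division is easy to verify. The snippet below only checks the arithmetic; the variable names are illustrative, and it takes no position on what the divisor of 20 stands for in the original model.

```python
# Integer arithmetic check of the figure discussed above.
total_rows = 10_000_000_000   # "10B" rows
divisor = 20                  # the 20 used in the quoted formula

rows_on_node = total_rows // divisor
print(f"{rows_on_node:,}")    # 500,000,000 (500M, not 50M)
```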

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > >> > If I have 10B rows in my CF, and I can fit 10k rows per >> > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom The error you are making is in thinking the Memtable thresholds are the SSTable limits. They are not.

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I am just checking math, not model. On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > numRowsOnNode = 10B / 20 = 500M. 50 million > replicationFactor = 3. > rowsPerSStable = 128MB / 1K = 131k. > > Therefore worst-case iops per read on this cluster are: > (500M * 3 / 131k) * 3 = 150M / 131

Re: Worst case #iops to read a row

2010-04-13 Thread Time Less
> If I have 10B rows in my CF, and I can fit 10k rows per > > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom > > filter false positive and 1 tombstone and ask the wrong node for the key, > > then: > > > > Mv = (((2B/10k)+1+1)*3)+1 == ((200,000)+2)*3+1 == 300,007 iops to rea
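For anyone checking the math rather than the model: evaluating the quoted expression as written gives 600,007, not 300,007. The sketch below copies the terms from the quoted formula; the variable names are illustrative, and the model itself is disputed elsewhere in this thread.

```python
# Terms taken from the quoted expression: (((2B/10k)+1+1)*3)+1
total_rows = 10_000_000_000     # "10B rows in my CF"
nodes = 5                       # "spread across 5 nodes"
rows_per_sstable = 10_000       # "10k rows per SStable"

rows_per_node = total_rows // nodes            # 2,000,000,000 (the "2B")
sstables = rows_per_node // rows_per_sstable   # 200,000

# The +1, +1, *3 and +1 are copied from the formula without reinterpretation.
mv = ((sstables + 1 + 1) * 3) + 1
print(f"{mv:,}")  # 600,007
```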

Re: Worst case #iops to read a row

2010-04-12 Thread Benjamin Black
On Mon, Apr 12, 2010 at 4:27 PM, Time Less wrote: > With this formula, we can already begin to formulate more useful answers to > the question. If I have 10B rows in my CF, and I can fit 10k rows per > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom > filter false positive

Re: Worst case #iops to read a row

2010-04-12 Thread Time Less
> > What if we have 10B rows in the column family? What sort of index do you > use > > that would only require one iop to find the row index block? > > basically what is described in sections 5.3 and 5.4 here: > http://labs.google.com/papers/bigtable.html > Incorrect. Section 4 of the paper descri

Re: Worst case #iops to read a row

2010-04-12 Thread Jonathan Ellis
On Mon, Apr 12, 2010 at 3:45 PM, Time Less wrote: > I'm confused. That's really worst-case? 3 iops? max 3 per sstable, as RK clarified. > What if we have 10B rows in the column family? What sort of index do you use > that would only require one iop to find the row index block? basically wha

Re: Worst case #iops to read a row

2010-04-12 Thread Time Less
> >> worst case is 2 or 3, depending on row size: > >> > >> one seek to read the right row index block > >> one seek to read the row header (bloom filter + column index) > >> if it's a big row, one seek to read the column block (block size is > >> configurable, default is 256KB) > > > > [This is al

Re: Worst case #iops to read a row

2010-04-10 Thread Scott Shealy
thanks , that is helpful S. - Original Message From: Jonathan Ellis To: user@cassandra.apache.org Sent: Fri, April 9, 2010 11:39:26 AM Subject: Re: Worst case #iops to read a row worst case is 2 or 3, depending on row size: one seek to read the right row index block one seek to read

Re: Worst case #iops to read a row

2010-04-09 Thread Jonathan Ellis
Right. On Fri, Apr 9, 2010 at 11:23 AM, Ryan King wrote: > On Fri, Apr 9, 2010 at 8:39 AM, Jonathan Ellis wrote: >> worst case is 2 or 3, depending on row size: >> >> one seek to read the right row index block >> one seek to read the row header (bloom filter + column index) >> if it's a big row,

Re: Worst case #iops to read a row

2010-04-09 Thread Ryan King
On Fri, Apr 9, 2010 at 8:39 AM, Jonathan Ellis wrote: > worst case is 2 or 3, depending on row size: > > one seek to read the right row index block > one seek to read the row header (bloom filter + column index) > if it's a big row, one seek to read the column block (block size is > configurable,

Re: Worst case #iops to read a row

2010-04-09 Thread Jonathan Ellis
worst case is 2 or 3, depending on row size: one seek to read the right row index block one seek to read the row header (bloom filter + column index) if it's a big row, one seek to read the column block (block size is configurable, default is 256KB) On Thu, Apr 8, 2010 at 5:21 PM, Scott Shealy w
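The seek count described above can be written down as a tiny function. This is a sketch of the description in this message only; the function name and the `big_row` flag are hypothetical, and per the follow-ups in this thread the figure is a maximum per SSTable, not per read.

```python
def worst_case_seeks_per_sstable(big_row: bool) -> int:
    """Count the seeks listed in the message for a single SSTable."""
    seeks = 1        # one seek to read the right row index block
    seeks += 1       # one seek to read the row header (bloom filter + column index)
    if big_row:
        seeks += 1   # big rows only: one more seek to read the column block
    return seeks


print(worst_case_seeks_per_sstable(big_row=False))  # 2
print(worst_case_seeks_per_sstable(big_row=True))   # 3
```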