Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I notice that the documentation on the read path is quite compressed on this page: * http://wiki.apache.org/cassandra/ArchitectureOverview What is the best documentation of the read path? I'm also curious about the granularity and policies around caching. Paul Prescod

Re: Worst case #iops to read a row

2010-04-13 Thread Jonathan Ellis
On Tue, Apr 13, 2010 at 1:55 PM, Paul Prescod wrote: > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? You can tune the 4 number via JMX (p

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 12:00 PM, Benjamin Black wrote: >> I am probably being totally naive, but is the answer to the question >> "worst iops on read" just: >> >>  3 reads per SSTable * 4 SStables * ReplicationFactor ? >> >> = 3 * 4 * 3 = 36? >> > > Why does RF enter this? A simplistic model for
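For reference, the arithmetic in the quoted question can be written out as a small Python sketch. The function and parameter names are hypothetical. It only reproduces the naive product from the quoted message; whether ReplicationFactor belongs in it at all is exactly what the reply questions.

```python
def naive_worst_case_iops(reads_per_sstable: int, sstables: int, replication_factor: int) -> int:
    """Naive product from the quoted message; not an endorsed model of the read path."""
    return reads_per_sstable * sstables * replication_factor


# Figures from the quoted message: 3 reads per SSTable, 4 SSTables, RF = 3.
print(naive_worst_case_iops(3, 4, 3))  # 36
```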

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:55 AM, Paul Prescod wrote: > > What do you mean by "bad practice"? The document above implies that it > is nearly impossible. It implies that you will have between 1 and 4 > SSTables. Does the administrator have a choice in this matter? > Hey, I am arguing the proposed

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:52 AM, Scott White wrote: > >... > > Agreed. Kind of sorry to see Scott White and Benjamin Black being in agreement, but I guess that's the way yin and yang work. Opposition is illusory in any case. Paul Prescod

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
On Tue, Apr 13, 2010 at 11:31 AM, Benjamin Black wrote: > ... > How frequently do you want to write SSTables?  How much memory do you > want Memtables to consume?  How long do you want to wait between > Memtable flushes?  There is an entire wiki page on  Memtable tuning: > http://wiki.apache.org/c

Re: Worst case #iops to read a row

2010-04-13 Thread Scott White
> Do you understand you are assuming there have been no compactions, > which would be extremely bad practice given this number of SSTables? > A major compaction, as would be best practice given this volume, would > result in 1 SSTable per CF per node. One. Similarly, you are > assuming the update

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 11:31 AM, Paul Prescod wrote: > I am just checking math, not model. > > On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > >> >> numRowsOnNode = 10B / 20 = 500M. > > 50 million > 10B / 20 is 500M. The rest of the analysis from our pseudonymous friend remains faulty.
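The disputed division is quick to check in Python. This is only the arithmetic, not the model; the variable names are hypothetical.

```python
total_rows = 10_000_000_000  # "10B" rows in the column family
divisor = 20                 # divisor used in the quoted numRowsOnNode line

rows_on_node = total_rows // divisor
print(rows_on_node)  # 500000000, i.e. 500M rather than 50M
```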

Re: Worst case #iops to read a row

2010-04-13 Thread Benjamin Black
On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > >> > If I have 10B rows in my CF, and I can fit 10k rows per >> > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom The error you are making is in thinking the Memtable thresholds are the SSTable limits. They are not.

Re: Worst case #iops to read a row

2010-04-13 Thread Paul Prescod
I am just checking math, not model. On Tue, Apr 13, 2010 at 10:48 AM, Time Less wrote: > > numRowsOnNode = 10B / 20 = 500M. 50 million > replicationFactor = 3. > rowsPerSStable = 128MB / 1K = 131k. > > Therefore worst-case iops per read on this cluster are: > (500M * 3 / 131k) * 3 = 150M / 131

Re: Worst case #iops to read a row

2010-04-13 Thread Time Less
> If I have 10B rows in my CF, and I can fit 10k rows per > > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom > > filter false positive and 1 tombstone and ask the wrong node for the key, > > then: > > > > Mv = (((2B/10k)+1+1)*3)+1 == ((200,000)+2)*3+1 == 300,007 iops to rea

Re: Worst case #iops to read a row

2010-04-12 Thread Benjamin Black
On Mon, Apr 12, 2010 at 4:27 PM, Time Less wrote: > With this formula, we can already begin to formulate more useful answers to > the question. If I have 10B rows in my CF, and I can fit 10k rows per > SStable, and the SStables are spread across 5 nodes, and I have 1 bloom > filter false positive

Re: Worst case #iops to read a row

2010-04-12 Thread Time Less
> > What if we have 10B rows in the column family? What sort of index do you > use > > that would only require one iop to find the row index block? > > basically what is described in sections 5.3 and 5.4 here: > http://labs.google.com/papers/bigtable.html > Incorrect. Section 4 of the paper descri

Re: Worst case #iops to read a row

2010-04-12 Thread Jonathan Ellis
On Mon, Apr 12, 2010 at 3:45 PM, Time Less wrote: > I'm confused. That's really worst-case? 3 iops? max 3 per sstable, as RK pointed out. > What if we have 10B rows in the column family? What sort of index do you use > that would only require one iop to find the row index block? basically wha

Re: Worst case #iops to read a row

2010-04-12 Thread Time Less
> >> worst case is 2 or 3, depending on row size: > >> > >> one seek to read the right row index block > >> one seek to read the row header (bloom filter + column index) > >> if it's a big row, one seek to read the column block (block size is > >> configurable, default is 256KB) > > > > [This is al

Re: Worst case #iops to read a row

2010-04-10 Thread Scott Shealy
Thanks, that is helpful. S. - Original Message From: Jonathan Ellis To: user@cassandra.apache.org Sent: Fri, April 9, 2010 11:39:26 AM Subject: Re: Worst case #iops to read a row worst case is 2 or 3, depending on row size: one seek to read the right row index block one seek to read

Re: Worst case #iops to read a row

2010-04-09 Thread Jonathan Ellis
Right. On Fri, Apr 9, 2010 at 11:23 AM, Ryan King wrote: > On Fri, Apr 9, 2010 at 8:39 AM, Jonathan Ellis wrote: >> worst case is 2 or 3, depending on row size: >> >> one seek to read the right row index block >> one seek to read the row header (bloom filter + column index) >> if it's a big row,

Re: Worst case #iops to read a row

2010-04-09 Thread Ryan King
On Fri, Apr 9, 2010 at 8:39 AM, Jonathan Ellis wrote: > worst case is 2 or 3, depending on row size: > > one seek to read the right row index block > one seek to read the row header (bloom filter + column index) > if it's a big row, one seek to read the column block (block size is > configurable,

Re: Worst case #iops to read a row

2010-04-09 Thread Jonathan Ellis
worst case is 2 or 3, depending on row size: one seek to read the right row index block one seek to read the row header (bloom filter + column index) if it's a big row, one seek to read the column block (block size is configurable, default is 256KB) On Thu, Apr 8, 2010 at 5:21 PM, Scott Shealy w
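The seek count described in this message can be sketched in Python as follows. The function name and the `big_row` flag are hypothetical. The sketch assumes no caching at any level, as in the original question, and counts seeks for a single SSTable only.

```python
def worst_case_seeks_per_sstable(big_row: bool) -> int:
    """Count disk seeks to read one row from one SSTable, with no caching."""
    seeks = 1       # one seek to read the right row index block
    seeks += 1      # one seek to read the row header (bloom filter + column index)
    if big_row:
        seeks += 1  # for a big row, one more seek to read the column block
    return seeks


print(worst_case_seeks_per_sstable(big_row=False))  # 2
print(worst_case_seeks_per_sstable(big_row=True))   # 3
```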

Worst case #iops to read a row

2010-04-08 Thread Scott Shealy
Not knowing anything about the physical layout of the data on disk or how it is accessed when it is read... Could someone who does help estimate the worst case scenario (no caching at any level) for the number of iops to read a row of modest size and modest number of columns in a large col