The histogram uses buckets, so the values aren't exact (recording exact values would be much more expensive). You are also reading it the wrong way around: you have about 3.2M reads that each took ~1.9ms, not 1916 reads that took 3229500 usecs. In the same way, you don't have 1 read using 16k sstables, which would be a bit extreme.
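To make that reading concrete, here is a minimal Python sketch using only the rows quoted below. It treats each metric column as its own histogram keyed by the offset. The names `ROWS` and `column_histogram` are made up for illustration and are not part of nodetool or Cassandra.

```python
# Rows quoted in this thread:
# (offset, SSTables count, write latency count, read latency count).
# For the latency columns the offset is a latency bucket in microseconds.
ROWS = [
    (1109, 0, 349, 642406),
    (1331, 0, 147, 1335840),
    (1597, 0, 121, 640374),
    (1916, 0, 117, 3229500),
    (2299, 0, 91, 683749),
    (2759, 0, 77, 202722),
]

SSTABLES, WRITE_LATENCY, READ_LATENCY = 1, 2, 3  # column positions in ROWS


def column_histogram(rows, column_index):
    """Return {offset: count} for one column, ignoring every other column."""
    return {row[0]: row[column_index] for row in rows if row[column_index] > 0}


read_latency = column_histogram(ROWS, READ_LATENCY)
write_latency = column_histogram(ROWS, WRITE_LATENCY)

# The offset is the x-axis (bucket), the cell is the y-axis (how many events).
busiest_offset = max(read_latency, key=read_latency.get)
print(f"{read_latency[busiest_offset]} reads fell in the "
      f"~{busiest_offset / 1000:.1f} ms bucket")

# The write count on the same row is unrelated to those reads.
print(f"{write_latency[busiest_offset]} writes fell in the "
      f"{busiest_offset} usec bucket")
```

Each column is read on its own: the 117 in the write latency column shares a row with the 3229500 reads only because both counts landed in the bucket labelled 1916.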
On Wed, Jan 23, 2013 at 9:02 AM, Brian Tarbox <tar...@cabotresearch.com> wrote:

> Wei,
> Thank you for the explanation (Offset is always the x-axis, the other
> columns represent the y-axis (taken 5 independent times)).
>
> Part of this still doesn't make sense. If I look at just read latencies
> for example...am I to believe that 1916 times I had a latency of exactly
> 3229500 usecs? Is this just some weird 5-independent variable mushed
> together data bucketing???
>
> Offset   SSTables   Write Lat   Read Lat
> 1109     0          349         642406
> 1331     0          147         1335840
> 1597     0          121         640374
> *1916*   0          117         *3229500*
> 2299     0          91          683749
> 2759     0          77          202722
>
>
> On Tue, Jan 22, 2013 at 12:11 PM, Wei Zhu <wz1...@yahoo.com> wrote:
>
>> I agree that Cassandra cfhistograms is probably the most bizarre metrics
>> I have ever come across although it's extremely useful.
>>
>> I believe the offset is actually the metrics it has tracked (x-axis on
>> the traditional histogram) and the number under each column is how many
>> times that value has been recorded (y-axis on the traditional histogram).
>> Your write latency are 17, 20, 24 (microseconds?). 3 writes took 17, 7
>> writes took 20 and 19 writes took 24
>>
>> Correct me if I am wrong.
>>
>> Thanks.
>> -Wei
>>
>> ------------------------------
>> *From:* Brian Tarbox <tar...@cabotresearch.com>
>> *To:* user@cassandra.apache.org
>> *Sent:* Tuesday, January 22, 2013 7:27 AM
>> *Subject:* Re: Is this how to read the output of nodetool cfhistograms?
>>
>> Indeed, but how many Cassandra users have the good fortune to stumble
>> across that page? Just saying that the explanation of the very powerful
>> nodetool commands should be more front and center.
>>
>> Brian
>>
>>
>> On Tue, Jan 22, 2013 at 10:03 AM, Edward Capriolo
>> <edlinuxg...@gmail.com> wrote:
>>
>> This was described in good detail here:
>>
>> http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
>>
>> On Tue, Jan 22, 2013 at 9:41 AM, Brian Tarbox
>> <tar...@cabotresearch.com> wrote:
>>
>> Thank you! Since this is a very non-standard way to display data it
>> might be worth a better explanation in the various online documentation
>> sets.
>>
>> Thank you again.
>>
>> Brian
>>
>>
>> On Tue, Jan 22, 2013 at 9:19 AM, Mina Naguib <mina.nag...@adgear.com> wrote:
>>
>>
>>
>> On 2013-01-22, at 8:59 AM, Brian Tarbox <tar...@cabotresearch.com> wrote:
>>
>> > The output of this command seems to make no sense unless I think of it
>> > as 5 completely separate histograms that just happen to be displayed
>> > together.
>> >
>> > Using this example output should I read it as: my reads all took either
>> > 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And
>> > separately I had read latencies of 2, 8,69, etc?
>> >
>> > In other words...each row isn't really a row...i.e. on those 16033
>> > reads from a single SSTable I didn't have 0 write latency, 0 read latency,
>> > 0 row size and 0 column count. Is that right?
>>
>> Correct. A number in any of the metric columns is a count value bucketed
>> in the offset on that row. There are no relationships between other
>> columns on the same row.
>>
>> So your first row says "16033 reads were satisfied by 1 sstable". The
>> other metrics (for example, latency of these reads) is reflected in the
>> histogram under "Read Latency", under various other bucketed offsets.
>>
>> >
>> > Offset   SSTables   Write Latency   Read Latency   Row Size   Column Count
>> > 1        16033      0               0              0          0
>> > 2        303        0               0              0          1
>> > 3        0          0               0              0          0
>> > 4        0          0               0              0          0
>> > 5        0          0               0              0          0
>> > 6        0          0               0              0          0
>> > 7        0          0               0              0          0
>> > 8        0          0               2              0          0
>> > 10       0          0               0              0          6261
>> > 12       0          0               2              0          117
>> > 14       0          0               8              0          0
>> > 17       0          3               69             0          255
>> > 20       0          7               163            0          0
>> > 24       0          19              1369           0          0
>> >
>>
>

--
Derek Williams