The optimization definitely shaved off some time. Now it is running about 3x
CFSTATS reported time. Below are the logs.

There is a ~300ms time frame after the last ResponseVerbHandler prior to the
resolver starting. Based on a quorum read the response resolver should kick
after 2 reads come in correct? That would mean it waited 500ms before
starting. Where is this time going?

There is still the 3+ second delay between the last 2 entries. Is this the
thrift server?

My expectation is that the client should get a quorum read at 2nd fastest
CFSTATS reported time x 1.xx. It should take the most time to read the data
from the node and after that it is all in memory and just needs to be
processed. There are some gaps in the timing of the cassandra qurum reader
below, but the lowest hanging fruit seems to be what happens between
quorumResponseHandler & "logout complete". Excluding that last step node 208
(2nd fastest) reported 2027ms in cfstats which as a % of 3320 ms = 1.63x.


DEBUG [pool-1-thread-2] 2010-10-15 18:59:34,894 CassandraServer.java (line
216) get_slice
DEBUG [pool-1-thread-2] 2010-10-15 18:59:34,895 StorageProxy.java (line 471)
strongread reading data for SliceFromReadCommand(table='table', key='key1',
column_parent='QueryPath(columnFamilyName='fact', superColumnName='null',
columnName='null')', start='503a', finish='503a7c', reversed=false,
count=10000000) from 385@/x.x.x.6
DEBUG [pool-1-thread-2] 2010-10-15 18:59:34,895 StorageProxy.java (line 471)
strongread reading digest for SliceFromReadCommand(table='table',
key='key1', column_parent='QueryPath(columnFamilyName='fact',
superColumnName='null', columnName='null')', start='503a', finish='503a7c',
reversed=false, count=10000000) from 386@/x.x.x.7
DEBUG [pool-1-thread-2] 2010-10-15 18:59:34,896 StorageProxy.java (line 471)
strongread reading digest for SliceFromReadCommand(table='table',
key='key1', column_parent='QueryPath(columnFamilyName='fact',
superColumnName='null', columnName='null')', start='503a', finish='503a7c',
reversed=false, count=10000000) from 386@/x.x.x.8
DEBUG [RESPONSE-STAGE:4] 2010-10-15 18:59:36,824 ResponseVerbHandler.java
(line 42) Processing response on a callback from
2DDDC24C-DB98-1D0F-E945-1DBF8EA99248@/10.4.0.206
DEBUG [RESPONSE-STAGE:5] 2010-10-15 18:59:37,115 ResponseVerbHandler.java
(line 42) Processing response on a callback from
2DDDC24C-DB98-1D0F-E945-1DBF8EA99248@/10.4.0.208
DEBUG [RESPONSE-STAGE:6] 2010-10-15 18:59:37,302 ResponseVerbHandler.java
(line 42) Processing response on a callback from
2DDDC24C-DB98-1D0F-E945-1DBF8EA99248@/10.4.0.207
DEBUG [pool-1-thread-2] 2010-10-15 18:59:37,585 ReadResponseResolver.java
(line 72) resolving 3 responses
DEBUG [pool-1-thread-2] 2010-10-15 18:59:38,065 ReadResponseResolver.java
(line 106) responses deserialized
DEBUG [pool-1-thread-2] 2010-10-15 18:59:38,216 ReadResponseResolver.java
(line 122) digests verified
DEBUG [pool-1-thread-2] 2010-10-15 18:59:38,216 ReadResponseResolver.java
(line 139) resolve: 630 ms.
DEBUG [pool-1-thread-2] 2010-10-15 18:59:38,217 StorageProxy.java (line 494)
quorumResponseHandler: 3320 ms.
DEBUG [pool-1-thread-2] 2010-10-15 18:59:41,556 CassandraServer.java (line
664) logout complete


On Thu, Oct 14, 2010 at 7:16 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> I bet it's the deserializing of responses into Row objects that is
> taking most of the time.  (This would also fit the resolve time
> growing proportionately with the cfstats time -- larger results take
> longer to serialize/deserialize.)
>
> Attached is a patch that adds extra debug information to verify this.
> If you can apply it (patch -p0 < resolve.txt against 0.6.6, should
> also work against 0.6.5) and let us know the results, that would be
> great.
>
> It also adds an optimization to avoid cloning the result in the common
> case where digests match (which is the case in your logs, or you would
> see "digest mismatch" at debug level).
>
> On Thu, Oct 14, 2010 at 3:18 PM, Wayne <wav...@gmail.com> wrote:
> > Can someone help to determine the anatomy of a quorum read? We are trying
> to
> > understand why CFSTATS reports one time and the client actual gets data
> back
> > almost 4x slower. Below are the debug logs from a read that all 3 nodes
> > reported < 2.5secs response time in cfstats but the client did not get
> data
> > for almost 10 seconds seconds. Where is the other 7+ seconds going? We
> are
> > using 0.6.5 with rf=3.
> >
> > We have verified that the difference between get_slice and "logout
> complete"
> > is the actual client wait time.
> >
> > The slowest node to respond returned data at 19:26:19,303. That is 2.618
> > seconds. CFSTATS reports a faster time by a few hundred nanoseconds, and
> I
> > assume the difference is network latency?
> >
> > Why does ReadResponseResolver take almost 3 seconds? That is longer that
> > actually doing the read. This is a wide row, does this prove Cassandra is
> > not very efficient reading wide rows at least in 0.6.5? Does this prove a
> > quorum read is more expensive in terms of making sure the copies match
> than
> > it is to actually read the data (not good news for those of us that often
> > need the C in CAP)?
> >
> > Why is the time stamp for the ReadResponseResolver more than 3.5 seconds
> > after the last read came in when it says it took less than three seconds
> for
> > the resolver? Where did the other 500ms go?
> >
> > I assume the quorumResponseHandler time is the total read time in terms
> of
> > Cassandra. Why does it take another 3+ seconds to get it to the client?
> Is
> > this the thrift server? Some of the time is probably for network latency,
> > but not anything close to 3 seconds. The "logout complete" time is the
> > actual time the client got the data.
> >
> > Any help trying to map out exactly what is going on would be much
> > appreciated.
> >
> > Thanks.
> >
> >
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:16,684 CassandraServer.java
> (line
> > 216) get_slice
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:16,684 StorageProxy.java (line
> > 474) strongread reading data for SliceFromReadCommand(table='xx',
> > key='row_key', column_parent='QueryPath(columnFamilyName='fact',
> > superColumnName='null', columnName='null')', start='503a',
> finish='503a7c',
> > reversed=false, count=10000000) from 7952@/x.x.x.6
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:16,685 StorageProxy.java (line
> > 474) strongread reading digest for SliceFromReadCommand(table='xx',
> > key='row_key', column_parent='QueryPath(columnFamilyName='fact',
> > superColumnName='null', columnName='null')', start='503a',
> finish='503a7c',
> > reversed=false, count=10000000) from 7953@/x.x.x.7
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:16,685 StorageProxy.java (line
> > 474) strongread reading digest for SliceFromReadCommand(table='xx',
> > key='row_key', column_parent='QueryPath(columnFamilyName='fact',
> > superColumnName='null', columnName='null')', start='503a',
> finish='503a7c',
> > reversed=false, count=10000000) from 7953@/x.x.x.8
> > DEBUG [RESPONSE-STAGE:15] 2010-10-14 19:26:19,090
> ResponseVerbHandler.java
> > (line 42) Processing response on a callback from
> > 7EBDA5E1-AB56-EA3C-82C4-25F0F7260468@/x.x.x.8
> > DEBUG [RESPONSE-STAGE:16] 2010-10-14 19:26:19,224
> ResponseVerbHandler.java
> > (line 42) Processing response on a callback from
> > 7EBDA5E1-AB56-EA3C-82C4-25F0F7260468@/x.x.x.6
> > DEBUG [RESPONSE-STAGE:1] 2010-10-14 19:26:19,303 ResponseVerbHandler.java
> > (line 42) Processing response on a callback from
> > 7EBDA5E1-AB56-EA3C-82C4-25F0F7260468@/x.x.x.7
> > DEBUG [GC inspection] 2010-10-14 19:26:19,776 GCInspector.java (line 131)
> GC
> > for ParNew: 96 ms, 168972776 reclaimed leaving 1165922488 used; max is
> > 12995002368
> >  INFO [GC inspection] 2010-10-14 19:26:20,780 GCInspector.java (line 129)
> GC
> > for ParNew: 373 ms, 56806184 reclaimed leaving 1336001936 used; max is
> > 12995002368
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:22,748
> ReadResponseResolver.java
> > (line 120) resolve: 2966 ms.
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:22,748 StorageProxy.java (line
> > 497) quorumResponseHandler: 6063 ms.
> > DEBUG [pool-1-thread-11] 2010-10-14 19:26:25,983 CassandraServer.java
> (line
> > 664) logout complete
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Reply via email to