> iostat doesn't show a request queue bottleneck. The timeouts we are seeing
> are for reads. The latency on the nodes I have temporarily used for reads is
> around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms
> with everything else around 0.05ms. tpstats doesn't show any active/pending.
> Reads are at CL.ONE & writes using CL.ANY

OK, node latency is fine and you are using some pretty low consistency.
You said NTS with RF 2; is that RF 2 for each DC?

The steps below may help get an idea of what's going on (there is a rough
example after the list):

1) Use nodetool getendpoints to work out which nodes are replicas for a key.
2) Connect directly to one of those endpoints with the CLI, make sure the CL
is ONE, and run your test query.
3) Connect to another node in the same DC that is not a replica and do the
same.
4) Connect to another node in a different DC and do the same.
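
For example, something along these lines (the IP and keyspace name are
placeholders for your own, and the consistencylevel command is only in
newer CLI builds; the CLI defaults to ONE anyway):

    # which nodes hold the replicas for the key you are testing with
    nodetool -h 10.0.0.1 getendpoints MyKeyspace cf1 user-id

    # run the same query against a replica, then a non-replica in the same
    # DC, then a node in the other DC
    cassandra-cli -h <node-ip> -p 9160
    [default@unknown] use MyKeyspace;
    [default@MyKeyspace] consistencylevel as ONE;
    [default@MyKeyspace] get cf1['user-id']['seg'];
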
Once you can repro it, try turning up the logging on the coordinator to
DEBUG; you can do this via JConsole (there is also a log4j example at the
very bottom of this mail). Look for these lines…

* Command/ConsistencyLevel is…
* reading data locally… or reading data from…
* reading digest locally… or reading digest for … from…
* Read timeout:…

You'll also see some lines about receiving messages from other nodes.
Hopefully you can get an idea of which nodes are involved in a failing
query. Getting a thrift TimedOutException on a read with CL ONE is pretty
odd.

> What can I do in regards to confirming this issue is still outstanding
> and/or we are affected by it?

It's in 0.8 and will not be fixed. My unscientific approach was to repair a
single CF at a time, hoping that the differences would be smaller and less
data would be streamed. Minor compaction should help squish things down. If
you want to get more aggressive, reduce the min compaction threshold and
trigger a minor compaction with nodetool flush.
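
For example, something like this (keyspace and CF names are placeholders
for your own; the default compaction thresholds are min 4 / max 32):

    # repair one CF at a time rather than the whole keyspace
    nodetool -h 10.0.0.1 repair MyKeyspace cf1

    # lower the min compaction threshold for the CF (min then max), then
    # flush so the smaller sstables can be squashed in a minor compaction
    nodetool -h 10.0.0.1 setcompactionthreshold MyKeyspace cf1 2 32
    nodetool -h 10.0.0.1 flush MyKeyspace cf1

Remember to put the threshold back to the default once things have settled
down.
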
> Version of failure detection? I've not seen anything on this so I suspect
> this is the default.

I was asking so I could see if there were any fixes in Gossip or the
FailureDetector that you were missing. Check the CHANGES.txt file.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 Aug 2011, at 12:48, Anton Winter wrote:

>
>> Is there a reason you are using the trunk and not one of the tagged
>> releases? Official releases are a lot more stable than the trunk.
>>
> Yes, as we are using a combination of EC2 and colo servers we need to use
> broadcast_address from CASSANDRA-2491. The patch that is associated with
> that JIRA does not apply cleanly against 0.8, so this is why we are using
> trunk.
>
>>> 1) thrift timeouts & general degraded response times
>> For reads or writes? What sort of queries are you running? Check the
>> local latency on each node using cfstats and cfhistograms, and a bit of
>> iostat: http://spyced.blogspot.com/2010/01/linux-performance-basics.html
>> What does nodetool tpstats say, is there a stage backing up?
>>
>> If the local latency is OK, look at the cross-DC situation. What CL are
>> you using? Are nodes timing out waiting for nodes in other DCs?
>
> iostat doesn't show a request queue bottleneck. The timeouts we are seeing
> are for reads. The latency on the nodes I have temporarily used for reads
> is around 2-45ms. The next token in the ring at an alternate DC is showing
> ~4ms with everything else around 0.05ms. tpstats doesn't show any
> active/pending. Reads are at CL.ONE & writes using CL.ANY
>
>>
>>> 2) *lots* of exception errors, such as:
>> Repair is trying to run on a response which is a digest response; this
>> should not be happening. Can you provide some more info on the type of
>> query you are running?
>>
> The query being run is get cf1['user-id']['seg']
>
>
>>> 3) ring imbalances during a repair (refer to the above nodetool ring
>>> output)
>> You may be seeing this:
>> https://issues.apache.org/jira/browse/CASSANDRA-2280
>> I think it's a mistake that it is marked as resolved.
>>
> What can I do in regards to confirming this issue is still outstanding
> and/or we are affected by it?
>
>>> 4) regular failure detection when any node does something only
>>> moderately stressful, such as a repair or is under light load etc., but
>>> the node itself thinks it is fine.
>> What version are you using?
>>
> Version of failure detection? I've not seen anything on this so I suspect
> this is the default.
>
>
> Thanks,
> Anton
>
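
Re the DEBUG logging suggestion above: if you would rather edit the config
than click around in JConsole, a rough sketch (this assumes the read-path
messages come from StorageProxy and that you are using the stock
conf/log4j-server.properties; depending on the build you may need to
restart the node for it to take effect):

    # conf/log4j-server.properties on the coordinator node
    log4j.logger.org.apache.cassandra.service.StorageProxy=DEBUG

In JConsole the same thing can usually be done at runtime through the
StorageService MBean's setLog4jLevel operation, if your build exposes it.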