The concurrent_reads and concurrent_writes settings control the number of 
threads in the relevant thread pools. You can view the number of active and 
queued tasks using nodetool tpstats. 
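
For reference, both settings live in cassandra.yaml. A minimal excerpt (the 
values below are only illustrative, not recommendations):

    # cassandra.yaml
    concurrent_reads: 8
    concurrent_writes: 32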

The thread pool uses a blocking linked queue for its work queue, with a max 
size of Integer.MAX_VALUE, so its size is essentially unbounded. When 
(internode) messages are received by a node they are queued into the relevant 
thread pool for processing. When (certain) messages are executed, the worker 
checks the send time of the message and will not process it if it is more than 
rpc_timeout (typically 10 seconds) old. This is where the "dropped messages" 
log entries come from.  
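
Here is a minimal Java sketch of that idea, assuming invented names 
(RPC_TIMEOUT_MS, READ_STAGE and submitRead are illustrative, not Cassandra's 
actual classes): a fixed pool of workers, an effectively unbounded queue, and 
an age check before the work is done.

    import java.util.concurrent.*;

    public class StageSketch {
        static final long RPC_TIMEOUT_MS = 10000;   // cassandra.yaml rpc_timeout_in_ms
        static final int CONCURRENT_READS = 8;      // concurrent_reads

        // Fixed number of worker threads; LinkedBlockingQueue defaults to a
        // capacity of Integer.MAX_VALUE, so the queue is effectively unbounded.
        static final ExecutorService READ_STAGE = new ThreadPoolExecutor(
                CONCURRENT_READS, CONCURRENT_READS,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());

        static void submitRead(final Runnable actualRead) {
            final long enqueuedAt = System.currentTimeMillis();
            READ_STAGE.execute(new Runnable() {
                public void run() {
                    // If the message sat in the queue for longer than rpc_timeout,
                    // the coordinator has already given up on us, so drop it
                    // instead of doing the work (and count it as dropped).
                    if (System.currentTimeMillis() - enqueuedAt > RPC_TIMEOUT_MS)
                        return;
                    actualRead.run();
                }
            });
        }
    }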

The coordinator will wait up to rpc_timeout for the CL number of nodes to 
respond. So if, say, one node is under severe load and cannot process the 
request in time, but the others are OK, a request at QUORUM would probably 
succeed. However, if a number of nodes are getting a beating, the coordinator 
may time out, resulting in the client getting a TimedOutException. 
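
To make the arithmetic concrete: a quorum is floor(RF/2) + 1. A tiny sketch 
(quorumFor is just an illustrative helper, not Cassandra code):

    // Quorum for a given replication factor: floor(RF/2) + 1.
    static int quorumFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }
    // RF=3 -> quorum of 2: one overloaded replica can miss the deadline and a
    // QUORUM request still succeeds. With two slow replicas there is only one
    // timely reply, the coordinator gives up after rpc_timeout, and the client
    // sees a TimedOutException.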

For the read path it's a little more touchy. Only the "nearest" node is sent a 
request for the actual data; the others are asked for a digest of the data they 
would return. So if the "nearest" node is the one under load and times out, the 
request will time out even if CL nodes returned their digests. That's what the 
DynamicSnitch is there for: a node under load is less likely to be considered 
the "nearest" node. 

The read and write thread pools really just deal with reading and writing data 
on the local machine. Your request also moves through several other threads / 
thread pools: the connection thread, the outbound TCP pool, the inbound TCP 
pool and the message response pool. The SEDA paper referenced on this page was 
the model for using thread pools to manage access to resources:
http://wiki.apache.org/cassandra/ArchitectureInternals
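
A toy illustration of that SEDA style (the pool names and sizes below are just 
labels for the pools listed above, not Cassandra's actual stage classes): each 
stage owns its own executor and queue, and a request is handed from one stage 
to the next.

    import java.util.concurrent.*;

    class SedaSketch {
        static final ExecutorService INBOUND_TCP = Executors.newFixedThreadPool(4);
        static final ExecutorService READ_STAGE  = Executors.newFixedThreadPool(8);  // concurrent_reads
        static final ExecutorService RESPONSE    = Executors.newFixedThreadPool(4);

        static void handleIncomingRead(final byte[] message) {
            INBOUND_TCP.execute(new Runnable() {
                public void run() {                  // deserialise, then hand off
                    READ_STAGE.execute(new Runnable() {
                        public void run() {          // do the local read, then hand off
                            RESPONSE.execute(new Runnable() {
                                public void run() {  // send the reply to the coordinator
                                }
                            });
                        }
                    });
                }
            });
        }
    }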

In summary, don't worry about it unless you see the thread pools backing up and 
messages being dropped. 
 
Hope that helps
Aaron

On 28 Mar 2011, at 19:55, Terje Marthinussen wrote:

> Hi, 
> 
> I was pondering how the concurrent_read and write settings balance against 
> the max read/write threads in clients.
> 
> Let's say we have 3 nodes, and concurrent read/write set to 8.
> That is, 8*3=24 threads for reading and writing.
> 
> Replication factor is 3.
> 
> Let's say we have clients that in total set up 16 connections to each node.
> 
> Now all the clients write at the same time. Since the replication factor is 
> 3, you could get up to 16*3=48 concurrent write requests per node (which 
> need to be handled by 8 threads)?
> 
> What is the result if this load continues?
> Could you see that replication of data fails (at least initially) causing all 
> kinds of fun timeouts around in the system?
> 
> Same on the read side. 
> If all clients read at the same time with consistency level QUORUM, you get 
> 16*2=32 read requests in the best case (and more in the worst case)?
> 
> Could you see that one node answers, but another one times out due to lack of 
> read threads, causing read repair which then degrades things further?
> 
> How does this queue up internally between thrift, gossip and the threads 
> doing the actual reads and writes?
> 
> Regards,
> Terje
