> I still don't understand. You would expect read latency to increase
> drastically when it's fully saturated, and a lot of READ drop messages
> also, correct? I don't see that in cfstats or system.log, which I don't
> really understand why.
No. With a fixed concurrency there are only so many outstanding requests at any given moment. Unless the concurrency is sufficiently huge to exhaust queue sizes, that's not what you see. It's just the way it works.

An analogous situation is this: you have N threads, all doing random reads from your disk. If each thread is responsible for exactly one outstanding I/O operation at a time, you will never have more than N outstanding requests to the I/O device in total. On average, then, when a request starts it will have N - 1 requests in "front" of it waiting to be processed, so the latency will increase linearly with N. Double the concurrency, double the latency.

Something like Cassandra/PostgreSQL/etc. behaves similarly. If you're running a benchmark at a fixed concurrency of N, you won't have more than N total requests outstanding at any given moment. The latency grows with N, but the queues never overflow, so you don't see dropped reads.

--
/ Peter Schuller
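To make the "double the concurrency, double the latency" intuition concrete, here is a minimal sketch of that queueing behaviour. It is not taken from the original post: the simulate() function, the single serial service channel, and the ~1 ms service time are all invented for illustration. Each simulated client keeps exactly one request outstanding, so total outstanding requests never exceed N, throughput stays flat once saturated, and average latency grows roughly linearly with N.

    import heapq
    import random

    def simulate(concurrency, service_time_ms=1.0, requests_per_client=1000, seed=42):
        # Hypothetical model: one serial service channel, N clients, each with
        # exactly one outstanding request at a time (closed-loop benchmark).
        rng = random.Random(seed)
        server_free_at = 0.0          # when the single service channel is next idle (ms)
        latencies = []
        # Requests waiting to be issued/serviced, ordered by issue time.
        pending = [(0.0, c) for c in range(concurrency)]
        heapq.heapify(pending)
        issued = [0] * concurrency
        total = concurrency * requests_per_client
        completed = 0
        while completed < total:
            issue_time, client = heapq.heappop(pending)
            start = max(issue_time, server_free_at)        # wait behind earlier requests
            finish = start + service_time_ms * rng.uniform(0.8, 1.2)
            server_free_at = finish
            latencies.append(finish - issue_time)          # queueing delay + service time
            completed += 1
            issued[client] += 1
            if issued[client] < requests_per_client:
                # The client issues its next request only after this one completes,
                # so it never has more than one request outstanding.
                heapq.heappush(pending, (finish, client))
        avg_latency_ms = sum(latencies) / len(latencies)
        throughput_per_ms = total / server_free_at
        return avg_latency_ms, throughput_per_ms

    for n in (1, 2, 4, 8, 16):
        avg_latency, throughput = simulate(n)
        print(f"concurrency={n:2d}  avg latency={avg_latency:6.2f} ms  "
              f"throughput={throughput * 1000:7.0f} req/s")

Running it shows latency scaling roughly with N while throughput stays pinned at the service rate, which is the pattern described above: no dropped requests, just a longer wait per request.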