Re: Dont bogart that connection my friend

2010-12-04 Thread Jonathan Ellis
Ah, got it. Thanks for clearing that up!

On Sat, Dec 4, 2010 at 11:56 AM, Daniel Doubleday wrote:
> Ah ok. No, that was not the case.
>
> The client which did the long-running scan didn't wait for the slowest node.
> Only other clients that asked the slow node directly were affected.
>
> Sorry ab

Re: Dont bogart that connection my friend

2010-12-04 Thread Daniel Doubleday
Ah ok. No, that was not the case. The client which did the long-running scan didn't wait for the slowest node. Only other clients that asked the slow node directly were affected. Sorry about the confusion.

On 04.12.10 05:44, Jonathan Ellis wrote: That makes sense, but this shouldn't make reque

Re: Dont bogart that connection my friend

2010-12-03 Thread Jonathan Ellis
That makes sense, but this shouldn't make requests last for the timeout duration -- at quorum, it should be responding to the client as soon as it gets that second-fastest reply. If I'm understanding right that this was making the response to the client block until the overwhelmed node timed out,
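A minimal sketch of the quorum timing described above, assuming the coordinator answers the client as soon as a quorum of replicas has replied. The function name and the latency figures are hypothetical illustrations, not Cassandra code or measurements from this thread.

```python
def quorum_response_time(replica_latencies_ms):
    """Return the time at which a quorum of replicas has replied.

    Assumes every replica is asked at once and the coordinator
    responds as soon as a majority (rf // 2 + 1) has answered.
    """
    rf = len(replica_latencies_ms)
    if rf == 0:
        raise ValueError("need at least one replica latency")
    quorum = rf // 2 + 1
    # The quorum-th fastest reply decides the response time;
    # slower replicas are not waited for.
    return sorted(replica_latencies_ms)[quorum - 1]


# Hypothetical numbers: two healthy replicas and one overwhelmed
# replica that only "answers" at a 10 s timeout.
latencies = [5, 8, 10_000]
result = quorum_response_time(latencies)
```

With rf = 3 the quorum is 2, so under this assumption the overwhelmed replica's latency does not show up in the client's response time.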

Re: Dont bogart that connection my friend

2010-12-03 Thread Daniel Doubleday
Yes. I thought that would make sense, no? I guessed that the quorum read forces the slowest of the 3 nodes to keep pace with the faster ones. But it can't, no matter how small the performance difference is. So it will just fill up. Also when saying 'practically dead' and 'never recovers' I meant

Re: Dont bogart that connection my friend

2010-12-03 Thread Jonathan Ellis
Am I understanding correctly that you had all connections going to one Cassandra node, which caused one of the *other* nodes to die, and spreading the connections around the cluster fixed it?

On Fri, Dec 3, 2010 at 4:00 AM, Daniel Doubleday wrote:
> Hi all
>
> I have found an anti pattern the oth
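For illustration, "spreading the connections around the cluster" could look like the round-robin sketch below. This is only an assumption about how a client might pick hosts; the node names and the helper are hypothetical and not taken from any Cassandra client library.

```python
from itertools import cycle


def round_robin_hosts(hosts):
    """Return an endless iterator that hands out hosts in turn."""
    if not hosts:
        raise ValueError("need at least one host")
    return cycle(hosts)


# Hypothetical node names for a 3-server cluster.
nodes = ["node1", "node2", "node3"]
picker = round_robin_hosts(nodes)

# Each new connection takes the next host instead of always
# using the same coordinator.
first_six = [next(picker) for _ in range(6)]
```

Each node then coordinates roughly a third of the requests rather than one node coordinating all of them.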

Dont bogart that connection my friend

2010-12-03 Thread Daniel Doubleday
Hi all, I have found an anti-pattern the other day which I wanted to share, although it's a pretty special case. Special case because our production cluster is somewhat strange: 3 servers, rf = 3. We do consistent reads/writes with quorum. I did a long-running read series (loads of reads as fast a