Ah, got it. Thanks for clearing that up!
On Sat, Dec 4, 2010 at 11:56 AM, Daniel Doubleday wrote:
Ah ok. No that was not the case.
The client which did the long running scan didn't wait for the slowest node.
Only other clients that asked the slow node directly were affected.
Sorry about the confusion.
On 04.12.10 05:44, Jonathan Ellis wrote:
> That makes sense, but this shouldn't make requests last for the
> timeout duration -- at quorum, it should be responding to the client
> as soon as it gets that second-fastest reply. If I'm understanding
> right that this was making the response to the client block until the
> overwhelmed node timed out,
Yes.
I thought that would make sense, no? I guessed that the quorum read
forces the slowest of the 3 nodes to keep pace with the faster ones.
But it can't, no matter how small the performance difference is, so its
queue will just fill up.
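
A toy simulation of that effect (the 2-of-3 model and the service
times are made up for illustration; this is not Cassandra code):

    # Every quorum read hits all 3 replicas; each replica works through
    # its queue FIFO; the client unblocks as soon as the 2nd-fastest
    # reply arrives. Service times in ms are invented.
    service = {"fast1": 1.00, "fast2": 1.00, "slow": 1.05}

    now = 0.0                           # client clock (ms)
    done = {n: 0.0 for n in service}    # when each replica drains its queue

    for i in range(1, 100001):
        for n in service:               # the read lands on every replica
            done[n] = max(now, done[n]) + service[n]
        now = sorted(done.values())[1]  # quorum: wait for the 2nd reply only
        if i % 25000 == 0:
            lag = done["slow"] - now
            print("after %d reads the slow node is %.0f ms behind" % (i, lag))

The client never waits for the slow replica, so that replica's backlog
grows by the speed difference on every read -- which is why only the
clients that asked the slow node directly were affected.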
Also when saying 'practically dead' and 'never recovers' I meant
Am I understanding correctly that you had all connections going to one
cassandra node, which caused one of the *other* nodes to die, and
spreading the connections around the cluster fixed it?
On Fri, Dec 3, 2010 at 4:00 AM, Daniel Doubleday wrote:
Hi all
I found an anti-pattern the other day which I wanted to share, although
it's a pretty special case.
Special case because our production cluster is somewhat strange: 3 servers, rf
= 3. We do consistent reads/writes with quorum.
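
For concreteness, a client set up along these lines might look like
this (a sketch against pycassa's 1.x API; the keyspace, column family,
and host names are placeholders -- the thread doesn't say which client
library was actually in use):

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    # Spread connections across all three nodes rather than just one;
    # all names below are placeholders.
    pool = pycassa.ConnectionPool(
        'MyKeyspace',
        server_list=['node1:9160', 'node2:9160', 'node3:9160'],
    )

    cf = pycassa.ColumnFamily(
        pool,
        'MyColumnFamily',
        read_consistency_level=ConsistencyLevel.QUORUM,   # wait for 2 of 3 replies
        write_consistency_level=ConsistencyLevel.QUORUM,  # wait for 2 of 3 acks
    )

    cf.insert('some_key', {'column': 'value'})  # quorum write
    row = cf.get('some_key')                    # quorum read

With only one host in server_list, every request goes through the same
coordinator node -- the "all connections going to one cassandra node"
situation from the exchange above.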
I did a long running read series (loads of reads as fast a