Thanks, Brandon.

I suspected that too, but I think the network is ruled out as the cause:
I set up another background job that runs
echo | nc other_box 7000
in a loop,
and it has been succeeding the whole time, so the network seems fine.
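
A minimal sketch of such a probe loop, for anyone who wants to reproduce the check. Only the "echo | nc other_box 7000" part comes from the job described above; the function names probe_once and probe_loop, the NC override, the 5-second -w timeout, the 1-second default sleep and the log format are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: names, timeout, interval and log format are assumptions,
# not the exact job described in this message.

# Client used for the probe; override NC to try a different netcat build.
NC="${NC:-nc}"

# Probe HOST PORT once and print a timestamped OK/FAIL line.
probe_once() {
    host="$1"
    port="$2"
    # -w 5: give up after 5 seconds (assumed value).
    if echo | "$NC" -w 5 "$host" "$port" >/dev/null 2>&1; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') OK $host:$port"
    else
        echo "$(date '+%Y-%m-%d %H:%M:%S') FAIL $host:$port"
    fi
}

# Probe HOST PORT forever, sleeping INTERVAL seconds (default 1) between tries.
probe_loop() {
    while true; do
        probe_once "$1" "$2"
        sleep "${3:-1}"
    done
}
```

After sourcing the file, it could be run in the background with something like "probe_loop other_box 7000 >> probe.log &", and the timestamps of any FAIL lines compared against the Up/Dead events in the node's log. Note that this only shows whether a TCP connection to the port succeeds; it does not measure latency.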

Yang

On Sun, Sep 25, 2011 at 10:39 AM, Brandon Williams <dri...@gmail.com> wrote:
> On Sat, Sep 24, 2011 at 4:54 PM, Yang <teddyyyy...@gmail.com> wrote:
>> I'm using 1.0.0
>>
>>
>> There seem to be too many node Up/Dead events detected by the failure
>> detector.
>> I'm using a 2-node cluster on EC2, in the same region and the same security
>> group, so I assume the message drop
>> rate should be fairly low.
>> But about every 5 minutes, I see a node detected as down
>> and then Up again shortly afterwards.
>
> This is fairly common on ec2 due to wild variance in the network.
> Increase your phi_convict_threshold to 10 or higher (but I wouldn't go
> over 12, this is roughly an exponential increase)
>
> -Brandon
>
