Re: enabling network partition detection by default

Bruce Schuchardt Fri, 08 Jan 2016 10:45:19 -0800

The default member-timeout is 5 seconds. For an unpredictable network or a system with GC pauses we might want to use a longer member-timeout in deployment. Network-partition-detection isn't involved in that though - it's just normal failure detection.

Where network-partition-detection would cause harm is in a small deployment: say 2 servers & 1 locator. If the "lead" server is kicked out this would cause both the locator and the other server to shut down because the membership weight was 28 and 15 of that was lost. They would all restart after a default delay of 1 minute using the auto-reconnect feature, which is enabled by default.
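
To make the 28 and 15 figures concrete, here is a rough sketch of the weight arithmetic. The per-member weights (3 per locator, 10 per server, plus 5 for the lead server) and the simple-majority rule are assumptions for illustration, not something stated in this message; the real threshold may be defined slightly differently. The function names are hypothetical too.

```python
# Assumed default membership weights (not stated in the message above).
LOCATOR_WEIGHT = 3
SERVER_WEIGHT = 10
LEAD_BONUS = 5  # extra weight assumed to be carried by the "lead" server


def total_weight(servers: int, locators: int) -> int:
    """Total membership weight for a small cluster with one lead server."""
    lead = LEAD_BONUS if servers > 0 else 0
    return servers * SERVER_WEIGHT + locators * LOCATOR_WEIGHT + lead


def survivors_keep_quorum(total: int, lost: int) -> bool:
    """Assumption: survivors must retain more than half of the prior weight."""
    return (total - lost) * 2 > total


total = total_weight(servers=2, locators=1)   # 10 + 10 + 5 + 3
lost_lead = SERVER_WEIGHT + LEAD_BONUS        # weight of the lead server

print(total, lost_lead, survivors_keep_quorum(total, lost_lead))
print(survivors_keep_quorum(total, SERVER_WEIGHT))  # losing the non-lead server
```

Under these assumptions, losing the lead server leaves 13 of 28, which is less than half, so the survivors shut down; losing the other server leaves 18 of 28 and the cluster carries on.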



On 1/8/2016 8:13 AM, Real Wes Williams wrote:

What’s the level of concern here about members getting kicked out prematurely 
depending on the newly proposed default settings?  For instance, if the default 
suspect notification is 3 seconds and they are running in AWS or a mildly 
unpredictable network environment, a member could be kicked out.  What would be 
considered “safe” settings?

On Jan 7, 2016, at 4:18 PM, Bruce Schuchardt <[email protected]> wrote:

Another thing that's been discussed for a long time is turning on 
network-partition-detection by default.   It is a major problem for someone if 
a partition occurs and they are using persistence.  The disk-stores on all but 
one of the partitions have to be deleted and revoked.
