Stephen,

> Nodes check on their neighbours and notify the remaining nodes if one disappears.

Could you explain how this works in detail?
How can I set/change check frequency?

On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <stephen.darling...@gridgain.com> wrote:

> This is one of the functions of the DiscoverySPI. Nodes check on their
> neighbours and notify the remaining nodes if one disappears. When the
> topology changes, it triggers a rebalance, which relocates primary
> partitions to live nodes. This is entirely transparent to clients.
>
> It gets more complex… like there’s the partition loss policy and
> rebalancing doesn’t always happen (configurable, persistence, etc)… but
> broadly it does as you expect.
>
> Regards,
> Stephen
>
>
> On 8 Apr 2020, at 08:40, Anton Vinogradov <a...@apache.org> wrote:
> >
> > Igniters,
> > Do we have a feature that checks node aliveness on a regular basis?
> >
> > Scenario:
> > Precondition
> > The cluster has no load, but some node's JVM has crashed.
> >
> > Expected actual
> > The user performs an operation (e.g. a cache put) related to this node (via
> > another node) and waits for some timeout to learn that it is dead.
> > The cluster starts the switch to relocate primary partitions to alive
> > nodes.
> > Now the user is able to retry the operation.
> >
> > Desired
> > Some WatchDog checks node aliveness on a regular basis.
> > Once a failure is detected, the cluster starts the switch.
> > Later, the user performs an operation on an already fixed cluster and
> > waits for nothing.
> >
> > It would be good news if the "Desired" case is already Actual.
> > Can somebody point to the feature that performs this check?
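
To make the quoted description concrete ("nodes check on their neighbours and notify the remaining nodes if one disappears"), here is a toy model of a ring-style neighbour check. It is only a sketch of the general idea, not Ignite's DiscoverySPI code: the class `RingWatchdog`, its methods, and the assumption that each node checks the next node in a ring are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of ring-style neighbour checking (hypothetical, not Ignite code). */
public class RingWatchdog {
    private final List<String> ring;                        // current topology, in ring order
    private final Set<String> crashed = new HashSet<>();    // nodes whose JVM has died
    private final List<String> events = new ArrayList<>();  // notifications sent to survivors

    public RingWatchdog(List<String> nodes) {
        this.ring = new ArrayList<>(nodes);
    }

    /** Simulates a JVM crash: the node stops answering checks. */
    public void crash(String node) {
        crashed.add(node);
    }

    /** One periodic round: every live node checks the next node in the ring. */
    public void checkRound() {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i < ring.size(); i++) {
            String checker = ring.get(i);
            String neighbour = ring.get((i + 1) % ring.size());
            if (!crashed.contains(checker) && crashed.contains(neighbour)) {
                failed.add(neighbour);
                events.add(checker + " reports " + neighbour + " failed");
            }
        }
        // Survivors drop the failed nodes; a real cluster would rebalance here.
        ring.removeAll(failed);
    }

    /** Returns a copy of the current topology. */
    public List<String> topology() {
        return new ArrayList<>(ring);
    }

    /** Returns a copy of the failure notifications recorded so far. */
    public List<String> events() {
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        RingWatchdog cluster = new RingWatchdog(List.of("A", "B", "C", "D"));
        cluster.crash("C");
        cluster.checkRound();
        System.out.println(cluster.topology()); // [A, B, D]
        System.out.println(cluster.events());   // [B reports C failed]
    }
}
```

In this model the failure is noticed by the periodic round itself, with no user operation involved, which is the "Desired" case from the original question. How often such a round runs is exactly the check frequency being asked about.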