Re: Live fallback/backup scenario

Pavel Tupitsyn Sat, 02 Jul 2022 10:30:28 -0700

> a) The persistent directory on the crashed server is still available
> (but most likely is not up-to-date with latest data from the other
> caches). Do I add the server to the base topology and it will
> automatically get the data that it missed during its downtime?


The restarted node is still a member of the baseline topology. You don't need
to do anything: the latest data will be synced automatically, and the
cluster continues to operate seamlessly.


> b) The server was wiped and the persistent directory is empty when it's
> restarted. Do I add the server to the base topology and it will
> automatically participate in the shared data caches to become a
> potential full backup in case another server crashes?

As long as IgniteConfiguration.consistentId is the same as before restart,
again, no manual steps are required. Baseline remains the same and wiped
data will be synced up from other nodes.

If consistentId changes, the node is considered new and is not a member of
the baseline. You can then add it to the baseline manually.
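For illustration, a minimal sketch of pinning the consistent ID in the Spring XML configuration, so that it stays the same across restarts even if the persistence directory is wiped. The value "server-node-1" is a placeholder I made up for this example; any value that is unique per server node and stable over time should do.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Placeholder value (assumption): choose one unique, stable ID per server node. -->
    <property name="consistentId" value="server-node-1"/>
</bean>
```

Each server node needs its own distinct value; reusing the same ID on two nodes would make them indistinguishable to the baseline.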


> How long does such data sync for the restarted server take? Is there an
> event for this?

The time depends on the data size and hardware, from seconds to minutes or
hours.
The event is CacheRebalancingEvent.

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/events/CacheRebalancingEvent.html
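Note that Ignite events are disabled by default, so the rebalancing event types have to be enabled in the node configuration before a listener can receive them. A sketch of how that might look in Spring XML, assuming the standard Spring "util" namespace is available; which of the rebalancing event types you actually need is up to you, the two below are just an example:

```xml
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/util
        http://www.springframework.org/schema/util/spring-util.xsd">

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Enable only the rebalancing events of interest (example selection). -->
        <property name="includeEventTypes">
            <list>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_REBALANCE_STARTED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_CACHE_REBALANCE_STOPPED"/>
            </list>
        </property>
    </bean>
</beans>
```

With these enabled, a listener registered for EVT_CACHE_REBALANCE_STOPPED can be used to tell when rebalancing of a cache has finished on a node.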

On Sat, Jul 2, 2022 at 3:40 PM <don.tequ...@gmx.de> wrote:

> Hi Igniters,
>
> can you please comment on or correct the scenario described below? I assume
> zero downtime and fault tolerance against data loss on node crashes are
> key features of Ignite, but the procedure in failure scenarios is not
> clear to me.
>
> Thanks!
>
> On 19.06.22 11:32, jay.et...@gmx.de wrote:
> > Hi,
> >
> > yes, your 4-point initial setup is how we've been operating our cluster
> so far. For maintenance we go in reverse order (leave out point 2).
> >
> > I'm very curious myself about the second part of your mail on how to
> correctly restore the cluster after a node crash.
> >
> > Can anyone comment on this? Any help appreciated here.
> >
> > Jay
> >
> >
> >
> > -----Original Message-----
> > From: don.tequ...@gmx.de <don.tequ...@gmx.de>
> > Sent: Saturday, 11 June 2022 14:33
> > To: user@ignite.apache.org
> > Subject: Live fallback/backup scenario
> >
> > Hi,
> >
> > I'm experimenting with an Ignite cluster with multiple server nodes and
> multiple client nodes. My understanding is that with Ignite I can avoid
> data loss of all persistent caches and can avoid downtime for all clients.
> >
> > If the above assumption is correct, how do I manage the servers and
> baseline topology for this scenario?
> >
> > Caches are configured with persistence enabled and:
> >               <property name="cacheMode" value="PARTITIONED" />
> >               <property name="backups" value="1" />
> >               <property name="atomicityMode" value="TRANSACTIONAL"/>
> >               <property name="writeSynchronizationMode" value="FULL_SYNC"/>
> >
> > Is this procedure correct for initial setup?
> >
> > 1. Start all server nodes and keep the cluster inactive.
> > 2. Once all server nodes are connected, set the baseline topology to all
> server nodes.
> > 3. Activate the cluster.
> > 4. Connect clients and start application operation with compute and
> persistent caches.
> >
> > Let's say one server node crashes, I can see operation continues without
> interruption and no data loss. However, what's the scenario after the
> crashed server node was restarted and connected again to the cluster? I can
> see it does not automatically become a member of the baseline topology.
> >
> > What's the correct procedure in the two below scenarios:
> >
> > a) The persistent directory on the crashed server is still available
> (but most likely is not up-to-date with latest data from the other caches).
> Do I add the server to the base topology and it will automatically get the
> data that it missed during its downtime?
> >
> > b) The server was wiped and the persistent directory is empty when it's
> restarted. Do I add the server to the base topology and it will
> automatically participate in the shared data caches to become a potential
> full backup in case another server crashes?
> >
> > How long does such data sync for the restarted server take? Is there an
> event for this?
> >
> > Thanks for some background on this scenario.
> >
>
