Hi Maxim,

I agree with Denis and I have just one concern here.

Apache Ignite has quite a long history (it started even before joining Apache), and
by now it has a very large number of features. Some of these features
- are actively developed and well known by community members,
- some were contributed a long time ago and nobody develops them anymore,
- and, in some rare cases, nobody in the community knows how they
work or how to change them.

Such features may attract users, but a bug in one of them can ruin the impression
of the product. Even worse, nobody can help solve it, and users can only be
encouraged to contribute a fix themselves.

My concern here is that such a big feature should have a number of interested
contributors who can support it in case others lose interest. I will be
happy if 3-5 members come and say: yes, I will do a review / I will help
with further changes.

Just to be clear, I'm not against it, and I'll never cast -1 for it, but it
would be more comfortable to develop this feature with the understanding that
this work will not be wasted.

Sincerely,
Dmitriy Pavlov

Wed, Feb 27, 2019 at 23:36, Denis Magda <dma...@apache.org>:

> Maxim,
>
> GridGain has this exact feature available for Ignite native persistence
> deployments. It's not as easy as it might seem from the outside. It took
> us many years and many engineers to make it production ready. If the rest
> of the community wants to create something similar and available in open
> source, then please take this estimate into consideration.
>
> -
> Denis
>
>
> On Wed, Feb 27, 2019 at 8:53 AM Maxim Muzafarov <maxmu...@gmail.com>
> wrote:
>
> > Igniters,
> >
> > Some of the stores with which Apache Ignite is often compared have
> > a feature called Snapshots [1] [2]. This feature provides an
> > eventually consistent view of stored data for different purposes (e.g.
> > moving data between environments, saving a backup of data for a
> > later restore procedure, and so on). Apache Ignite has all the
> > machinery needed to provide cache and/or data region
> > snapshots out of the box, but still doesn't have them.
> >
> > This issue derives from IEP-28 [5], on which I'm currently working
> > (partially described in section [6]). I would like to solve this
> > issue too and make Apache Ignite more attractive to use in a
> > production environment. I haven't investigated in-memory
> > caches yet, but for caches with persistence enabled we can do it
> > without any performance impact on cache operations (some additional IO
> > operations are needed to copy cache data to the backup store; a
> > copy-on-write technique is used here). We just need to use our
> > DiscoverySpi, PME and Checkpointer processes the right way.
> >
> > As a first step, we can store all backup data locally on each cache
> > affinity node. For instance, for a snapshot process with id
> > `snapshotId`, a `backup\snapshotId\cache0` folder will be created on
> > each local node and all `cache0` partitions will be stored there.
> > In the future, we can teach nodes to upload snapshotted
> > partitions to a single remote node or to the cloud.
> >
> > --
> >
> > High-level process overview
> >
> > A new snapshot process is managed via DiscoverySpi and
> > CommunicationSpi messages.
> >
> > 1. The initiator sends a request to the cluster (DiscoveryMessage).
> > 2. When a node receives the message, it initiates PME.
> > 3. The node begins the checkpoint process (holding the write lock for a
> > short time).
> > 4. The node starts tracking any write attempts to the snapshotted
> > partitions and places copies of the original pages in a temp file.
> > 5. The node merges the partition file with the corresponding delta.
> > 6. When the node finishes the backup process, it sends an ack message
> > with the saved partitions to the initiator (or an error response).
> > 7. When all ack messages are received, the backup is finished.
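Steps 4-5 above describe a copy-on-write scheme. Below is a minimal, self-contained sketch of that idea (the class and method names are hypothetical illustrations, not Ignite's actual API): while a snapshot is active, the original version of every overwritten page is saved to a delta, and the snapshot view is recovered by merging the live pages with that delta.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the copy-on-write snapshot idea, not Ignite code. */
public class CowSnapshotSketch {
    private final Map<Integer, byte[]> pages = new HashMap<>(); // live partition pages
    private final Map<Integer, byte[]> delta = new HashMap<>(); // pre-snapshot copies
    private boolean snapshotActive;

    /** Step 3-4: start tracking writes for the snapshot. */
    void beginSnapshot() {
        snapshotActive = true;
        delta.clear();
    }

    /** Step 4: before overwriting a page during a snapshot, save the original. */
    void writePage(int pageId, byte[] data) {
        if (snapshotActive && pages.containsKey(pageId))
            delta.putIfAbsent(pageId, pages.get(pageId)); // keep only the first (original) copy
        pages.put(pageId, data.clone());
    }

    /** Step 5: merge live pages with the delta to reconstruct the snapshot view. */
    Map<Integer, byte[]> snapshotView() {
        Map<Integer, byte[]> view = new HashMap<>(pages);
        view.putAll(delta); // delta pages override any post-snapshot writes
        snapshotActive = false;
        return view;
    }

    public static void main(String[] args) {
        CowSnapshotSketch store = new CowSnapshotSketch();
        store.writePage(1, new byte[] {1});
        store.beginSnapshot();
        store.writePage(1, new byte[] {2}); // overwritten after the snapshot began
        System.out.println(store.snapshotView().get(1)[0]); // snapshot still sees 1
    }
}
```

The point of the scheme is that regular writes pay only one extra page copy (and only the first time a page is touched during a snapshot), which is why the proposal can claim no performance impact on cache operations beyond some additional IO.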
> >
> > The only problem here is that if the request message arrives at a
> > particular node while a checkpoint is running, PME will be blocked until
> > it ends. This is not good, but hopefully it will be fixed in [4].
> >
> > --
> >
> > Probable API
> >
> > From the cache perspective:
> >
> > IgniteFuture<IgniteSnapshot> snapshotFut =
> >     ignite.cache("default")
> >         .snapshotter()
> >         .create("mySnapshotId");
> >
> > IgniteSnapshot cacheSnapshot = snapshotFut.get();
> >
> > IgniteCache<K, V> copiedCache =
> >     ignite.createCache("CopyCache")
> >         .withConfiguration(defaultCache.getConfiguration())
> >         .loadFromSnapshot(cacheSnapshot.id());
> >
> > From the command line perspective:
> >
> > control.sh --snapshot take cache0,cache1,cache2
> >
> > --
> >
> > WDYT?
> > Will it be a useful feature for Apache Ignite?
> >
> >
> > [1]
> >
> https://geode.apache.org/docs/guide/10/managing/cache_snapshots/chapter_overview.html
> > [2]
> >
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupTakesSnapshot.html
> > [3]
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/Data-Snapshots-in-Ignite-td4183.html
> > [4] https://issues.apache.org/jira/browse/IGNITE-10508
> > [5]
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing
> > [6]
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing#IEP-28:Clusterpeer-2-peerbalancing-Checkpointer
> >
>
