Hi, it looks like there is not much profit when PDS throttling is enabled and tuned according to the article [1].

I've benchmarked both solutions with the 'put' operation for 3 hours via Ignite Yardstick. I see quite similar results with the write-heavy pattern; most of the time PDS works ~10% faster. Only one thing looks strange: PDS shows degradation over time in comparison with RocksDB.

[1] https://apacheignite.readme.io/docs/durable-memory-tuning
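For context, below is a rough, self-contained sketch of the PDS side of the setup: native persistence with write throttling enabled, as recommended in [1], plus a simple put loop standing in for the Yardstick 'put' driver that was actually used. The class name, cache name, payload size, data region size and operation count are illustrative placeholders, not the real benchmark parameters.

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

/** Rough stand-in for the Yardstick 'put' driver: single node, PDS with write throttling. */
public class PdsPutBenchmarkSketch {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            // Throttle writes instead of stalling them when checkpointing can't keep up,
            // which is what the tuning article [1] suggests enabling.
            .setWriteThrottlingEnabled(true);

        storageCfg.getDefaultDataRegionConfiguration()
            .setPersistenceEnabled(true)
            .setMaxSize(4L * 1024 * 1024 * 1024); // illustrative region size, tune per host

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg)
            .setCacheConfiguration(new CacheConfiguration<Integer, byte[]>("bench")); // placeholder cache

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().active(true); // a persistent cluster starts inactive, so activate it

            IgniteCache<Integer, byte[]> cache = ignite.cache("bench");
            byte[] payload = new byte[1024]; // placeholder value size

            int ops = 1_000_000; // placeholder count; the real run lasted 3 hours
            long start = System.nanoTime();

            for (int i = 0; i < ops; i++)
                cache.put(i, payload);

            long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1_000_000);
            System.out.printf("put throughput: %.0f ops/sec%n", ops * 1000.0 / elapsedMs);
        }
    }
}

The RocksDB side used the CacheStore-based cache configuration quoted near the bottom of this thread.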
On Wed, Dec 6, 2017 at 9:24 PM, Valentin Kulichenko
<valentin.kuliche...@gmail.com> wrote:

> Vyacheslav,
>
> In this case the community should definitely take a look and investigate.
> Please share your results when you have a chance.
>
> -Val
>
> On Wed, Dec 6, 2017 at 1:45 AM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>
>> Evgeniy, as far as I understand, PDS and rebalancing are based on the
>> page-memory approach instead of the entry-based 3rd party persistence, so
>> I'm not sure how to extend the rebalancing behavior properly.
>>
>> Dmitry, performance is the only reason why I am trying to solve the
>> rebalancing issue.
>> I've benchmarked RocksDB as 3rd party persistence and PDS via Ignite
>> Yardstick, with "fsync" enabled in both cases.
>> The result shows that PDS is twice slower on the "put" operation on a
>> single node, but I haven't had time to do benchmarks on all sides.
>> I'll try to do that next week and will share the results if the community
>> is interested. Maybe there will be no reason for using RocksDB.
>>
>> On Fri, Nov 24, 2017 at 4:58 PM, Dmitry Pavlov <dpavlov....@gmail.com> wrote:
>>
>> > Please see the discussion on the user list. It seems that the same
>> > happened there:
>> >
>> > http://apache-ignite-users.70518.x6.nabble.com/Reassign-partitions-td7461.html#a7468
>> >
>> > It contains examples of when the data can diverge.
>> >
>> > Fri, 24 Nov 2017 at 16:42, Dmitry Pavlov <dpavlov....@gmail.com>:
>> >
>> >> If we compare native and 3rd party persistence (cache store):
>> >> - Updating and reading data from a DBMS is slower in most scenarios.
>> >> - A non-clustered DBMS is a single point of failure and is hard to scale.
>> >> - Ignite SQL does not extend to an external (3rd party persistence)
>> >> CacheStore (and queries ignore DBMS changes).
>> >>
>> >> Which is why I am wondering whether native persistence is applicable in
>> >> the case described by Vyacheslav.
>> >>
>> >> Fri, 24 Nov 2017 at 12:23, Evgeniy Ignatiev <yevgeniy.ignat...@gmail.com>:
>> >>
>> >>> Sorry, I linked the wrong page; the latter URL is not the example.
>> >>>
>> >>> On 11/24/2017 1:12 PM, Evgeniy Ignatiev wrote:
>> >>> > By the way, I remembered that there is an annotation, CacheLocalStore,
>> >>> > for marking exactly the CacheStore that is not distributed -
>> >>> > http://apache-ignite-developers.2346864.n4.nabble.com/CacheLocalStore-td734.html
>> >>> > - here is a short explanation, and this -
>> >>> > https://github.com/gridgain/gridgain-advanced-examples/blob/master/src/main/java/org/gridgain/examples/localstore/LocalRecoverableStoreExample.java
>> >>> > - is an example implementation.
>> >>> >
>> >>> > On 11/23/2017 4:42 PM, Dmitry Pavlov wrote:
>> >>> >> Hi Evgeniy,
>> >>> >>
>> >>> >> Technically it is, of course, possible, but still:
>> >>> >> - it is not simple at all;
>> >>> >> - IgniteCacheOffheapManager & IgniteWriteAheadLogManager are internal
>> >>> >> APIs, and the community can change any APIs here at any time.
>> >>> >>
>> >>> >> Vyacheslav,
>> >>> >>
>> >>> >> Why is Ignite Native Persistence not suitable for this case?
>> >>> >>
>> >>> >> Sincerely,
>> >>> >> Dmitriy Pavlov
>> >>> >>
>> >>> >> Thu, 23 Nov 2017 at 11:01, Evgeniy Ignatiev <yevgeniy.ignat...@gmail.com>:
>> >>> >>
>> >>> >>> As far as I remember from the last webinar I heard on Ignite Native
>> >>> >>> Persistence, it actually exposes some interfaces like
>> >>> >>> IgniteWriteAheadLogManager, PageStore, PageStoreManager, etc., with
>> >>> >>> the file-based implementation provided by Ignite being only one
>> >>> >>> possible approach, and users can create their own Native Persistence
>> >>> >>> variations. At least that is what was said by Denis Magda at the time.
>> >>> >>>
>> >>> >>> Maybe creating your own implementation of Ignite Native Persistence
>> >>> >>> rather than CacheStore-based persistence is an option here?
>> >>> >>>
>> >>> >>> On 11/23/2017 2:23 AM, Valentin Kulichenko wrote:
>> >>> >>>> Vyacheslav,
>> >>> >>>>
>> >>> >>>> There is no way to do this, and I'm not sure why you want to do this.
>> >>> >>>> Ignite persistence was developed to solve exactly the problems you're
>> >>> >>>> describing. Just use it :)
>> >>> >>>>
>> >>> >>>> -Val
>> >>> >>>>
>> >>> >>>> On Wed, Nov 22, 2017 at 12:36 AM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>> >>> >>>>
>> >>> >>>>> Valentin, Evgeniy, thanks for your help!
>> >>> >>>>>
>> >>> >>>>> Valentin, unfortunately, you are right.
>> >>> >>>>>
>> >>> >>>>> I've tested that behavior in the following scenario:
>> >>> >>>>> 1. Started N nodes and filled them with data
>> >>> >>>>> 2. Shut down one node
>> >>> >>>>> 3. Called rebalance directly and waited for it to finish
>> >>> >>>>> 4. Stopped all other (N-1) nodes
>> >>> >>>>> 5. Started N-1 nodes and validated the data
>> >>> >>>>>
>> >>> >>>>> Validation didn't pass - data consistency was broken. As you say, it
>> >>> >>>>> works only on stable topology.
>> >>> >>>>> As far as I understand, Ignite doesn't manage rebalancing in the
>> >>> >>>>> underlying storage; it became clear from the tests and your
>> >>> >>>>> description that the CacheStore design assumes the underlying
>> >>> >>>>> storage is shared by all the nodes in the topology.
>> >>> >>>>>
>> >>> >>>>> I understand that PDS is the best option in the case of distributed
>> >>> >>>>> persistence.
>> >>> >>>>> However, could you point me to the best way to override the default
>> >>> >>>>> rebalance behavior?
>> >>> >>>>> Maybe it's possible to extend it with a custom plugin?
>> >>> >>>>>
>> >>> >>>>> On Wed, Nov 22, 2017 at 1:35 AM, Valentin Kulichenko
>> >>> >>>>> <valentin.kuliche...@gmail.com> wrote:
>> >>> >>>>>> Vyacheslav,
>> >>> >>>>>>
>> >>> >>>>>> If you want the persistence storage to be *distributed*, then using
>> >>> >>>>>> Ignite persistence would be the easiest thing to do anyway, even if
>> >>> >>>>>> you don't need all its features.
>> >>> >>>>>>
>> >>> >>>>>> CacheStore indeed can be updated from different nodes, but the
>> >>> >>>>>> problem is in coordination. If instances of the store are not aware
>> >>> >>>>>> of each other, it's really hard to handle all rebalancing cases.
>> >>> >>>>>> Such a solution will work only on stable topology.
>> >>> >>>>>>
>> >>> >>>>>> Having said that, if you can have one instance of RocksDB (or any
>> >>> >>>>>> other DB for that matter) that is accessed via the network by all
>> >>> >>>>>> nodes, then it's also an option. But in this case the storage is
>> >>> >>>>>> not distributed.
>> >>> >>>>>>
>> >>> >>>>>> -Val
>> >>> >>>>>>
>> >>> >>>>>> On Tue, Nov 21, 2017 at 4:37 AM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>> >>> >>>>>>
>> >>> >>>>>>> Valentin,
>> >>> >>>>>>>
>> >>> >>>>>>>>> Why don't you use Ignite persistence [1]?
>> >>> >>>>>>> I have a use case in one of the projects that needs RAM-to-disk
>> >>> >>>>>>> replication only. The other PDS features aren't needed.
>> >>> >>>>>>> During the first assessment, persisting to RocksDB works faster.
>> >>> >>>>>>>
>> >>> >>>>>>>>> CacheStore design assumes that the underlying storage is shared
>> >>> >>>>>>>>> by all the nodes in topology.
>> >>> >>>>>>> This is a very important note.
>> >>> >>>>>>> I'm a bit confused because I thought that each node in the cluster
>> >>> >>>>>>> persists the partitions for which the node is either primary or
>> >>> >>>>>>> backup, like in PDS.
>> >>> >>>>>>>
>> >>> >>>>>>> My RocksDB implementation supports working with one DB instance
>> >>> >>>>>>> shared by all the nodes in the topology, but that would make no
>> >>> >>>>>>> sense for an embedded fast storage.
>> >>> >>>>>>>
>> >>> >>>>>>> Is there any link to a detailed description of the CacheStore
>> >>> >>>>>>> design, or any other advice?
>> >>> >>>>>>> Thanks in advance.
>> >>> >>>>>>>
>> >>> >>>>>>> On Fri, Nov 17, 2017 at 9:07 PM, Valentin Kulichenko
>> >>> >>>>>>> <valentin.kuliche...@gmail.com> wrote:
>> >>> >>>>>>>> Vyacheslav,
>> >>> >>>>>>>>
>> >>> >>>>>>>> CacheStore design assumes that the underlying storage is shared by
>> >>> >>>>>>>> all the nodes in topology. Even if you delay rebalancing on node
>> >>> >>>>>>>> stop (which is possible via CacheConfiguration#rebalanceDelay), I
>> >>> >>>>>>>> doubt it will solve all your consistency issues.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Why don't you use Ignite persistence [1]?
>> >>> >>>>>>>>
>> >>> >>>>>>>> [1] https://apacheignite.readme.io/docs/distributed-persistent-store
>> >>> >>>>>>>>
>> >>> >>>>>>>> -Val
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Fri, Nov 17, 2017 at 4:24 AM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>>> Hi Andrey! Thank you for answering.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>>> Key to partition mapping shouldn't depend on topology, and
>> >>> >>>>>>>>>>> shouldn't change on unstable topology.
>> >>> >>>>>>>>> Key to partition mapping doesn't depend on topology in my test
>> >>> >>>>>>>>> affinity function. It only depends on the number of partitions.
>> >>> >>>>>>>>> But partition to node mapping depends on topology, and at cluster
>> >>> >>>>>>>>> stop, when one node leaves the topology, some partitions may be
>> >>> >>>>>>>>> moved to other nodes.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>>> Do all nodes share the same RocksDB database or does each node
>> >>> >>>>>>>>>>> have its own copy?
>> >>> >>>>>>>>> Each Ignite node has its own RocksDB instance.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>>> Would you please share the configuration?
>> >>> >>>>>>>>> It's pretty simple:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> IgniteConfiguration cfg = new IgniteConfiguration();
>> >>> >>>>>>>>> cfg.setIgniteInstanceName(instanceName);
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>();
>> >>> >>>>>>>>> cacheCfg.setName(TEST_CACHE_NAME);
>> >>> >>>>>>>>> cacheCfg.setCacheMode(CacheMode.PARTITIONED);
>> >>> >>>>>>>>> cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.PRIMARY_SYNC);
>> >>> >>>>>>>>> cacheCfg.setBackups(1);
>> >>> >>>>>>>>> cacheCfg.setAffinity(new TestAffinityFunction(partitionsNumber, backupsNumber));
>> >>> >>>>>>>>> cacheCfg.setWriteThrough(true);
>> >>> >>>>>>>>> cacheCfg.setReadThrough(true);
>> >>> >>>>>>>>> cacheCfg.setRebalanceMode(CacheRebalanceMode.SYNC);
>> >>> >>>>>>>>> cacheCfg.setCacheStoreFactory(new RocksDBCacheStoreFactory<>("/test/path/to/persistence", TEST_CACHE_NAME, cfg));
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> cfg.setCacheConfiguration(cacheCfg);
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Could you give me advice on which places I need to pay attention to?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> On Wed, Nov 15, 2017 at 3:02 PM, Andrey Mashenkov
>> >>> >>>>>>>>> <andrey.mashen...@gmail.com> wrote:
>> >>> >>>>>>>>>> Hi Vyacheslav,
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> Key to partition mapping shouldn't depend on topology, and
>> >>> >>>>>>>>>> shouldn't change on unstable topology.
>> >>> >>>>>>>>>> Looks like you've missed something.
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> Would you please share the configuration?
>> >>> >>>>>>>>>> Do all nodes share the same RocksDB database or does each node
>> >>> >>>>>>>>>> have its own copy?
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> On Wed, Nov 15, 2017 at 12:22 AM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>>> Hi, Igniters!
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> I'm using a partitioned Ignite cache with RocksDB as a 3rd party
>> >>> >>>>>>>>>>> persistence store.
>> >>> >>>>>>>>>>> I've got an issue: if cache rebalancing is switched on, then it's
>> >>> >>>>>>>>>>> possible to lose some data.
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> Basic scenario:
>> >>> >>>>>>>>>>> 1) Start an Ignite cluster and fill a cache with RocksDB persistence;
>> >>> >>>>>>>>>>> 2) Stop all nodes;
>> >>> >>>>>>>>>>> 3) Start the Ignite cluster and validate the data.
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> This works fine while rebalancing is switched off.
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> If rebalancing is switched on: when I call Ignition#stopAll, the
>> >>> >>>>>>>>>>> nodes go down sequentially, and while one node is going down,
>> >>> >>>>>>>>>>> another starts rebalancing. When the nodes are started, the
>> >>> >>>>>>>>>>> affinity function works with the full set of nodes and may define
>> >>> >>>>>>>>>>> a wrong partition for a key, because the previous state was
>> >>> >>>>>>>>>>> changed by rebalancing.
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> Maybe I'm doing something wrong. How can I avoid rebalancing
>> >>> >>>>>>>>>>> while stopping all nodes in the cluster?
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> Could you give me any advice, please?
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> --
>> >>> >>>>>>>>>>> Best Regards, Vyacheslav D.
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> --
>> >>> >>>>>>>>>> Best regards,
>> >>> >>>>>>>>>> Andrey V. Mashenkov
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> --
>> >>> >>>>>>>>> Best Regards, Vyacheslav D.
>> >>> >>>>>>>
>> >>> >>>>>>> --
>> >>> >>>>>>> Best Regards, Vyacheslav D.
>> >>> >>>>>
>> >>> >>>>> --
>> >>> >>>>> Best Regards, Vyacheslav D.
>>
>> --
>> Best Regards, Vyacheslav D.

--
Best Regards, Vyacheslav D.