Pavel,

I don't like the idea of creating a separate cache for each data structure,
especially for collocated ones. Actually, I'm not sure I understand how
that would help. It sounds like we just need to properly persist the
data structures cache and then reload it on restart.
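
For illustration only, a minimal sketch of what "properly persist" could
mean here: keeping the backing caches in a persistence-enabled data region.
The region and class names below are made up, and whether the internal
data structures caches can be mapped to such a region is an assumption.

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistentDataStructuresExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Keep the backing caches in a persistence-enabled region so their
        // contents are reloaded from disk after a restart.
        cfg.setDataStorageConfiguration(new DataStorageConfiguration()
            .setDataRegionConfigurations(new DataRegionConfiguration()
                .setName("dataStructuresRegion")   // hypothetical region name
                .setPersistenceEnabled(true)));

        Ignite ignite = Ignition.start(cfg);

        // A cluster with persistence needs explicit activation after restart.
        ignite.cluster().active(true);
    }
}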

-Val

On Thu, Feb 8, 2018 at 6:12 AM, Pavel Pereslegin <xxt...@gmail.com> wrote:

> Hello, Igniters!
>
> We have some issues with the current IgniteSet implementation ([1], [2],
> [3], [4]).
>
> As was already described in this conversation, the main problem is
> that the current IgniteSet implementation maintains plain Java sets on
> every node (see CacheDataStructuresManager.setDataMap). These sets
> duplicate backing-cache entries, both primary and backup. size() and
> iterator() calls issue distributed queries to collect/filter data from
> all setDataMaps.
>
> I believe we can solve these issues if each instance of IgniteSet has
> its own internal cache that is destroyed on close.
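>
> For illustration, a rough sketch of what a per-set dedicated cache could
> look like on top of public APIs only (class and cache names below are
> made up; the real change would of course live inside the data structures
> code):
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.configuration.CacheConfiguration;
>
> public class SeparateCacheSet<T> implements AutoCloseable {
>     private final IgniteCache<T, Boolean> cache;
>
>     public SeparateCacheSet(Ignite ignite, String setName) {
>         // One dedicated cache per set instance; no node-local setDataMap copies.
>         cache = ignite.getOrCreateCache(
>             new CacheConfiguration<T, Boolean>("ds-set-" + setName).setBackups(1));
>     }
>
>     public boolean add(T item)      { return cache.putIfAbsent(item, Boolean.TRUE); }
>     public boolean remove(T item)   { return cache.remove(item); }
>     public boolean contains(T item) { return cache.containsKey(item); }
>
>     // size() and iteration are delegated to the cache, no extra local maps.
>     public int size() { return cache.size(); }
>
>     @Override public void close() {
>         cache.destroy();   // drops the set's data cluster-wide
>     }
> }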
>
> What do you think about such a major change? Do you have any thoughts
> or objections?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-7565
> [2] https://issues.apache.org/jira/browse/IGNITE-5370
> [3] https://issues.apache.org/jira/browse/IGNITE-5553
> [4] https://issues.apache.org/jira/browse/IGNITE-6474
>
>
> 2017-10-31 5:53 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > Hi Andrey,
> >
> > Thanks for a detailed email. I think your suggestions do make sense.
> > Ignite cannot afford to have a distributed set that is not fail-safe.
> > Can you please focus only on solutions that provide consistent behavior
> > in case of topology changes and failures and document them in the
> > ticket?
> >
> > https://issues.apache.org/jira/browse/IGNITE-5553
> >
> > D.
> >
> > On Mon, Oct 30, 2017 at 3:07 AM, Andrey Kuznetsov <stku...@gmail.com>
> > wrote:
> >
> >> Hi, Igniters!
> >>
> >> The current implementation of IgniteSet is fragile with respect to
> >> cluster recovery from a checkpoint. We have an issue (IGNITE-5553)
> >> that addresses the set's size() behavior, but the problem is slightly
> >> broader. The text below is my comment from the Jira issue. I encourage
> >> you to discuss it.
> >>
> >> We can put the current set size into the set header cache entry. This
> >> would fix size(), but the iterator() implementation is broken as well.
> >>
> >> Currently, the set implementation maintains plain Java sets on every
> >> node, see CacheDataStructuresManager.setDataMap. These sets duplicate
> >> backing-cache entries, both primary and backup. size() and iterator()
> >> calls issue distributed queries to collect/filter data from all
> >> setDataMaps. And the setDataMaps remain empty after the cluster is
> >> recovered from a checkpoint.
> >>
> >> Now I see the following options to fix the issue.
> >>
> >> #1 - Naive. Iterate over all datastructure-backing cache entries during
> >> the recover-from-checkpoint procedure, filter out the set-related
> >> entries, and refill the setDataMaps (rough sketch below).
> >> Pros: easy to implement.
> >> Cons: unpredictable time/memory overhead.
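> >>
> >> A sketch of the refill step for option #1, assuming an Ignite instance
> >> `ignite` is at hand. SetItemKey and the local map below are placeholders
> >> standing in for the internal classes, not the real API:
> >>
> >> Map<IgniteUuid, Set<Object>> setDataMap = new ConcurrentHashMap<>();
> >>
> >> for (String cacheName : ignite.cacheNames()) {
> >>     IgniteCache<Object, Object> cache = ignite.cache(cacheName);
> >>
> >>     // Re-scan entries stored on this node (primary and backup) after recovery.
> >>     for (Cache.Entry<Object, Object> e :
> >>         cache.localEntries(CachePeekMode.PRIMARY, CachePeekMode.BACKUP)) {
> >>         if (e.getKey() instanceof SetItemKey) {   // keep only set-related entries
> >>             SetItemKey key = (SetItemKey)e.getKey();
> >>
> >>             setDataMap.computeIfAbsent(key.setId(),
> >>                 id -> ConcurrentHashMap.newKeySet()).add(key.item());
> >>         }
> >>     }
> >> }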
> >>
> >> #2 - More realistic. Avoid node-local copies of cache data. Maintain a
> >> linked list in the datastructure-backing cache: the key is a set item,
> >> the value is the next set item. The list head is stored in the set
> >> header cache entry (that item is the youngest one). Iterators built on
> >> top of this structure are fail-fast (see the sketch below).
> >> Pros: less memory overhead, no need to maintain node-local mirrors of
> >> cache data.
> >> Cons: iterators are not fail-safe.
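> >>
> >> A rough sketch of the option #2 layout and an add() under an explicit
> >> transaction. All class names are illustrative, and the surrounding
> >> wiring (ignite, cache, setId, item) is assumed; the cache would have to
> >> be TRANSACTIONAL:
> >>
> >> // Layout in the datastructure-backing cache:
> >> //   SetHeaderKey(setId)     -> SetHeader { head item (youngest), size }
> >> //   SetItemKey(setId, item) -> next item in the list (null at the tail)
> >>
> >> try (Transaction tx = ignite.transactions().txStart()) {
> >>     SetHeader hdr = (SetHeader)cache.get(new SetHeaderKey(setId));
> >>
> >>     if (!cache.containsKey(new SetItemKey(setId, item))) {
> >>         // Link the new item in front of the current head and bump the size.
> >>         cache.put(new SetItemKey(setId, item), hdr.head());
> >>         cache.put(new SetHeaderKey(setId), new SetHeader(item, hdr.size() + 1));
> >>     }
> >>
> >>     tx.commit();
> >> }
> >>
> >> // iterator() walks the chain from the header; if a concurrent remove
> >> // unlinks an entry mid-walk, the walk breaks, i.e. fail-fast as noted above.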
> >>
> >> #3 - Option #2 modified. We can store a reference counter and a
> >> 'removed' flag along with the next-item reference (see the sketch
> >> below). This makes iterators fail-safe.
> >> Pros: iterators are fail-safe.
> >> Cons: slightly more complicated implementation, may affect performance;
> >> also, I see no way to handle active iterators on remote node failures.
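> >>
> >> The per-item value for option #3 could look roughly like this (names
> >> are illustrative only):
> >>
> >> class SetItemValue {
> >>     Object next;      // next item in the linked list (null at the tail)
> >>     int refCnt;       // number of active iterators parked on this item
> >>     boolean removed;  // logically removed; physically unlinked once refCnt == 0
> >> }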
> >>
> >>
> >> Best regards,
> >>
> >> Andrey.
> >>
>
