> In my case, values are immutable - I never change them; I just add a new entry for each newer version. Does that mean I won't have any duplicates between the initial query and listener entries when using continuous queries on caches supporting MVCC?

I'm afraid there still might be a race. Val, Vladimir, other Ignite experts, please confirm.

> After reading the related thread (
> http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> ) I'm now concerned about the ordering. My case assumes that there are groups of entries which belong to a business aggregate object, and I would like to make sure that if I commit two records in two serial transactions, I receive the notifications in the same order. Those entries will have different keys, so based on what you said ("we'd better to leave things as is and guarantee only per-key ordering"), it would seem that the order is not guaranteed. But do you think it would be possible to guarantee order when those entries share the same affinity key and belong to the same partition?

The order should be the same for key-value transactions. Vladimir, could you clarify the MVCC-based behavior?

--
Denis

On Mon, Dec 17, 2018 at 9:55 AM Piotr Romański <piotr.roman...@gmail.com> wrote:

> Hi all, sorry for answering so late.
>
> I would like to use SqlQuery because I can leverage indexes there.
>
> As was already mentioned earlier, the partition update counter is exposed through CacheQueryEntryEvent. Initially, I thought that the partition update counter was something that's persisted together with the data, but I'm guessing now that it is only a part of the notification mechanism.
>
> I imagined that I would be able to implement my own deduplication by having 3 stages on the client side: 1. Keep processing initial query results and store their keys in memory, 2. When the initial query is over, process listener entries, but first check whether they have already been delivered in the first stage, 3.
> When we are sure that we are already processing notifications for commits executed after the initial query was done, we can process listener entries without any additional checks (so the key set from stage 1 can be removed from memory). The problem is that I have no way to tell when I can move from stage 2 to stage 3. Another problem is that we need to stash listener entries while still processing initial query results, which causes excessive memory pressure on our client.
>
> In my case, values are immutable - I never change them; I just add a new entry for each newer version. Does that mean I won't have any duplicates between the initial query and listener entries when using continuous queries on caches supporting MVCC?
>
> After reading the related thread (
> http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> ) I'm now concerned about the ordering. My case assumes that there are groups of entries which belong to a business aggregate object, and I would like to make sure that if I commit two records in two serial transactions, I receive the notifications in the same order. Those entries will have different keys, so based on what you said ("we'd better to leave things as is and guarantee only per-key ordering"), it would seem that the order is not guaranteed. But do you think it would be possible to guarantee order when those entries share the same affinity key and belong to the same partition?
>
> Piotr
>
> On Fri, Dec 14, 2018, 19:31 Denis Magda <dma...@apache.org> wrote:
>
> > Vladimir,
> >
> > Thanks for referring to the MVCC and Continuous Queries discussion, I knew that I saw us discussing a solution to the duplication problem. Let me copy and paste it here for others:
> >
> > > 2) *Initial query*. We implemented it so that the user can get some initial data snapshot and then start receiving events. Without MVCC we have no guarantees of visibility. E.g.
> > > if a key is updated from V1 to V2, it is possible to see V2 in the initial query and in an event. With MVCC it is now technically possible to query data on a certain snapshot and then receive only events that happened after this snapshot, so that we never see V2 twice. Do you think this feature will be interesting for our users?
> >
> > Am I right that this would be a generic solution - whether you use a Scan or SQL query as the initial one? Have we planned it for the transactional SQL GA, or is it out of scope for now?
> >
> > --
> > Denis
> >
> > On Thu, Dec 13, 2018 at 12:40 PM Vladimir Ozerov <voze...@gridgain.com> wrote:
> >
> > > [1] http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> > >
> > > On Thu, Dec 13, 2018 at 11:38 PM Vladimir Ozerov <voze...@gridgain.com> wrote:
> > >
> > > > Denis,
> > > >
> > > > Not really. They are used to ensure that the ordering of notifications is consistent with the ordering of updates, so that when a key K is updated to V1, then V2, then V3, you never observe V1 -> V3 -> V2. It also solves the duplicate notification problem in case of node failures, when the same update is delivered twice.
> > > >
> > > > However, partition counters are unable to solve the duplicates problem in general. Essentially, the question is how to get a consistent view of some data plus all notifications which happened afterwards. There are only two ways to achieve this - either lock entries during the initial query, or take a kind of consistent data snapshot. The former was never implemented in Ignite - our Scan and SQL queries do not use locking. The latter is achievable in theory with MVCC. I raised that question earlier [1] (see p.2), and we came to the conclusion that it might be a good feature for the product.
> > > > It is not implemented that way for MVCC now, but most probably it is not extraordinarily difficult to implement.
> > > >
> > > > Vladimir.
> > > >
> > > > [1] http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html#a33998
> > > >
> > > > On Thu, Dec 13, 2018 at 11:17 PM Denis Magda <dma...@apache.org> wrote:
> > > >
> > > >> Vladimir,
> > > >>
> > > >> The partition counter is supposed to be used internally to solve the duplication issue. Does it sound like the right approach then?
> > > >>
> > > >> What would be an approach for SQL queries? Not sure the partition counter is applicable.
> > > >>
> > > >> --
> > > >> Denis
> > > >>
> > > >> On Thu, Dec 13, 2018 at 11:16 AM Vladimir Ozerov <voze...@gridgain.com> wrote:
> > > >>
> > > >> > The partition counter is an internal implementation detail, which has no sensible meaning to end users. It should not be exposed through the public API.
> > > >> >
> > > >> > On Thu, Dec 13, 2018 at 10:14 PM Denis Magda <dma...@apache.org> wrote:
> > > >> >
> > > >> > > Hello Piotr,
> > > >> > >
> > > >> > > That's a known problem and I thought a JIRA ticket already existed. However, I failed to locate it. A ticket for the improvement should be created as a result of this conversation.
> > > >> > >
> > > >> > > Speaking of the initial query type, I would differentiate between ScanQueries and SqlQueries. For the former, it sounds reasonable to apply the partitionCounter logic. As for the latter, Vladimir Ozerov, will it be addressed as part of the MVCC/Transactional SQL activities?
> > > >> > >
> > > >> > > Btw, Piotr, what's your initial query type?
> > > >> > >
> > > >> > > --
> > > >> > > Denis
> > > >> > >
> > > >> > > On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański <piotr.roman...@gmail.com> wrote:
> > > >> > >
> > > >> > > > Hi, as suggested by Ilya here:
> > > >> > > > http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html
> > > >> > > > I'm resending it to the developers list.
> > > >> > > >
> > > >> > > > From that thread we know that there might be duplicates between the initial query results and the listener entries received as part of a continuous query. That means that users need to manually dedupe data.
> > > >> > > >
> > > >> > > > In my opinion, manual deduplication may in some use cases lead to memory problems on the client side. In order to remove duplicated notifications which we receive in the local listener, we need to keep all initial query results in memory (or at least their unique ids). Unfortunately, there is no way (is there?) to find a point in time when we can be sure that no more dups will arrive. That would mean that we need to keep that data indefinitely and use it every time a new notification arrives. In case of multiple continuous queries run from a single JVM, this might eventually become a memory or performance problem. I can see the following possible improvements to Ignite:
> > > >> > > >
> > > >> > > > 1. The deduplication between the initial query and incoming notifications could be done fully in Ignite.
> > > >> > > > As far as I know there is already the updateCounter and partition id for all the objects, so they could be used internally.
> > > >> > > >
> > > >> > > > 2. Add a guarantee that notifications arriving in the local listener after the query() method returns are not duplicates. This kind of functionality would require specific synchronization inside Ignite. It would also mean that the query() method cannot return before all potential duplicates are processed by a local listener, which looks wrong.
> > > >> > > >
> > > >> > > > 3. Notify users that starting from a given notification they can be sure they will not receive any more duplicates. This could be an additional boolean flag in the CacheQueryEntryEvent.
> > > >> > > >
> > > >> > > > 4. CacheQueryEntryEvent already exposes the partitionUpdateCounter. Unfortunately we don't have this information for initial query results. If we had, a client could manually deduplicate notifications and get rid of the initial query results for a given partition after newer notifications arrive. Also, it would be very convenient to expose the partition id as well, but for now we can figure it out using the affinity service. The assumption here is that notifications are ordered by partitionUpdateCounter (is it true?).
> > > >> > > >
> > > >> > > > Please correct me if I'm missing anything.
> > > >> > > >
> > > >> > > > What do you think?
> > > >> > > >
> > > >> > > > Piotr
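
For illustration only, the client-side deduplication described in proposal 4 of the original message could look roughly like the sketch below. It is plain Java with no Ignite dependency, and it rests on two assumptions that the thread leaves open: that initial query results would carry a partition id and an update counter (they do not today), and that notifications within one partition arrive in increasing counter order. The class and method names (InitialQueryDeduper, onInitialResult, onNotification, pendingKeys) are hypothetical and are not part of any Ignite API.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical client-side helper, not an Ignite class.
 * Assumes initial query results expose (partition, updateCounter) and that
 * notifications within a partition arrive in increasing counter order.
 * Not thread-safe; callers would need their own synchronization.
 */
final class InitialQueryDeduper<K> {
    // partition -> (key -> update counter observed in the initial query)
    private final Map<Integer, Map<K, Long>> pending = new HashMap<>();

    /** Remember the counter that an initial query result was read at. */
    void onInitialResult(int partition, K key, long updateCounter) {
        pending.computeIfAbsent(partition, p -> new HashMap<>())
               .merge(key, updateCounter, Long::max);
    }

    /**
     * Returns true if the notification should be processed, false if the
     * initial query already reflected this update (or a newer one) for the key.
     */
    boolean onNotification(int partition, K key, long updateCounter) {
        Map<K, Long> keys = pending.get(partition);
        if (keys == null) {
            return true; // nothing left to compare against for this partition
        }
        Long seen = keys.get(key);
        boolean duplicate = seen != null && updateCounter <= seen;

        // Under the ordering assumption, later notifications in this partition
        // have larger counters, so older initial-query entries can be dropped.
        keys.values().removeIf(c -> c <= updateCounter);
        if (keys.isEmpty()) {
            pending.remove(partition);
        }
        return !duplicate;
    }

    /** Number of initial-query keys still held in memory. */
    int pendingKeys() {
        return pending.values().stream().mapToInt(Map::size).sum();
    }
}
```

With this shape, the memory held for a partition shrinks as notifications with higher counters arrive, which is the point of the proposal. It does not address the general duplicates problem Vladimir describes, which would need a consistent snapshot on the server side.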