Ready for review:
https://github.com/apache/ignite/pull/8870

On Sun, Mar 21, 2021 at 8:09 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:

> Simple benchmark added - see JmhCacheAsyncListenBenchmark in the PR.
> There is a 6-8% drop (1 client, 2 servers, 1 machine, int key/val).
> I expect this difference to become barely observable on real-world
> workloads.
>
> On Thu, Mar 18, 2021 at 12:35 PM Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
>> Denis,
>>
>> For a reproducer, please see CacheAsyncContinuationExecutorTest.java in
>> the linked PoC [1]
>>
>> [1]
>> https://github.com/apache/ignite/pull/8870/files#diff-c788c12013622093df07d8f628a6e8c5fb0c15007f0787f2d459dbb3e377fc5aR54
>>
>> On Thu, Mar 18, 2021 at 1:58 AM Raymond Wilson <
>> raymond_wil...@trimble.com> wrote:
>>
>>> We implemented the ContinueWith() suggestion from Pavel, and it works
>>> well
>>> so far in testing, though we do not have data to support if there is or
>>> is
>>> not a performance penalty in our use case..
>>>
>>> To lend another vote to the 'Don't do continuations on the striped thread
>>> pool' line of thinking: Deadlocking is an issue as is starvation. In some
>>> ways starvation is more insidious because by the time things stop working
>>> the cause and effect distance may be large.
>>>
>>> I appreciate the documentation does make statements about not performing
>>> cache operations in a continuation due to deadlock possibilities, but
>>> that
>>> statement does not reveal why this is an issue. It is less a case of a
>>> async cache operation being followed by some other cache operation (an
>>> immediate issue), and more a general case of the continuation of
>>> application logic using a striped pool thread in a way that might mean
>>> that
>>> thread is never given back - it's now just a piece of the application
>>> infrastructure until some other application activity schedules away from
>>> that thread (eg: by ContinueWith or some other async operation in the
>>> application code that releases the thread). To be fair, beyond structures
>>> like ContinueWith(), it is not obvious how that continuation thread
>>> should
>>> be handed back. This will be the same behaviour for dedicated
>>> continuation
>>> pools, but with far less risk in the absence of ContinueWith()
>>> constructs.
>>>
>>> In the .Net world this is becoming more of an issue as fewer .Net use
>>> cases
>>> outside of UI bother with synchronization contexts by default.
>>>
>>> Raymond.
>>>
>>>
>>> On Thu, Mar 18, 2021 at 9:56 AM Valentin Kulichenko <
>>> valentin.kuliche...@gmail.com> wrote:
>>>
>>> > Hi Denis,
>>> >
>>> > I think Pavel's main point is that behavior is unpredictable. For
>>> example,
>>> > AFAIK, putAsync can be executed in the main thread instead of the
>>> striped
>>> > pool thread if the operation is local. The listener can also be
>>> executed in
>>> > the main thread - this happens if the future is completed prior to
>>> listener
>>> > invocation (this is actually quite possible in the unit test
>>> environment
>>> > causing the test to pass). Finally, I'm pretty sure there are many
>>> cases
>>> > when a deadlock does not occur right away, but instead it will reveal
>>> > itself under high load due to thread starvation. The latter type of
>>> issues
>>> > is the most dangerous because they are often reproduced only in
>>> production.
>>> > Finally, there are performance considerations as well - cache
>>> operations
>>> > and listeners share the same fixed-sized pool which can have negative
>>> > effects.
>>> >
>>> > I'm OK with the change. Although, it might be better to introduce a new
>>> > fixed-sized pool instead of ForkJoinPool for listeners, simply because
>>> this
>>> > is the approach taken throughout the project. But this is up to a
>>> debate.
>>> >
>>> > -Val
>>> >
>>> > On Wed, Mar 17, 2021 at 11:31 AM Denis Garus <garus....@gmail.com>
>>> wrote:
>>> >
>>> > > Pavel,
>>> > > I tried this:
>>> > >
>>> > > @Test
>>> > > public void test() throws Exception {
>>> > >     IgniteCache<Integer, String> cache =
>>> > > startGrid().getOrCreateCache("test_cache");
>>> > >
>>> > >     cache.putAsync(1, "one").listen(f -> cache.replace(1, "two"));
>>> > >
>>> > >     assertEquals("two", cache.get(1));
>>> > > }
>>> > >
>>> > > and this test is green.
>>> > > I believe that an user can make listener that leads to deadlock, but
>>> > > the example in the IEP does not reflect this.
>>> > >
>>> > >
>>> > >
>>> > > ср, 17 мар. 2021 г. в 17:36, Вячеслав Коптилин <
>>> slava.kopti...@gmail.com
>>> > >:
>>> > >
>>> > > > Hi Pavel,
>>> > > >
>>> > > > > Not a good excuse really. We have a usability problem, you have
>>> to
>>> > > admit
>>> > > > it.
>>> > > > Fair enough. I agree that this is a usability issue, but I have
>>> doubts
>>> > > that
>>> > > > the proposed approach to overcome it is the best one.
>>> > > >
>>> > > > > Documentation won't help - no one is going to read the Javadoc
>>> for a
>>> > > > trivial method like putAsync
>>> > > > That is sad... However, I don't think that this is a strong
>>> argument
>>> > > here.
>>> > > >
>>> > > > This is just my opinion. Let's see what other community members
>>> have to
>>> > > > say.
>>> > > >
>>> > > > Thanks,
>>> > > > S.
>>> > > >
>>> > > >
>>> > > > ср, 17 мар. 2021 г. в 17:01, Pavel Tupitsyn <ptupit...@apache.org
>>> >:
>>> > > >
>>> > > > > > the user should use the right API
>>> > > > >
>>> > > > > Not a good excuse really. We have a usability problem, you have
>>> to
>>> > > admit
>>> > > > > it.
>>> > > > > "The brakes did not work on your car - too bad, you should have
>>> known
>>> > > > that
>>> > > > > on Sundays only your left foot is allowed on the pedal"
>>> > > > >
>>> > > > > This particular use case is too intricate.
>>> > > > > Even when you know about that, it is difficult to decide what
>>> can run
>>> > > on
>>> > > > > the striped pool,
>>> > > > > and what can't. It is too easy to forget.
>>> > > > > And most people don't know, even among Ignite developers.
>>> > > > >
>>> > > > > Documentation won't help - no one is going to read the Javadoc
>>> for a
>>> > > > > trivial method like putAsync.
>>> > > > >
>>> > > > >
>>> > > > > So I propose to have a safe default.
>>> > > > > Then document the performance tuning opportunity on [1].
>>> > > > >
>>> > > > > Think about how many users abandon a product because it
>>> mysteriously
>>> > > > > crashes and hangs.
>>> > > > >
>>> > > > > [1]
>>> > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips
>>> > > > >
>>> > > > >
>>> > > > > On Wed, Mar 17, 2021 at 4:21 PM Вячеслав Коптилин <
>>> > > > > slava.kopti...@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > Hi Pavel,
>>> > > > > >
>>> > > > > > Well, I think that the user should use the right API instead of
>>> > > > > introducing
>>> > > > > > uncontested overhead for everyone.
>>> > > > > > For instance, the code that is provided by IEP can changed as
>>> > > follows:
>>> > > > > >
>>> > > > > > IgniteFuture fut = cache.putAsync(1, 1);
>>> > > > > > fut.listenAync(f -> {
>>> > > > > >     // Executes on Striped pool and deadlocks.
>>> > > > > >     cache.replace(1, 2);
>>> > > > > > }, ForkJoinPool.commonPool());
>>> > > > > >
>>> > > > > > Of course, it does not mean that this fact should not be
>>> properly
>>> > > > > > documented.
>>> > > > > > Perhaps, I am missing something.
>>> > > > > >
>>> > > > > > Thanks,
>>> > > > > > S.
>>> > > > > >
>>> > > > > > ср, 17 мар. 2021 г. в 16:01, Pavel Tupitsyn <
>>> ptupit...@apache.org
>>> > >:
>>> > > > > >
>>> > > > > > > Slava,
>>> > > > > > >
>>> > > > > > > Your suggestion is to keep things as is and discard the IEP,
>>> > right?
>>> > > > > > >
>>> > > > > > > >  this can lead to significant overhead
>>> > > > > > > Yes, there is some overhead, but the cost of accidentally
>>> > starving
>>> > > > the
>>> > > > > > > striped pool is worse,
>>> > > > > > > not to mention the deadlocks.
>>> > > > > > >
>>> > > > > > > I believe that we should favor correctness over performance
>>> in
>>> > any
>>> > > > > case.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > On Wed, Mar 17, 2021 at 3:34 PM Вячеслав Коптилин <
>>> > > > > > > slava.kopti...@gmail.com>
>>> > > > > > > wrote:
>>> > > > > > >
>>> > > > > > > > Well, the specified method already exists :)
>>> > > > > > > >
>>> > > > > > > >     /**
>>> > > > > > > >      * Registers listener closure to be asynchronously
>>> notified
>>> > > > > > whenever
>>> > > > > > > > future completes.
>>> > > > > > > >      * Closure will be processed in specified executor.
>>> > > > > > > >      *
>>> > > > > > > >      * @param lsnr Listener closure to register. Cannot be
>>> > {@code
>>> > > > > > null}.
>>> > > > > > > >      * @param exec Executor to run listener. Cannot be
>>> {@code
>>> > > > null}.
>>> > > > > > > >      */
>>> > > > > > > >     public void listenAsync(IgniteInClosure<? super
>>> > > > IgniteFuture<V>>
>>> > > > > > > lsnr,
>>> > > > > > > > Executor exec);
>>> > > > > > > >
>>> > > > > > > > Thanks,
>>> > > > > > > > S.
>>> > > > > > > >
>>> > > > > > > > ср, 17 мар. 2021 г. в 15:25, Вячеслав Коптилин <
>>> > > > > > slava.kopti...@gmail.com
>>> > > > > > > >:
>>> > > > > > > >
>>> > > > > > > > > Hello Pavel,
>>> > > > > > > > >
>>> > > > > > > > > I took a look at your IEP and pool request. I have the
>>> > > following
>>> > > > > > > > concerns.
>>> > > > > > > > > First of all, this change breaks the contract of
>>> > > > > > > > IgniteFuture#listen(lsnr)
>>> > > > > > > > >
>>> > > > > > > > >     /**
>>> > > > > > > > >      * Registers listener closure to be asynchronously
>>> > notified
>>> > > > > > > whenever
>>> > > > > > > > > future completes.
>>> > > > > > > > >      * Closure will be processed in thread that completes
>>> > this
>>> > > > > future
>>> > > > > > > or
>>> > > > > > > > > (if future already
>>> > > > > > > > >      * completed) immediately in current thread.
>>> > > > > > > > >      *
>>> > > > > > > > >      * @param lsnr Listener closure to register. Cannot
>>> be
>>> > > {@code
>>> > > > > > > null}.
>>> > > > > > > > >      */
>>> > > > > > > > >     public void listen(IgniteInClosure<? super
>>> > IgniteFuture<V>>
>>> > > > > > lsnr);
>>> > > > > > > > >
>>> > > > > > > > >     In your pull request, the listener is always called
>>> from
>>> > a
>>> > > > > > > specified
>>> > > > > > > > > thread pool (which is fork-join by default)
>>> > > > > > > > >     even though the future is already completed at the
>>> moment
>>> > > the
>>> > > > > > > listen
>>> > > > > > > > > method is called.
>>> > > > > > > > >     In my opinion, this can lead to significant overhead
>>> -
>>> > > > > submission
>>> > > > > > > > > requires acquiring a lock and notifying a pool thread.
>>> > > > > > > > >
>>> > > > > > > > >     It seems to me, that we should not change the current
>>> > > > behavior.
>>> > > > > > > > > However, thread pool executor can be added as an optional
>>> > > > parameter
>>> > > > > > of
>>> > > > > > > > > listen() method as follows:
>>> > > > > > > > >
>>> > > > > > > > >         public void listen(IgniteInClosure<? super
>>> > > > IgniteFuture<V>>
>>> > > > > > > lsnr,
>>> > > > > > > > > Executor exec);
>>> > > > > > > > >
>>> > > > > > > > > Thanks,
>>> > > > > > > > > S.
>>> > > > > > > > >
>>> > > > > > > > > пн, 15 мар. 2021 г. в 19:24, Pavel Tupitsyn <
>>> > > > ptupit...@apache.org
>>> > > > > >:
>>> > > > > > > > >
>>> > > > > > > > >> Igniters,
>>> > > > > > > > >>
>>> > > > > > > > >> Please review the IEP [1] and let me know your thoughts.
>>> > > > > > > > >>
>>> > > > > > > > >> [1]
>>> > > > > > > > >>
>>> > > > > > > > >>
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-70%3A+Async+Continuation+Executor
>>> > > > > > > > >>
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>>
>>> --
>>> <http://www.trimble.com/>
>>> Raymond Wilson
>>> Solution Architect, Civil Construction Software Systems (CCSS)
>>> 11 Birmingham Drive | Christchurch, New Zealand
>>> raymond_wil...@trimble.com
>>>
>>> <
>>> https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch
>>> >
>>>
>>

Reply via email to