Pavel, The change still looks a bit risky to me, because the default executor is set to commonPool and can alter existing deployments in unpredictable ways, if commonPool is heavily used for other purposes.
Runnable::run usage is not obvious as well and should be properly documented as a way to return to old behavior. I'm not sure we need it in 2.X for the reasons above - we already have ways to control a listener's behavior - it's a matter of good documentation to me. чт, 25 мар. 2021 г. в 15:33, Pavel Tupitsyn <ptupit...@apache.org>: > Alexei, > > > Sometimes it's more desirable to execute the listener in the same thread > > It's up to the user to decide. > > Yes, we give users a choice to configure the executor as Runnable::run and > use the same thread if needed. > However, it should not be the default behavior as explained above (bad > usability, unexpected major issues). > > On Thu, Mar 25, 2021 at 3:06 PM Alexei Scherbakov < > alexey.scherbak...@gmail.com> wrote: > > > Pavel, > > > > While I understand the issue and overall agree with you, I'm against the > > execution of listeners in separate thread pool by default. > > > > Sometimes it's more desirable to execute the listener in the same thread, > > for example if it's some lightweight closure. > > > > It's up to the user to decide. > > > > I think the IgniteFuture.listen method should be properly documented to > > avoid execution of cluster operations or any other potentially blocking > > operations inside the listener. > > > > Otherwise listenAsync should be used. > > > > > > > > чт, 25 мар. 2021 г. в 14:04, Pavel Tupitsyn <ptupit...@apache.org>: > > > > > Stan, > > > > > > We have thread pools dedicated for specific purposes, like cache > > (striped), > > > compute (pub), query, etc > > > As I understand it, the reason here is to limit the number of threads > > > dedicated to a given subsystem. > > > For example, Compute may be overloaded with work, but Cache and > Discovery > > > will keep going. > > > > > > This is different from async continuations, which are arbitrary user > > code. > > > So what is the benefit of having a new user pool for arbitrary code > that > > is > > > probably not related to Ignite at all? > > > > > > On Thu, Mar 25, 2021 at 1:31 PM <stanlukya...@gmail.com> wrote: > > > > > > > Pavel, > > > > > > > > This is a great work, fully agree with the overall idea and approach. > > > > > > > > However, I have some reservations about the API. We sure do have a > lot > > of > > > > async stuff in the system, and I would suggest to stick to the usual > > > design > > > > - create a separate thread pool, add a single property to control the > > > size > > > > of the pool. > > > > Alternatively, we may consider using public pool for that. May I ask > if > > > > there is an example use case which doesn’t work with public pool? > > > > > > > > For .NET, agree that we should follow the rules and APIs of the > > platform, > > > > so the behavior might slightly differ. > > > > > > > > Thanks, > > > > Stan > > > > > > > > > On 24 Mar 2021, at 09:52, Pavel Tupitsyn <ptupit...@apache.org> > > wrote: > > > > > > > > > > Igniters, since there are no more comments and/or review feedback, > > > > > I'm going to merge the changes by the end of the week. > > > > > > > > > >> On Mon, Mar 22, 2021 at 10:37 PM Pavel Tupitsyn < > > ptupit...@apache.org > > > > > > > > >> wrote: > > > > >> > > > > >> Ready for review: > > > > >> https://github.com/apache/ignite/pull/8870 > > > > >> > > > > >> On Sun, Mar 21, 2021 at 8:09 PM Pavel Tupitsyn < > > ptupit...@apache.org> > > > > >> wrote: > > > > >> > > > > >>> Simple benchmark added - see JmhCacheAsyncListenBenchmark in the > > PR. > > > > >>> There is a 6-8% drop (1 client, 2 servers, 1 machine, int > key/val). > > > > >>> I expect this difference to become barely observable on > real-world > > > > >>> workloads. > > > > >>> > > > > >>> On Thu, Mar 18, 2021 at 12:35 PM Pavel Tupitsyn < > > > ptupit...@apache.org> > > > > >>> wrote: > > > > >>> > > > > >>>> Denis, > > > > >>>> > > > > >>>> For a reproducer, please see > > CacheAsyncContinuationExecutorTest.java > > > > in > > > > >>>> the linked PoC [1] > > > > >>>> > > > > >>>> [1] > > > > >>>> > > > > > > > > > > https://github.com/apache/ignite/pull/8870/files#diff-c788c12013622093df07d8f628a6e8c5fb0c15007f0787f2d459dbb3e377fc5aR54 > > > > >>>> > > > > >>>> On Thu, Mar 18, 2021 at 1:58 AM Raymond Wilson < > > > > >>>> raymond_wil...@trimble.com> wrote: > > > > >>>> > > > > >>>>> We implemented the ContinueWith() suggestion from Pavel, and it > > > works > > > > >>>>> well > > > > >>>>> so far in testing, though we do not have data to support if > there > > > is > > > > or > > > > >>>>> is > > > > >>>>> not a performance penalty in our use case.. > > > > >>>>> > > > > >>>>> To lend another vote to the 'Don't do continuations on the > > striped > > > > >>>>> thread > > > > >>>>> pool' line of thinking: Deadlocking is an issue as is > starvation. > > > In > > > > >>>>> some > > > > >>>>> ways starvation is more insidious because by the time things > stop > > > > >>>>> working > > > > >>>>> the cause and effect distance may be large. > > > > >>>>> > > > > >>>>> I appreciate the documentation does make statements about not > > > > performing > > > > >>>>> cache operations in a continuation due to deadlock > possibilities, > > > but > > > > >>>>> that > > > > >>>>> statement does not reveal why this is an issue. It is less a > case > > > of > > > > a > > > > >>>>> async cache operation being followed by some other cache > > operation > > > > (an > > > > >>>>> immediate issue), and more a general case of the continuation > of > > > > >>>>> application logic using a striped pool thread in a way that > might > > > > mean > > > > >>>>> that > > > > >>>>> thread is never given back - it's now just a piece of the > > > application > > > > >>>>> infrastructure until some other application activity schedules > > away > > > > from > > > > >>>>> that thread (eg: by ContinueWith or some other async operation > in > > > the > > > > >>>>> application code that releases the thread). To be fair, beyond > > > > >>>>> structures > > > > >>>>> like ContinueWith(), it is not obvious how that continuation > > thread > > > > >>>>> should > > > > >>>>> be handed back. This will be the same behaviour for dedicated > > > > >>>>> continuation > > > > >>>>> pools, but with far less risk in the absence of ContinueWith() > > > > >>>>> constructs. > > > > >>>>> > > > > >>>>> In the .Net world this is becoming more of an issue as fewer > .Net > > > use > > > > >>>>> cases > > > > >>>>> outside of UI bother with synchronization contexts by default. > > > > >>>>> > > > > >>>>> Raymond. > > > > >>>>> > > > > >>>>> > > > > >>>>> On Thu, Mar 18, 2021 at 9:56 AM Valentin Kulichenko < > > > > >>>>> valentin.kuliche...@gmail.com> wrote: > > > > >>>>> > > > > >>>>>> Hi Denis, > > > > >>>>>> > > > > >>>>>> I think Pavel's main point is that behavior is unpredictable. > > For > > > > >>>>> example, > > > > >>>>>> AFAIK, putAsync can be executed in the main thread instead of > > the > > > > >>>>> striped > > > > >>>>>> pool thread if the operation is local. The listener can also > be > > > > >>>>> executed in > > > > >>>>>> the main thread - this happens if the future is completed > prior > > to > > > > >>>>> listener > > > > >>>>>> invocation (this is actually quite possible in the unit test > > > > >>>>> environment > > > > >>>>>> causing the test to pass). Finally, I'm pretty sure there are > > many > > > > >>>>> cases > > > > >>>>>> when a deadlock does not occur right away, but instead it will > > > > reveal > > > > >>>>>> itself under high load due to thread starvation. The latter > type > > > of > > > > >>>>> issues > > > > >>>>>> is the most dangerous because they are often reproduced only > in > > > > >>>>> production. > > > > >>>>>> Finally, there are performance considerations as well - cache > > > > >>>>> operations > > > > >>>>>> and listeners share the same fixed-sized pool which can have > > > > negative > > > > >>>>>> effects. > > > > >>>>>> > > > > >>>>>> I'm OK with the change. Although, it might be better to > > introduce > > > a > > > > >>>>> new > > > > >>>>>> fixed-sized pool instead of ForkJoinPool for listeners, simply > > > > >>>>> because this > > > > >>>>>> is the approach taken throughout the project. But this is up > to > > a > > > > >>>>> debate. > > > > >>>>>> > > > > >>>>>> -Val > > > > >>>>>> > > > > >>>>>> On Wed, Mar 17, 2021 at 11:31 AM Denis Garus < > > garus....@gmail.com > > > > > > > > >>>>> wrote: > > > > >>>>>> > > > > >>>>>>> Pavel, > > > > >>>>>>> I tried this: > > > > >>>>>>> > > > > >>>>>>> @Test > > > > >>>>>>> public void test() throws Exception { > > > > >>>>>>> IgniteCache<Integer, String> cache = > > > > >>>>>>> startGrid().getOrCreateCache("test_cache"); > > > > >>>>>>> > > > > >>>>>>> cache.putAsync(1, "one").listen(f -> cache.replace(1, > > "two")); > > > > >>>>>>> > > > > >>>>>>> assertEquals("two", cache.get(1)); > > > > >>>>>>> } > > > > >>>>>>> > > > > >>>>>>> and this test is green. > > > > >>>>>>> I believe that an user can make listener that leads to > > deadlock, > > > > but > > > > >>>>>>> the example in the IEP does not reflect this. > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> ср, 17 мар. 2021 г. в 17:36, Вячеслав Коптилин < > > > > >>>>> slava.kopti...@gmail.com > > > > >>>>>>> : > > > > >>>>>>> > > > > >>>>>>>> Hi Pavel, > > > > >>>>>>>> > > > > >>>>>>>>> Not a good excuse really. We have a usability problem, you > > have > > > > >>>>> to > > > > >>>>>>> admit > > > > >>>>>>>> it. > > > > >>>>>>>> Fair enough. I agree that this is a usability issue, but I > > have > > > > >>>>> doubts > > > > >>>>>>> that > > > > >>>>>>>> the proposed approach to overcome it is the best one. > > > > >>>>>>>> > > > > >>>>>>>>> Documentation won't help - no one is going to read the > > Javadoc > > > > >>>>> for a > > > > >>>>>>>> trivial method like putAsync > > > > >>>>>>>> That is sad... However, I don't think that this is a strong > > > > >>>>> argument > > > > >>>>>>> here. > > > > >>>>>>>> > > > > >>>>>>>> This is just my opinion. Let's see what other community > > members > > > > >>>>> have to > > > > >>>>>>>> say. > > > > >>>>>>>> > > > > >>>>>>>> Thanks, > > > > >>>>>>>> S. > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> ср, 17 мар. 2021 г. в 17:01, Pavel Tupitsyn < > > > ptupit...@apache.org > > > > >>>>>> : > > > > >>>>>>>> > > > > >>>>>>>>>> the user should use the right API > > > > >>>>>>>>> > > > > >>>>>>>>> Not a good excuse really. We have a usability problem, you > > have > > > > >>>>> to > > > > >>>>>>> admit > > > > >>>>>>>>> it. > > > > >>>>>>>>> "The brakes did not work on your car - too bad, you should > > have > > > > >>>>> known > > > > >>>>>>>> that > > > > >>>>>>>>> on Sundays only your left foot is allowed on the pedal" > > > > >>>>>>>>> > > > > >>>>>>>>> This particular use case is too intricate. > > > > >>>>>>>>> Even when you know about that, it is difficult to decide > what > > > > >>>>> can run > > > > >>>>>>> on > > > > >>>>>>>>> the striped pool, > > > > >>>>>>>>> and what can't. It is too easy to forget. > > > > >>>>>>>>> And most people don't know, even among Ignite developers. > > > > >>>>>>>>> > > > > >>>>>>>>> Documentation won't help - no one is going to read the > > Javadoc > > > > >>>>> for a > > > > >>>>>>>>> trivial method like putAsync. > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> So I propose to have a safe default. > > > > >>>>>>>>> Then document the performance tuning opportunity on [1]. > > > > >>>>>>>>> > > > > >>>>>>>>> Think about how many users abandon a product because it > > > > >>>>> mysteriously > > > > >>>>>>>>> crashes and hangs. > > > > >>>>>>>>> > > > > >>>>>>>>> [1] > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>> > > > > >>>>>>> > > > > >>>>>> > > > > >>>>> > > > > > > > > > > https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> On Wed, Mar 17, 2021 at 4:21 PM Вячеслав Коптилин < > > > > >>>>>>>>> slava.kopti...@gmail.com> > > > > >>>>>>>>> wrote: > > > > >>>>>>>>> > > > > >>>>>>>>>> Hi Pavel, > > > > >>>>>>>>>> > > > > >>>>>>>>>> Well, I think that the user should use the right API > instead > > > > >>>>> of > > > > >>>>>>>>> introducing > > > > >>>>>>>>>> uncontested overhead for everyone. > > > > >>>>>>>>>> For instance, the code that is provided by IEP can changed > > as > > > > >>>>>>> follows: > > > > >>>>>>>>>> > > > > >>>>>>>>>> IgniteFuture fut = cache.putAsync(1, 1); > > > > >>>>>>>>>> fut.listenAync(f -> { > > > > >>>>>>>>>> // Executes on Striped pool and deadlocks. > > > > >>>>>>>>>> cache.replace(1, 2); > > > > >>>>>>>>>> }, ForkJoinPool.commonPool()); > > > > >>>>>>>>>> > > > > >>>>>>>>>> Of course, it does not mean that this fact should not be > > > > >>>>> properly > > > > >>>>>>>>>> documented. > > > > >>>>>>>>>> Perhaps, I am missing something. > > > > >>>>>>>>>> > > > > >>>>>>>>>> Thanks, > > > > >>>>>>>>>> S. > > > > >>>>>>>>>> > > > > >>>>>>>>>> ср, 17 мар. 2021 г. в 16:01, Pavel Tupitsyn < > > > > >>>>> ptupit...@apache.org > > > > >>>>>>> : > > > > >>>>>>>>>> > > > > >>>>>>>>>>> Slava, > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> Your suggestion is to keep things as is and discard the > > IEP, > > > > >>>>>> right? > > > > >>>>>>>>>>> > > > > >>>>>>>>>>>> this can lead to significant overhead > > > > >>>>>>>>>>> Yes, there is some overhead, but the cost of accidentally > > > > >>>>>> starving > > > > >>>>>>>> the > > > > >>>>>>>>>>> striped pool is worse, > > > > >>>>>>>>>>> not to mention the deadlocks. > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> I believe that we should favor correctness over > performance > > > > >>>>> in > > > > >>>>>> any > > > > >>>>>>>>> case. > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> On Wed, Mar 17, 2021 at 3:34 PM Вячеслав Коптилин < > > > > >>>>>>>>>>> slava.kopti...@gmail.com> > > > > >>>>>>>>>>> wrote: > > > > >>>>>>>>>>> > > > > >>>>>>>>>>>> Well, the specified method already exists :) > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> /** > > > > >>>>>>>>>>>> * Registers listener closure to be asynchronously > > > > >>>>> notified > > > > >>>>>>>>>> whenever > > > > >>>>>>>>>>>> future completes. > > > > >>>>>>>>>>>> * Closure will be processed in specified executor. > > > > >>>>>>>>>>>> * > > > > >>>>>>>>>>>> * @param lsnr Listener closure to register. Cannot > be > > > > >>>>>> {@code > > > > >>>>>>>>>> null}. > > > > >>>>>>>>>>>> * @param exec Executor to run listener. Cannot be > > > > >>>>> {@code > > > > >>>>>>>> null}. > > > > >>>>>>>>>>>> */ > > > > >>>>>>>>>>>> public void listenAsync(IgniteInClosure<? super > > > > >>>>>>>> IgniteFuture<V>> > > > > >>>>>>>>>>> lsnr, > > > > >>>>>>>>>>>> Executor exec); > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> Thanks, > > > > >>>>>>>>>>>> S. > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>> ср, 17 мар. 2021 г. в 15:25, Вячеслав Коптилин < > > > > >>>>>>>>>> slava.kopti...@gmail.com > > > > >>>>>>>>>>>> : > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>>>> Hello Pavel, > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> I took a look at your IEP and pool request. I have the > > > > >>>>>>> following > > > > >>>>>>>>>>>> concerns. > > > > >>>>>>>>>>>>> First of all, this change breaks the contract of > > > > >>>>>>>>>>>> IgniteFuture#listen(lsnr) > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> /** > > > > >>>>>>>>>>>>> * Registers listener closure to be asynchronously > > > > >>>>>> notified > > > > >>>>>>>>>>> whenever > > > > >>>>>>>>>>>>> future completes. > > > > >>>>>>>>>>>>> * Closure will be processed in thread that > > > > >>>>> completes > > > > >>>>>> this > > > > >>>>>>>>> future > > > > >>>>>>>>>>> or > > > > >>>>>>>>>>>>> (if future already > > > > >>>>>>>>>>>>> * completed) immediately in current thread. > > > > >>>>>>>>>>>>> * > > > > >>>>>>>>>>>>> * @param lsnr Listener closure to register. Cannot > > > > >>>>> be > > > > >>>>>>> {@code > > > > >>>>>>>>>>> null}. > > > > >>>>>>>>>>>>> */ > > > > >>>>>>>>>>>>> public void listen(IgniteInClosure<? super > > > > >>>>>> IgniteFuture<V>> > > > > >>>>>>>>>> lsnr); > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> In your pull request, the listener is always called > > > > >>>>> from > > > > >>>>>> a > > > > >>>>>>>>>>> specified > > > > >>>>>>>>>>>>> thread pool (which is fork-join by default) > > > > >>>>>>>>>>>>> even though the future is already completed at the > > > > >>>>> moment > > > > >>>>>>> the > > > > >>>>>>>>>>> listen > > > > >>>>>>>>>>>>> method is called. > > > > >>>>>>>>>>>>> In my opinion, this can lead to significant > > > > >>>>> overhead - > > > > >>>>>>>>> submission > > > > >>>>>>>>>>>>> requires acquiring a lock and notifying a pool thread. > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> It seems to me, that we should not change the > > > > >>>>> current > > > > >>>>>>>> behavior. > > > > >>>>>>>>>>>>> However, thread pool executor can be added as an > > > > >>>>> optional > > > > >>>>>>>> parameter > > > > >>>>>>>>>> of > > > > >>>>>>>>>>>>> listen() method as follows: > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> public void listen(IgniteInClosure<? super > > > > >>>>>>>> IgniteFuture<V>> > > > > >>>>>>>>>>> lsnr, > > > > >>>>>>>>>>>>> Executor exec); > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> Thanks, > > > > >>>>>>>>>>>>> S. > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>> пн, 15 мар. 2021 г. в 19:24, Pavel Tupitsyn < > > > > >>>>>>>> ptupit...@apache.org > > > > >>>>>>>>>> : > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> Igniters, > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> Please review the IEP [1] and let me know your > > > > >>>>> thoughts. > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> [1] > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>> > > > > >>>>>>> > > > > >>>>>> > > > > >>>>> > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-70%3A+Async+Continuation+Executor > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>> > > > > >>>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>> > > > > >>>>>>> > > > > >>>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> -- > > > > >>>>> <http://www.trimble.com/> > > > > >>>>> Raymond Wilson > > > > >>>>> Solution Architect, Civil Construction Software Systems (CCSS) > > > > >>>>> 11 Birmingham Drive | Christchurch, New Zealand > > > > >>>>> raymond_wil...@trimble.com > > > > >>>>> > > > > >>>>> < > > > > >>>>> > > > > > > > > > > https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch > > > > >>>>>> > > > > >>>>> > > > > >>>> > > > > > > > > > > > > > -- > > > > Best regards, > > Alexei Scherbakov > > > -- Best regards, Alexei Scherbakov