Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Anton Vinogradov
Vova, I'll refactor IEP-7 [1], most likely merge it with IEP-5 [2], and let you know when the overall design is ready and clear :) [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-7%3A+Ignite+internal+problems+detection#IEP-7:Igniteinternalproblemsdetection-SystemThreadRegestry . [2] https://c

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Vladimir Ozerov
It would be nice to see the whole design first before going into low-level details. Without it we are jumping from topic to topic. Were the list of events and the reactions to these events discussed previously? At this point it is not clear why nodes should be forcefully stopped without any alternative. F

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Anton Vinogradov
According to [1], the reasons are: - IgniteOutOfMemoryException - Persistence errors - ExchangeWorker exits with error [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-7%3A+Ignite+internal+problems+detection On Wed, Nov 15, 2017 at 2:24 PM, Vladimir Ozerov wrote: > I am not quite sure I unders

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Vladimir Ozerov
I am not quite sure I understand how tasks are split. How can we discuss graceful shutdown without discussing the reasons for this shutdown? What leads to it? On Wed, Nov 15, 2017 at 2:10 PM, Anton Vinogradov wrote: > Vova, > > Currently we have a lot of IEPs to improve grid monitoring and behavior. > >

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Anton Vinogradov
Vova, Currently we have a lot of IEPs to improve grid monitoring and behavior. Let's split the tasks into: 1) Graceful shutdown. In this case we'd like to give the user the ability to do something; LifecycleBean is what we are looking for, thanks for the tips! But we have to keep the shutdown reason somewhere. In case y

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Vladimir Ozerov
AFAIK the idea was not only to shut down the node, but also to give the user (e.g. an administrator) the ability to observe the problem from the outside, e.g. through JMX. E.g. if we detect a Java-level deadlock, it doesn't mean that the only possible solution is node shutdown. In addition it could be no-op, e.g

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Anton Vinogradov
Vova, Could you point to the metric you're talking about? On Wed, Nov 15, 2017 at 1:06 PM, Andrey Kuznetsov wrote: > Vladimir, > > Could you please clarify what local metrics are? Should I extend the Ignite > interface by adding something similar to dataRegionMetrics() or is there > some universal mech

Re: Add emergency node closing handler to public Ignite API

2017-11-15 Thread Andrey Kuznetsov
Vladimir, Could you please clarify what local metrics are? Should I extend the Ignite interface by adding something similar to dataRegionMetrics() or is there some universal mechanism to handle metrics? 2017-11-15 8:30 GMT+03:00 Vladimir Ozerov : > > This information should be available through local

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Vladimir Ozerov
This information should be available through local metrics, so that it is accessible from the Ignite instance. On Tue, Nov 14, 2017 at 22:37, Valentin Kulichenko < valentin.kuliche...@gmail.com> wrote: > Andrey, > > Then let's add an API to get this information. There is no need to add another > callback as we

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Valentin Kulichenko
Andrey, Then let's add an API to get this information. There is no need to add another callback as we already have one. -Val On Tue, Nov 14, 2017 at 11:34 AM, Andrey Kuznetsov wrote: > Vladimir, the Ignite instance won't tell me whether a deadlock occurred or some > critical thread has died. > > 14 Nov

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Andrey Kuznetsov
Vladimir, the Ignite instance won't tell me whether a deadlock occurred or some critical thread has died. On Nov 14, 2017 at 22:28, "Vladimir Ozerov" wrote: You can get this info from the injected Ignite instance.

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Vladimir Ozerov
You can get this info from the injected Ignite instance. On Tue, Nov 14, 2017 at 10:13 PM, Andrey Kuznetsov wrote: > Lifecycle beans are ok, but they do not accept any info on the Reason in > case of an emergency node stop. > > 2017-11-14 21:16 GMT+03:00 Valentin Kulichenko < > valentin.kuliche...@gm

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Andrey Kuznetsov
Lifecycle beans are ok, but they do not accept any info on the Reason in case of an emergency node stop. 2017-11-14 21:16 GMT+03:00 Valentin Kulichenko < valentin.kuliche...@gmail.com>: > Anton, > > I agree with Vova - we already have a lifecycle bean. Why do we need anything > on top of that? > > -

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Valentin Kulichenko
Anton, I agree with Vova - we already have a lifecycle bean. Why do we need anything on top of that? -Val On Tue, Nov 14, 2017 at 10:05 AM, Anton Vinogradov wrote: > Vova, > > We should give the user the ability to be notified when a node decides to > stop itself. > Only the user knows how he wants t

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Anton Vinogradov
Vova, We should give the user the ability to be notified when a node decides to stop itself. Only the user knows how he wants to be notified, so we should provide the ability to register a custom callback (e.g. send an SMS or call a REST service). This will cover cases when a node stops gracefully. Please, see Semen's

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Vladimir Ozerov
Can you explain what kind of logic could be placed there? And why do we need another configuration property and/or interface? We already have LifecycleBean, where the Ignite instance can be injected, so the user is already able to perform anything there. On Tue, Nov 14, 2017 at 7:46 PM, Anton Vinogradov

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Anton Vinogradov
Vova, This is not about "kill -9" or OOM; it is about the case when a node detects something and decides to stop itself (e.g. persistence errors, IgniteOutOfMemoryException, ExchangeWorker died). Sure, we can't handle OOM or 100% CPU utilization by GC that way, but we can handle some logical problems.

Re: Add emergency node closing handler to public Ignite API

2017-11-14 Thread Vladimir Ozerov
I am not sure this makes sense. First, in the general case we do not have access to Java. E.g. in case of a very long GC pause all Java threads are stuck and it is impossible to invoke anything. Second, some other conditions may be unrecoverable, such as OOME, where there is no guarantee that any operati

Add emergency node closing handler to public Ignite API

2017-11-14 Thread Andrey Kuznetsov
Hi Igniters! When a node detects a critical error, e.g. OOME, deadlock, etc., it should invoke a user-defined callback and then attempt to close itself gracefully. To make this possible, we need to enhance the Ignite interface by adding something like Ignite.onEmergencyClose(SomeClosure).
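
To make the proposal concrete, here is a minimal, self-contained sketch of how such a callback might behave. Everything in it is an assumption for illustration: the Node class is a stand-in and not the real Ignite interface, the Reason enum and emergencyClose() are hypothetical names, and a plain java.util.function.Consumer takes the place of the unspecified SomeClosure. The thread does not settle on any of this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch only; none of these types exist in Ignite. */
class EmergencyCloseSketch {
    /** Hypothetical reasons, taken from the examples named in the thread. */
    enum Reason { OUT_OF_MEMORY, DEADLOCK, CRITICAL_THREAD_DIED }

    /** Stand-in for a node; NOT the real Ignite interface. */
    static class Node {
        private final List<Consumer<Reason>> handlers = new ArrayList<>();
        private boolean closed;

        /** Registers a user-defined callback (the proposed onEmergencyClose). */
        void onEmergencyClose(Consumer<Reason> handler) {
            handlers.add(handler);
        }

        /** Invoked by the node itself once it has detected a critical error. */
        void emergencyClose(Reason reason) {
            for (Consumer<Reason> handler : handlers) {
                try {
                    handler.accept(reason);
                } catch (RuntimeException e) {
                    // A failing user callback must not prevent the shutdown.
                }
            }
            closed = true; // a real node would release its resources here
        }

        boolean isClosed() {
            return closed;
        }
    }
}
```

Passing the reason to the callback is the one thing this sketch adds over a plain lifecycle listener; a user could register, for example, node.onEmergencyClose(reason -> System.err.println("Node stopping: " + reason)).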