Dmitriy,

I think you and the other participants in this discussion are talking about
different cases.

Maybe it would be useful to look at specific cases and discuss each of them
separately?

I looked at the IEP page and see the following:

```
File IO errors. Usually IOException's threw by read/write operations on file 
system. The following subsystems should be considered as critical:
* WAL
* Page store
* Meta store
* Binary meta store
```

Suppose we run out of disk space on some node, and everything else is fine.
Should we do `System.exit(-1);` in that case?

Personally, I fully agree with Nick Podrash:

"I can tell you as a user that if any library I was using in my application 
called System.exit without my consent would result in a lot of frustration."

Also, do you have any examples of other products that do `System.exit(-1);`
when they run into trouble?
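
To make the disk-space case concrete, here is a minimal sketch of the alternative I have in mind: the library reports the critical I/O failure to a callback, and the user's code decides what happens next. All names here (`CriticalFailureListener`, `WalWriter`, the `diskFull` flag) are hypothetical and only illustrate the idea; this is not Ignite's API.

```java
import java.io.IOException;

/** Hypothetical callback: the embedding application decides what a critical failure means. */
interface CriticalFailureListener {
    /** @return true if the library should stop its own component, false to keep it running. */
    boolean onCriticalFailure(String subsystem, Throwable cause);
}

/** Hypothetical stand-in for a critical storage subsystem such as the WAL. */
final class WalWriter {
    private final CriticalFailureListener listener;
    private boolean stopped;

    WalWriter(CriticalFailureListener listener) {
        this.listener = listener;
    }

    /** Simulates a write; the diskFull flag forces the failure for illustration. */
    void write(byte[] record, boolean diskFull) {
        if (stopped)
            throw new IllegalStateException("WAL writer is stopped");

        try {
            if (diskFull)
                throw new IOException("No space left on device");
            // A real implementation would append the record to a file here.
        }
        catch (IOException e) {
            // Report the failure instead of calling System.exit(-1),
            // so the user's process stays under the user's control.
            if (listener.onCriticalFailure("WAL", e))
                stopped = true;
        }
    }

    boolean isStopped() {
        return stopped;
    }
}
```

With something like this, a user who does want the process to die can still call `System.exit` from the listener, but that is the user's explicit choice.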

On Tue, 13/03/2018 at 19:07 -0400, Dmitriy Setrakyan wrote:
> On Tue, Mar 13, 2018 at 6:55 PM, Dmitry Pavlov <dpavlov....@gmail.com>
> wrote:
> 
> > What do you think if stop is default for all cases?
> > 
> > Kill is configurable.
> > 
> > We can consider enforce sockets close for 'stop'. This will allow to ignore
> > hang node by rest of the cluster.
> > 
> 
> Dmitriy, I see that you cannot come to terms with stopping a process that
> was not started by Ignite. However, in majority of the deployments, users
> would prefer that you would "kill" the process instead of leaving it
> running in a "frozen" state. Frozen state is non-deterministic and it is
> impossible to create a recovery for it. Killing the process is very
> deterministic and can be recovered by restarting it in most cases.
> 
> "stop" does not fix the problem we are trying to solve. The whole point is
> to prevent frozen state, and "stop" without "kill" does not prevent it. I
> am OK if "stop+kill" is the default behavior, which means that we try a
> graceful shutdown and then always kill the process anyway.
> 
> I think we should have the following configurable options:
> - "stop+kill" (default)
> - "kill"
> - "stop"
> - "stop+restart" (if stop fails, we should kill regardless)
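
For reference, the four options listed above could be modeled roughly as follows. This is only a sketch under my own assumptions: the names `FailurePolicy` and `FailurePolicyRunner` are hypothetical, "stop fails" is interpreted as "the graceful stop threw or did not finish within a timeout", and the stop, kill and restart actions are passed in as plain `Runnable`s so the sketch does not hard-code any process-level call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical names for the four options from the list above. */
enum FailurePolicy { STOP_KILL, KILL, STOP, STOP_RESTART }

final class FailurePolicyRunner {
    private final Runnable stop;
    private final Runnable kill;
    private final Runnable restart;
    private final long stopTimeoutMs;

    FailurePolicyRunner(Runnable stop, Runnable kill, Runnable restart, long stopTimeoutMs) {
        this.stop = stop;
        this.kill = kill;
        this.restart = restart;
        this.stopTimeoutMs = stopTimeoutMs;
    }

    void apply(FailurePolicy policy) {
        switch (policy) {
            case KILL:
                kill.run();
                break;
            case STOP:
                tryStop();
                break;
            case STOP_KILL:
                // Try a graceful shutdown, then always kill anyway.
                tryStop();
                kill.run();
                break;
            case STOP_RESTART:
                // If the stop fails, kill regardless.
                if (tryStop())
                    restart.run();
                else
                    kill.run();
                break;
        }
    }

    /** @return true if the graceful stop finished within the timeout. */
    private boolean tryStop() {
        // Daemon thread, so a hung stop cannot keep the JVM alive on its own.
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "graceful-stop");
            t.setDaemon(true);
            return t;
        });

        try {
            Future<?> f = exec.submit(stop);
            f.get(stopTimeoutMs, TimeUnit.MILLISECONDS);
            return true;
        }
        catch (Exception e) {
            // Timed out, interrupted, or the stop action itself threw.
            return false;
        }
        finally {
            exec.shutdownNow();
        }
    }
}
```

Note that only the `kill` action needs anything like `System.exit` or `Runtime.getRuntime().halt(...)`, and whoever wires up the runner chooses what it does, which is exactly the part I would like to stay in the user's hands.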
