Sergey,

Thank you for your answer. I am not happy with the proposed approach, but things were never easy. Unfortunately, I cannot suggest a clearly better approach so far, so I should trust your vision.
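
To make the lifecycle from the suggested design in the quoted thread concrete, here is a minimal sketch of the "decide the mode once at startup" rule. Everything in it is an assumption made for illustration: the class and method names are hypothetical and not taken from the Ignite code base or from the draft PR, and an in-memory map stands in for the Maintenance Records that the proposal stores on disk.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: none of these names are taken from the Ignite code base. */
final class MaintenanceModeSketch {
    /** How a node comes up after a (re)start. */
    enum StartupMode { NORMAL, MAINTENANCE }

    /** In-memory stand-in for the Maintenance Records that the proposal keeps on disk. */
    static final class MaintenanceRecords {
        private final Map<String, String> records = new LinkedHashMap<>();

        /** Registers a record, e.g. when the storage component detects possible corruption. */
        void register(String id, String description) {
            records.put(id, description);
        }

        /** Removes a record once the problem is fixed; returns false if there was no such record. */
        boolean remove(String id) {
            return records.remove(id) != null;
        }

        /** Read-only view, e.g. for a command that lists the reasons for maintenance. */
        Map<String, String> view() {
            return Collections.unmodifiableMap(records);
        }
    }

    /** The mode is decided once per startup, so entering or leaving MM needs a restart. */
    static StartupMode modeOnStartup(MaintenanceRecords records) {
        return records.view().isEmpty() ? StartupMode.NORMAL : StartupMode.MAINTENANCE;
    }

    public static void main(String[] args) {
        MaintenanceRecords records = new MaintenanceRecords();

        // Step 2 of the example: possible corruption is detected and a record is created.
        records.register("corrupted-pds", "Data files of a cache may be corrupted");

        // Step 3: the next startup finds the record and enters Maintenance Mode.
        System.out.println(modeOnStartup(records));

        // Step 4: the user cleans the PDS and removes the record.
        records.remove("corrupted-pds");

        // Step 5: the following startup proceeds to normal operations.
        System.out.println(modeOnStartup(records));
    }
}
```

Running `main` prints `MAINTENANCE` and then `NORMAL`. The sketch keeps the mode decision in a single function that only reads the records, which is one possible way to avoid scattering a "maintenance" flag across components.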

2020-09-22 10:29 GMT+03:00, Sergey Chugunov <sergey.chugu...@gmail.com>:
> Ivan,
>
> Checkpointer in Maintenance Mode is started and allows normal operations, as
> it may be needed for defragmentation and possibly other cases.
>
> Discovery is started with a special implementation of SPI that doesn't make
> attempts to seek and/or connect to the rest of the cluster. From that
> perspective a node in MM is totally isolated.
>
> Communication is started as usual, but I believe it doesn't matter: with this
> discovery no other nodes are observed in the topology, so connection attempts
> should not happen. But it may make sense to implement an isolated version of
> communication SPI as well to have a 100% guarantee that no communication with
> other nodes will happen.
>
> It is important to note that GridRestProcessor is started normally, as we
> need it to connect to the node via control utility.
>
> On Mon, Sep 21, 2020 at 7:04 PM Ivan Pavlukhin <vololo...@gmail.com> wrote:
>
>> Sergey,
>>
>> > From the code complexity perspective I'm trying to design the feature in
>> > such a way that all maintenance code is as encapsulated as possible and
>> > avoids massive interventions into main workflows of components.
>>
>> Could you please briefly tell what means you use to achieve
>> encapsulation? Are Discovery, Communication, Checkpointer and other
>> components started in a maintenance mode in the current design?
>>
>> 2020-09-21 15:19 GMT+03:00, Nikolay Izhikov <nizhi...@apache.org>:
>> > Hello, Sergey.
>> >
>> >> At the moment I'm aware about two use cases for this feature: corrupted
>> >> PDS cleanup and defragmentation.
>> >
>> > AFAIU, there is a third use case for this mode.
>> >
>> > Changing the encryption master key in case the node was down during a
>> > cluster master key change.
>> > In this case, the node can't join the cluster, because its master key
>> > differs from the cluster's.
>> > To recover the node, Ignite should locally change the master key before join.
>> >
>> > Please take a look at the source code [1]
>> >
>> > [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/managers/encryption/GridEncryptionManager.java#L710
>> >
>> >> On Sep 21, 2020, at 14:37, Sergey Chugunov <sergey.chugu...@gmail.com>
>> >> wrote:
>> >>
>> >> Ivan,
>> >>
>> >> Sorry for some confusion, MM indeed is not a normal mode. What I was trying
>> >> to say is that when in MM the node still starts and allows the user to
>> >> perform actions with it, like sending commands via control utility/JMX APIs
>> >> or reading metrics.
>> >>
>> >> This is the key point: although the node is not in the cluster, it is
>> >> still alive, can be monitored and supports management to do maintenance.
>> >>
>> >> From the code complexity perspective I'm trying to design the feature in
>> >> such a way that all maintenance code is as encapsulated as possible and
>> >> avoids massive interventions into main workflows of components.
>> >> At the moment I'm aware of two use cases for this feature: corrupted PDS
>> >> cleanup and defragmentation. As far as I know it won't bring too much
>> >> complexity in both cases.
>> >>
>> >> I cannot say for other components, but I believe it will be possible to
>> >> integrate the MM feature into their workflow as well with a reasonable
>> >> amount of refactoring.
>> >>
>> >> Does it make sense to you?
>> >>
>> >> On Sun, Sep 6, 2020 at 8:08 AM Ivan Pavlukhin <vololo...@gmail.com>
>> >> wrote:
>> >>
>> >>> Sergey,
>> >>>
>> >>> Thank you for your answer!
>> >>>
>> >>> It might be that I am looking at the subject from a different angle.
>> >>>
>> >>>> I think of a node in MM as an almost normal one
>> >>> I cannot think of such a mode as a normal one, because it apparently
>> >>> does not perform usual cluster node functions.
>> >>> It is not a part of a cluster, cache data is not available, Discovery
>> >>> and Communication are not needed.
>> >>>
>> >>> I fear that with the "node started in a special mode" approach we will
>> >>> get an additional flag in the code, making the code more complex and
>> >>> fragile. Shouldn't I worry about it?
>> >>>
>> >>> 2020-09-02 10:45 GMT+03:00, Sergey Chugunov <sergey.chugu...@gmail.com>:
>> >>>> Vladislav, Ivan,
>> >>>>
>> >>>> Thank you for your questions and suggestions. Let me answer them.
>> >>>>
>> >>>> Vladislav,
>> >>>>
>> >>>> If I understood you correctly, you're talking about a node performing
>> >>>> some automatic actions to fix the problem and then joining the cluster
>> >>>> as usual.
>> >>>>
>> >>>> However the original ticket [1] where we faced the need for Maintenance
>> >>>> Mode is about exactly the opposite: avoid doing automatic actions and
>> >>>> give a user the ability to decide what to do.
>> >>>>
>> >>>> Also the idea of Maintenance Mode is that the node is able to accept
>> >>>> commands, expose metrics and so on, thus we need all components to be
>> >>>> initialized (some of them may be partially initialized due to their own
>> >>>> maintenance).
>> >>>> To achieve that we need to go through a full cycle of node
>> >>>> initialization including discovery initialization. When discovery is
>> >>>> initialized (in special isolated mode) I don't think it is easy to
>> >>>> switch back to normal operations without a restart.
>> >>>>
>> >>>> Ivan,
>> >>>>
>> >>>> I think of a node in MM as an almost normal one (maybe with some
>> >>>> components skipping some steps of their initialization). Commands are
>> >>>> accepted, appropriate metrics are exposed e.g. through JMX API and so on.
>> >>>>
>> >>>> So as I see it we'll have special commands for control.{sh|bat} CLI
>> >>>> allowing the user to see reasons why the node switched to maintenance
>> >>>> mode and/or trigger actions to fix the problem (I'm still thinking about
>> >>>> the proper design of these actions though).
>> >>>>
>> >>>> Of course the user should also be able to fix the problem manually, e.g.
>> >>>> by manually deleting corrupted PDS files when the node is down. Ideally
>> >>>> Maintenance Mode should be smart enough to figure that out and switch to
>> >>>> normal operations without a restart, but I'm not sure if it is possible
>> >>>> without invasive changes of our components' lifecycle.
>> >>>> So I believe this model (node truly started in Maintenance Mode and new
>> >>>> commands in control.{sh|bat}) is a good fit for our current APIs and
>> >>>> ways to interact with the node.
>> >>>>
>> >>>> Does it sound reasonable to you?
>> >>>>
>> >>>> Thank you!
>> >>>>
>> >>>> [1] https://issues.apache.org/jira/browse/IGNITE-13366
>> >>>>
>> >>>> On Tue, Sep 1, 2020 at 2:07 PM Ivan Pavlukhin <vololo...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Sergey,
>> >>>>>
>> >>>>> Actually, I missed the point that the discussed mode affects a single
>> >>>>> node but not a whole cluster. Perhaps I mixed up the terms "mode" and
>> >>>>> "state".
>> >>>>>
>> >>>>> My next thoughts about maintenance routines are about special
>> >>>>> utilities. As far as I remember MySQL provides a bunch of scripts for
>> >>>>> various maintenance purposes. What user interface for maintenance
>> >>>>> task execution is assumed? And what do we mean by "starting" a node
>> >>>>> in a maintenance mode? Can we do some routines without "starting"
>> >>>>> (e.g. try to recover PDS or cleanup)?
>> >>>>>
>> >>>>> 2020-08-31 23:41 GMT+03:00, Vladislav Pyatkov <vldpyat...@gmail.com>:
>> >>>>>> Hi Sergey.
>> >>>>>>
>> >>>>>> As I understand it, switching to or from MM is possible only through
>> >>>>>> a manual restart of a node.
>> >>>>>> But in your example that looks like a technical action that is
>> >>>>>> possible only in that case.
>> >>>>>> Do you plan to provide a way for the client to make a decision
>> >>>>>> without manual intervention?
>> >>>>>>
>> >>>>>> For example: start the node, manually agree with an option, and then
>> >>>>>> have the conflict resolved automatically and the node return to the
>> >>>>>> topology as a stable node.
>> >>>>>>
>> >>>>>> On Mon, Aug 31, 2020 at 5:41 PM Sergey Chugunov
>> >>>>>> <sergey.chugu...@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Hello Ivan,
>> >>>>>>>
>> >>>>>>> Thank you for raising the good question, I didn't think of
>> >>>>>>> Maintenance Mode from that perspective.
>> >>>>>>>
>> >>>>>>> In short, Maintenance Mode isn't related to the Cluster States
>> >>>>>>> concept. According to the javadoc of the ClusterState enum [1] it is
>> >>>>>>> solely about cache operations and to some extent doesn't affect
>> >>>>>>> other components of an Ignite node.
>> >>>>>>> From the API perspective, putting the methods to manage Cluster
>> >>>>>>> State into the IgniteCluster interface doesn't look ideal to me, but
>> >>>>>>> it is as it is.
>> >>>>>>>
>> >>>>>>> On the other hand Maintenance Mode as I see it will be managed
>> >>>>>>> through different APIs than ClusterState, and this difference will
>> >>>>>>> definitely be reflected in the documentation of the feature.
>> >>>>>>>
>> >>>>>>> An Ignite node is a complex piece of many components interacting
>> >>>>>>> with each other, they may have different lifecycles and states;
>> >>>>>>> states of different components cannot be reduced to the lowest
>> >>>>>>> common denominator.
>> >>>>>>>
>> >>>>>>> However if you have an idea of how to name the feature better to
>> >>>>>>> let the user distinguish it more easily from other similar
>> >>>>>>> features, please share it with us. Personally I very much welcome
>> >>>>>>> any suggestions that make the design more intuitive and
>> >>>>>>> easy-to-use.
>> >>>>>>>
>> >>>>>>> Thanks!
>> >>>>>>>
>> >>>>>>> [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/cluster/ClusterState.java
>> >>>>>>>
>> >>>>>>> On Mon, Aug 31, 2020 at 12:32 PM Ivan Pavlukhin
>> >>>>>>> <vololo...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>> Hi Sergey,
>> >>>>>>>>
>> >>>>>>>> Thank you for bringing attention to that important subject!
>> >>>>>>>>
>> >>>>>>>> My note here is about one more cluster mode. As far as I know
>> >>>>>>>> currently we already have 3 modes (inactive, read-only, read-write)
>> >>>>>>>> and the subject is about one more. At first glance it could be
>> >>>>>>>> hard for a user to understand and use all modes properly. Do we
>> >>>>>>>> really need the whole spectrum? Could we simplify things somehow?
>> >>>>>>>>
>> >>>>>>>> 2020-08-27 15:59 GMT+03:00, Sergey Chugunov
>> >>>>>>>> <sergey.chugu...@gmail.com>:
>> >>>>>>>>> Hello Nikolay,
>> >>>>>>>>>
>> >>>>>>>>> Created one, available by link [1]
>> >>>>>>>>>
>> >>>>>>>>> Initially there was an intention to develop it under IEP-47 [2]
>> >>>>>>>>> and there is even a separate section for Maintenance Mode there.
>> >>>>>>>>> But it looks like this feature is useful in more cases and
>> >>>>>>>>> deserves its own IEP.
>> >>>>>>>>>
>> >>>>>>>>> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-53%3A+Maintenance+Mode
>> >>>>>>>>> [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-47:+Native+persistence+defragmentation
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Aug 27, 2020 at 11:01 AM Nikolay Izhikov
>> >>>>>>>>> <nizhi...@apache.org> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hello, Sergey!
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks for the proposal.
>> >>>>>>>>>> Let's have an IEP for this feature.
>> >>>>>>>>>>
>> >>>>>>>>>>> On Aug 27, 2020, at 10:25, Sergey Chugunov
>> >>>>>>>>>>> <sergey.chugu...@gmail.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Hello Igniters,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I want to start a discussion about a new supporting feature
>> >>>>>>>>>>> that could be very useful in many scenarios where persistent
>> >>>>>>>>>>> storage is involved: Maintenance Mode.
>> >>>>>>>>>>>
>> >>>>>>>>>>> *Summary*
>> >>>>>>>>>>> Maintenance Mode (MM for short) is a special state of an Ignite
>> >>>>>>>>>>> node in which the node neither serves user requests nor joins
>> >>>>>>>>>>> the cluster but waits for user commands or performs automatic
>> >>>>>>>>>>> actions for maintenance purposes.
>> >>>>>>>>>>>
>> >>>>>>>>>>> *Motivation*
>> >>>>>>>>>>> There are situations when a node cannot participate in regular
>> >>>>>>>>>>> operations but at the same time should not be shut down.
>> >>>>>>>>>>>
>> >>>>>>>>>>> One example is a ticket [1] where I developed the first draft
>> >>>>>>>>>>> of Maintenance Mode.
>> >>>>>>>>>>> Here we get into a situation when a node has a potentially
>> >>>>>>>>>>> corrupted PDS and thus cannot proceed with the restore routine
>> >>>>>>>>>>> and join the cluster as usual.
>> >>>>>>>>>>> At the same time the node should neither fail nor be stopped
>> >>>>>>>>>>> for manual cleanup.
>> >>>>>>>>>>> Manual cleanup is not always an option (e.g. restricted access
>> >>>>>>>>>>> to the file system); in managed environments a failed node will
>> >>>>>>>>>>> be restarted automatically, so the user won't have time to
>> >>>>>>>>>>> perform the necessary operations.
>> >>>>>>>>>>> Thus the node needs to function in a special mode allowing the
>> >>>>>>>>>>> user to connect to it and perform the necessary actions.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Another example is described in IEP-47 [2] where
>> >>>>>>>>>>> defragmentation is being developed. A node defragmenting its
>> >>>>>>>>>>> PDS should not join the cluster until the process is finished,
>> >>>>>>>>>>> so it needs to enter Maintenance Mode as well.
>> >>>>>>>>>>>
>> >>>>>>>>>>> *Suggested design*
>> >>>>>>>>>>> I suggest MM to work as follows:
>> >>>>>>>>>>> 1. Node enters MM if special markers are found on disk. These
>> >>>>>>>>>>> markers, called Maintenance Records, could be created
>> >>>>>>>>>>> automatically (e.g. when the storage component detects
>> >>>>>>>>>>> corrupted storage) or by user request (when the user requests
>> >>>>>>>>>>> defragmentation of some caches). So entering MM requires a node
>> >>>>>>>>>>> restart.
>> >>>>>>>>>>> 2. A node started in MM doesn't join the cluster but finishes
>> >>>>>>>>>>> the startup routine, so it is able to receive commands and
>> >>>>>>>>>>> provide metrics to the user.
>> >>>>>>>>>>> 3. When all necessary maintenance operations are finished,
>> >>>>>>>>>>> Maintenance Records for these operations are deleted from disk
>> >>>>>>>>>>> and the node is restarted again to enter normal service.
>> >>>>>>>>>>>
>> >>>>>>>>>>> *Example*
>> >>>>>>>>>>> To put it into context, let's consider an example of how I see
>> >>>>>>>>>>> the MM workflow in case of PDS corruption.
>> >>>>>>>>>>>
>> >>>>>>>>>>>    1. Node has failed in the middle of a checkpoint when WAL is
>> >>>>>>>>>>>    disabled for a particular cache -> data files of the cache
>> >>>>>>>>>>>    are potentially corrupted.
>> >>>>>>>>>>>    2. On next startup node detects this situation, creates a
>> >>>>>>>>>>>    Maintenance Record on disk and shuts down.
>> >>>>>>>>>>>    3. On next startup node sees the Maintenance Record, enters
>> >>>>>>>>>>>    Maintenance Mode and waits for the user to do specific
>> >>>>>>>>>>>    actions: clean the potentially corrupted PDS.
>> >>>>>>>>>>>    4. When the user has done the necessary actions, he/she
>> >>>>>>>>>>>    removes the Maintenance Record using the Maintenance Mode API
>> >>>>>>>>>>>    exposed via the control.{sh|bat} script or JMX.
>> >>>>>>>>>>>    5. On next startup node goes to normal operations as the
>> >>>>>>>>>>>    maintenance reason is fixed.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> I prepared a PR [3] for ticket [1] with a draft implementation.
>> >>>>>>>>>>> It is not ready to be merged to the master branch but is
>> >>>>>>>>>>> already fully functional and can be reviewed.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Hope you'll share your feedback on the feature and/or any
>> >>>>>>>>>>> thoughts on implementation.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thank you!
>> >>>>>>>>>>>
>> >>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13366
>> >>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-47:+Native+persistence+defragmentation
>> >>>>>>>>>>> [3] https://github.com/apache/ignite/pull/8189
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>>
>> >>>>>>>> Best regards,
>> >>>>>>>> Ivan Pavlukhin
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Vladislav Pyatkov
>> >>>>>>
>> >>>>>
>> >>>>> --
>> >>>>>
>> >>>>> Best regards,
>> >>>>> Ivan Pavlukhin
>> >>>>>
>> >>>>
>> >>>
>> >>> --
>> >>>
>> >>> Best regards,
>> >>> Ivan Pavlukhin
>> >>>
>> >
>>
>> --
>>
>> Best regards,
>> Ivan Pavlukhin
>>
>

--

Best regards,
Ivan Pavlukhin