Anton, great feature!
Could you please clarify the implementation details a bit? As I understand it, the auto-adjust properties are meant to be consistent across the cluster, and a baseline adjustment is put into some delay queue. Do we put the event into a queue on each node? Or is there a dedicated node driving baseline adjustment?

Fri, Jan 25, 2019 at 16:31, Anton Kalashnikov <kaa....@yandex.ru>:
>
> Initially, the hard timeout was meant to protect the grid from a constantly
> changing topology (a constantly blinking node). But in fact, if we have a
> constantly changing topology, the baseline adjust operation fails in most
> cases. As a result, the hard timeout only adds complexity without giving any
> new guarantee, so I think we can skip it in the first implementation.
>
> First of all, the timeout protects us from an unnecessary baseline adjustment
> when a node leaves the grid and immediately (or after some time less than the
> timeout) joins back. The timeout is also helpful in other cases when several
> events happen one after another.
>
> This feature doesn't have any complex heuristics, except those described in
> the restrictions section.
>
> I also want to note that this feature doesn't protect us from a constantly
> blinking node. We need one more heuristic mechanism to detect that situation
> and take some action, such as removing the node from the grid.
>
> --
> Best regards,
> Anton Kalashnikov
>
>
> 25.01.2019, 15:43, "Sergey Chugunov" <sergey.chugu...@gmail.com>:
> > Anton,
> >
> > As I understand from the IEP document, the policy was supposed to support
> > two timeouts, soft and hard, so here you're proposing somewhat simpler
> > functionality.
> >
> > Just to clarify, do I understand correctly that this feature, when enabled,
> > will auto-adjust the BLT on each node join/node left event, and the timeout
> > is necessary to protect us from blinking nodes?
> > So no complexities with taking into account the number of alive backups or
> > anything like that?
> >
> > On Fri, Jan 25, 2019 at 1:11 PM Vladimir Ozerov <voze...@gridgain.com>
> > wrote:
> >
> >> Got it, makes sense.
> >>
> >> On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <kaa....@yandex.ru>
> >> wrote:
> >>
> >> > Vladimir, thanks for your notes; both of them look good, but I have two
> >> > different thoughts about them.
> >> >
> >> > I think I agree about enabling only one of manual/auto adjustment. It is
> >> > easier than the current solution, and in fact, as an extra feature, we
> >> > can allow the user to force the task to execute (if they don't want to
> >> > wait until the timeout expires).
> >> > But about the second one, I'm not sure that one parameter instead of two
> >> > would be more convenient. For example: if a user changes the timeout and
> >> > then disables auto-adjust, then whoever wants to enable it later has to
> >> > know what the timeout value was before auto-adjust was disabled. I think
> >> > the "negative value" pattern is a good choice for always-usable
> >> > parameters like a connection timeout (e.g. -1 means endless waiting),
> >> > but in our case we want to disable the whole functionality rather than
> >> > change a parameter value.
> >> >
> >> > --
> >> > Best regards,
> >> > Anton Kalashnikov
> >> >
> >> >
> >> > 24.01.2019, 22:03, "Vladimir Ozerov" <voze...@gridgain.com>:
> >> > > Hi Anton,
> >> > >
> >> > > This is a great feature, but I am a bit confused about automatic
> >> > > disabling of the feature during manual baseline adjustment. This may
> >> > > lead to unpleasant situations when a user enabled auto-adjustment,
> >> > > then re-adjusted it manually somehow (e.g. from some previously
> >> > > created script) so that the auto-adjustment disabling went unnoticed,
> >> > > then added more nodes hoping that auto-baseline was still active, etc.
> >> > >
> >> > > Instead, I would rather make manual and auto adjustment mutually
> >> > > exclusive - the baseline cannot be adjusted manually when auto mode is
> >> > > set, and vice versa. If an exception is thrown in those cases,
> >> > > administrators will always know the current behavior of the system.
> >> > >
> >> > > As far as configuration goes, wouldn't it be enough to have a single
> >> > > long value as opposed to Boolean + long? Say, 0 - immediate auto
> >> > > adjustment, negative - disabled, positive - auto adjustment after a
> >> > > timeout.
> >> > >
> >> > > Thoughts?
> >> > >
> >> > > Thu, Jan 24, 2019 at 18:33, Anton Kalashnikov <kaa....@yandex.ru>:
> >> > >
> >> > >> Hello, Igniters!
> >> > >>
> >> > >> Work on Phase II of IEP-4 (Baseline topology) [1] has started. I want
> >> > >> to start a discussion of the implementation of "Baseline
> >> > >> auto-adjust" [2].
> >> > >>
> >> > >> The "Baseline auto-adjust" feature implements a mechanism that
> >> > >> auto-adjusts the baseline to the current topology after a join/left
> >> > >> event appears. It is required because when a node leaves the grid and
> >> > >> nobody changes the baseline manually, it can lead to data loss (when
> >> > >> more nodes leave the grid, depending on the backup factor), but
> >> > >> permanent tracking of the grid is not always possible/desirable. In
> >> > >> many cases, auto-adjusting the baseline after some timeout is very
> >> > >> helpful.
> >> > >>
> >> > >> Distributed metastore [3] (it is already done):
> >> > >>
> >> > >> First of all, the ability to store configuration data consistently
> >> > >> and cluster-wide is required. Ignite doesn't have any specific API
> >> > >> for such configurations, and we don't want many similar
> >> > >> implementations of the same feature in our code.
> >> > >> After some thought, it was proposed to implement it as a kind of
> >> > >> distributed metastorage that gives the ability to store any data in
> >> > >> it.
> >> > >> The first implementation is based on the existing local metastorage
> >> > >> API for persistent clusters (in-memory clusters will store the data
> >> > >> in memory). Write/remove operations use the Discovery SPI to send
> >> > >> updates to the cluster; it guarantees update order and the fact that
> >> > >> all existing (alive) nodes have handled the update message. As a way
> >> > >> to find out which node has the latest data, there is a "version"
> >> > >> value of the distributed metastorage, which is basically <number of
> >> > >> all updates, hash of updates>. All update history up to some point in
> >> > >> the past is stored along with the data, so when an outdated node
> >> > >> connects to the cluster, it will receive all the missing data and
> >> > >> apply it locally. If there is not enough history stored, or the
> >> > >> joining node is clean, it will receive a snapshot of the distributed
> >> > >> metastorage, so there won't be inconsistencies.
> >> > >>
> >> > >> Baseline auto-adjust:
> >> > >>
> >> > >> Main scenario:
> >> > >> - There is a grid with the baseline equal to the current topology
> >> > >> - A new node joins the grid, or some node leaves (fails)
> >> > >> - The new mechanism detects this event and adds a task for changing
> >> > >> the baseline to a queue with the configured timeout
> >> > >> - If a new event happens before the baseline is changed, the task is
> >> > >> removed from the queue and a new task is added
> >> > >> - When the timeout expires, the task tries to set a new baseline
> >> > >> corresponding to the current topology
> >> > >>
> >> > >> First of all, we need to add two parameters [4]:
> >> > >> - baselineAutoAdjustEnabled - enables/disables the "Baseline
> >> > >> auto-adjust" feature.
> >> > >> - baselineAutoAdjustTimeout - timeout after which the baseline
> >> > >> should be changed.
> >> > >>
> >> > >> These parameters are cluster-wide and can be changed at runtime
> >> > >> because they are based on the "Distributed metastore". Initially,
> >> > >> they are seeded from the corresponding parameters
> >> > >> (initBaselineAutoAdjustEnabled, initBaselineAutoAdjustTimeout) of the
> >> > >> "Ignite Configuration". The init value is valid only until the first
> >> > >> change; after a value is changed, it is stored in the "Distributed
> >> > >> metastore".
> >> > >>
> >> > >> Restrictions:
> >> > >> - This mechanism handles events only on an active grid
> >> > >> - If baselineNodes != gridNodes on activation, this feature is
> >> > >> disabled
> >> > >> - If lost partitions are detected, this feature is disabled
> >> > >> - If the baseline was adjusted manually while baselineNodes !=
> >> > >> gridNodes, this feature is disabled
> >> > >>
> >> > >> A draft implementation can be found here [5]. Feel free to ask for
> >> > >> more details and make suggestions.
> >> > >>
> >> > >> [1]
> >> > >> https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> >> > >> [2] https://issues.apache.org/jira/browse/IGNITE-8571
> >> > >> [3] https://issues.apache.org/jira/browse/IGNITE-10640
> >> > >> [4] https://issues.apache.org/jira/browse/IGNITE-8573
> >> > >> [5] https://github.com/apache/ignite/pull/5907
> >> > >>
> >> > >> --
> >> > >> Best regards,
> >> > >> Anton Kalashnikov

--
Best regards,
Ivan Pavlukhin
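[Editor's note] The "main scenario" in Anton's announcement (queue a baseline-change task on each join/left event, replace it when a new event arrives, fire it when the timeout expires) can be sketched roughly as follows. This is a hypothetical illustration with invented names, not the code from the draft PR; the timeout here plays the role of baselineAutoAdjustTimeout.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not the Ignite API: every topology event schedules a
// fresh adjustment task and cancels the previously queued one, so the
// baseline change runs only after the topology has stayed quiet for timeoutMs.
class BaselineAutoAdjustScheduler {
    private final ScheduledExecutorService exec =
        Executors.newSingleThreadScheduledExecutor();
    private final AtomicReference<ScheduledFuture<?>> pending = new AtomicReference<>();
    private final long timeoutMs;
    private final Runnable adjustBaseline;

    BaselineAutoAdjustScheduler(long timeoutMs, Runnable adjustBaseline) {
        this.timeoutMs = timeoutMs;
        this.adjustBaseline = adjustBaseline;
    }

    /** Called on every node join/left (failed) event. */
    void onTopologyChanged() {
        ScheduledFuture<?> task = exec.schedule(adjustBaseline, timeoutMs, TimeUnit.MILLISECONDS);
        ScheduledFuture<?> prev = pending.getAndSet(task);
        if (prev != null)
            prev.cancel(false); // the earlier event is superseded by the new one
    }

    void shutdown() {
        exec.shutdown();
    }
}
```

A burst of join/left events thus collapses into a single baseline adjustment once the grid goes quiet, which is exactly the "blinking node" protection the soft timeout is meant to provide.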
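[Editor's note] For comparison, Vladimir's "single long instead of Boolean + long" suggestion could be encoded as below (hypothetical class and method names, purely illustrative). Note how it exhibits the drawback Anton raises: disabling overwrites the value, so the previously configured timeout is lost.

```java
// Hypothetical sketch of the single-value encoding Vladimir proposed:
// negative -> auto-adjust disabled, 0 -> immediate adjustment,
// positive -> adjustment after the given delay in milliseconds.
class BaselineAutoAdjustPolicy {
    private final long value;

    BaselineAutoAdjustPolicy(long value) {
        this.value = value;
    }

    boolean enabled() {
        return value >= 0;
    }

    long timeoutMs() {
        if (value < 0)
            throw new IllegalStateException("Baseline auto-adjust is disabled");
        return value; // 0 means "adjust immediately"
    }
}
```

With the two-parameter scheme that was actually proposed, setting baselineAutoAdjustEnabled = false leaves baselineAutoAdjustTimeout intact, so re-enabling restores the old behavior without the administrator having to remember it.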