Re: Baseline auto-adjust`s discuss

Vladimir Ozerov Fri, 25 Jan 2019 01:12:06 -0800

Got it, makes sense.

On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <kaa....@yandex.ru>
wrote:


> Vladimir, thanks for your notes. Both of them look reasonable, but I
> feel differently about each of them.
>
> I think I agree about allowing only one of manual/auto adjustment at a
> time. It is simpler than the current solution, and as an extra feature we
> can let users force the pending task to execute (if they don't want to
> wait until the timeout expires).
> As for the second note, I'm not sure that one parameter instead of two
> would be more convenient. For example, if a user changes the timeout and
> then disables auto-adjust, whoever wants to enable it again later has to
> know what the timeout was before auto-adjust was disabled. I think the
> "negative value" pattern is a good choice for parameters that are always
> in use, such as a connection timeout (e.g. -1 means wait forever), but in
> our case we want to disable the whole functionality rather than change a
> parameter value.
>
> --
> Best regards,
> Anton Kalashnikov
>
>
> 24.01.2019, 22:03, "Vladimir Ozerov" <voze...@gridgain.com>:
> > Hi Anton,
> >
> > This is a great feature, but I am a bit confused about automatically
> > disabling the feature during manual baseline adjustment. This may lead
> > to unpleasant situations when a user enabled auto-adjustment, then
> > re-adjusted the baseline manually somehow (e.g. from some previously
> > created script) so that the disabling of auto-adjustment went unnoticed,
> > then added more nodes hoping that auto-baseline is still active, etc.
> >
> > Instead, I would rather make manual and auto adjustment mutually
> > exclusive: the baseline cannot be adjusted manually when auto mode is
> > set, and vice versa. If an exception is thrown in those cases,
> > administrators will always know the current behavior of the system.
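The mutually exclusive behavior proposed above could look roughly like the following sketch. The class, its methods, and the exception message are hypothetical and only illustrate the idea; they are not Ignite API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; class and method names are illustrative, not Ignite API. */
final class BaselineControl {
    private boolean autoAdjustEnabled;
    private Set<String> baseline = Collections.emptySet();

    synchronized void setAutoAdjustEnabled(boolean enabled) {
        autoAdjustEnabled = enabled;
    }

    /** Manual change: rejected while auto mode is on, so the mode never changes silently. */
    synchronized void setBaselineManually(Set<String> nodeIds) {
        if (autoAdjustEnabled)
            throw new IllegalStateException(
                "Baseline auto-adjust is enabled; disable it before changing the baseline manually");

        baseline = Collections.unmodifiableSet(new HashSet<>(nodeIds));
    }

    synchronized Set<String> baseline() {
        return baseline;
    }
}
```

With this shape, a previously created script that calls the manual path fails loudly instead of switching auto mode off behind the administrator's back.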
> >
> > As for configuration, wouldn’t it be enough to have a single long value
> > as opposed to Boolean + long? Say, 0 means immediate auto adjustment,
> > negative means disabled, and positive means auto adjustment after a
> > timeout.
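To make the two configuration styles concrete, here is a small sketch that decodes each of them into "delay before adjustment, or nothing if disabled". The helper class and the choice of milliseconds as the unit are assumptions made for illustration only.

```java
import java.util.OptionalLong;

/** Hypothetical helper for comparing the two encodings; not part of Ignite. */
final class AutoAdjustEncodings {
    private AutoAdjustEncodings() { }

    /** Single long: negative = disabled, 0 = immediate, positive = delay (assumed ms). */
    static OptionalLong delayFromSingleValue(long value) {
        return value < 0 ? OptionalLong.empty() : OptionalLong.of(value);
    }

    /** Boolean + long: the flag switches the feature, the timeout is stored separately. */
    static OptionalLong delayFromFlagAndTimeout(boolean enabled, long timeoutMs) {
        return enabled ? OptionalLong.of(timeoutMs) : OptionalLong.empty();
    }
}
```

Both encodings express the same three states. The difference is that with the single value, disabling overwrites the previous timeout, while the flag keeps it.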
> >
> > Thoughts?
> >
> > Thu, Jan 24, 2019 at 18:33, Anton Kalashnikov <kaa....@yandex.ru>:
> >
> >>  Hello, Igniters!
> >>
> >>  Work on Phase II of IEP-4 (Baseline topology) [1] has started. I want
> >>  to start a discussion of the implementation of "Baseline auto-adjust"
> >>  [2].
> >>
> >>  The "Baseline auto-adjust" feature implements a mechanism that
> >>  automatically adjusts the baseline to match the current topology after
> >>  a node join/leave event occurs. It is needed because when a node leaves
> >>  the grid and nobody changes the baseline manually, data can be lost (if
> >>  more nodes leave the grid, depending on the backup factor), and
> >>  permanent monitoring of the grid is not always possible or desirable.
> >>  In many cases, auto-adjusting the baseline after some timeout looks
> >>  very helpful.
> >>
> >>  Distributed metastore [3] (already done):
> >>
> >>  First of all, we need the ability to store configuration data
> >>  consistently and cluster-wide. Ignite doesn't have any specific API for
> >>  such configurations, and we don't want to have many similar
> >>  implementations of the same feature in our code. After some thought, it
> >>  was proposed to implement it as a kind of distributed metastorage that
> >>  can store any data.
> >>  The first implementation is based on the existing local metastorage API
> >>  for persistent clusters (in-memory clusters will store data in memory).
> >>  Write/remove operations use Discovery SPI to send updates to the
> >>  cluster, which guarantees the order of updates and that all existing
> >>  (alive) nodes have handled the update message. To find out which node
> >>  has the latest data, distributed metastorage has a "version" value,
> >>  which is basically <number of all updates, hash of updates>. The
> >>  history of all updates back to some point in the past is stored along
> >>  with the data, so when an outdated node connects to the cluster it will
> >>  receive all the missing data and apply it locally. If there's not
> >>  enough history stored, or the joining node is clean, it will receive a
> >>  snapshot of distributed metastorage, so there won't be inconsistencies.
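The version and catch-up logic described above can be modeled roughly as follows. This is a simplified, single-process sketch under stated assumptions: updates are plain strings, the hash is a simple rolling hash, and all class and method names are hypothetical rather than taken from the Ignite code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the described versioning; not the actual Ignite classes. */
final class MetastorageSketch {
    /** Version = <number of all updates, hash of updates>. */
    static final class Version {
        final long updates;
        final long hash;

        Version(long updates, long hash) {
            this.updates = updates;
            this.hash = hash;
        }

        Version next(String update) {
            return new Version(updates + 1, 31 * hash + update.hashCode());
        }
    }

    private Version base = new Version(0, 0);   // version just before the oldest retained update
    private Version current = base;             // version after the latest update
    private final List<String> history = new ArrayList<>();

    void apply(String update) {
        history.add(update);
        current = current.next(update);
    }

    /** Drops the oldest {@code count} updates from the retained history. */
    void trimHistory(int count) {
        for (String update : history.subList(0, count))
            base = base.next(update);

        history.subList(0, count).clear();
    }

    Version version() {
        return current;
    }

    /** Returns the updates a joining node is missing, or null if it needs a full snapshot. */
    List<String> missingUpdates(Version joining) {
        if (joining.updates < base.updates || joining.updates > current.updates)
            return null; // not enough history, or the node is ahead of this one

        int known = (int) (joining.updates - base.updates);
        Version v = base;

        for (String update : history.subList(0, known))
            v = v.next(update);

        if (v.hash != joining.hash)
            return null; // histories diverged, only a snapshot is safe

        return new ArrayList<>(history.subList(known, history.size()));
    }
}
```

A clean node joins with version <0, 0>: it gets the whole history while nothing has been trimmed, and a snapshot (null here) once older updates are gone.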
> >>
> >>  Baseline auto-adjust:
> >>
> >>  Main scenario:
> >>          - There is a grid whose baseline is equal to the current
> >>            topology
> >>          - A new node joins the grid, or some node leaves (fails)
> >>          - The new mechanism detects this event and adds a task for
> >>            changing the baseline to a queue with the configured timeout
> >>          - If a new event happens before the baseline is changed, the
> >>            task is removed from the queue and a new task is added
> >>          - When the timeout expires, the task tries to set a new
> >>            baseline corresponding to the current topology
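The scenario above is essentially a debounce: every topology event cancels the pending task and schedules a fresh one. A minimal sketch using the JDK scheduler is shown below. The class name, the callback, and the thread name are hypothetical, and the draft implementation in the pull request may be structured quite differently.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical debounce sketch for the main scenario; not the draft implementation. */
final class AutoAdjustScheduler {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "baseline-auto-adjust");
            t.setDaemon(true); // do not keep the JVM alive because of a pending task
            return t;
        });

    private final long timeoutMs;
    private final Runnable adjustBaseline;
    private ScheduledFuture<?> pending;

    AutoAdjustScheduler(long timeoutMs, Runnable adjustBaseline) {
        this.timeoutMs = timeoutMs;
        this.adjustBaseline = adjustBaseline;
    }

    /** Called on every node join/leave: replaces the pending task with a new one. */
    synchronized void onTopologyEvent() {
        if (pending != null)
            pending.cancel(false);

        pending = executor.schedule(adjustBaseline, timeoutMs, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With this shape, a burst of join/leave events results in a single baseline change, performed one timeout after the last event.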
> >>
> >>  First of all, we need to add two parameters [4]:
> >>          - baselineAutoAdjustEnabled - enables/disables the "Baseline
> >>            auto-adjust" feature.
> >>          - baselineAutoAdjustTimeout - timeout after which the baseline
> >>            should be changed.
> >>
> >>  These parameters are cluster-wide and can be changed at runtime because
> >>  they are based on the "Distributed metastore". Initially, these
> >>  parameters are set from the corresponding parameters
> >>  (initBaselineAutoAdjustEnabled, initBaselineAutoAdjustTimeout) in
> >>  "Ignite Configuration". An init value is valid only until the parameter
> >>  is changed for the first time; after that, the value is stored in the
> >>  "Distributed metastore".
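The relationship between the init values and the stored values can be sketched as a simple fallback lookup. In this sketch a plain map stands in for the distributed metastore, and the class, its constructor, and the key strings are hypothetical; only the four parameter names come from the message above.

```java
import java.util.Map;

/** Hypothetical sketch: a plain map stands in for the distributed metastore. */
final class AutoAdjustSettings {
    static final String ENABLED_KEY = "baselineAutoAdjustEnabled";
    static final String TIMEOUT_KEY = "baselineAutoAdjustTimeout";

    private final Map<String, Object> metastore;
    private final boolean initEnabled; // initBaselineAutoAdjustEnabled
    private final long initTimeout;    // initBaselineAutoAdjustTimeout

    AutoAdjustSettings(Map<String, Object> metastore, boolean initEnabled, long initTimeout) {
        this.metastore = metastore;
        this.initEnabled = initEnabled;
        this.initTimeout = initTimeout;
    }

    /** The init value applies only until a value has been written to the metastore. */
    boolean enabled() {
        return (Boolean) metastore.getOrDefault(ENABLED_KEY, initEnabled);
    }

    long timeout() {
        return (Long) metastore.getOrDefault(TIMEOUT_KEY, initTimeout);
    }

    void setEnabled(boolean enabled) {
        metastore.put(ENABLED_KEY, enabled);
    }

    void setTimeout(long timeout) {
        metastore.put(TIMEOUT_KEY, timeout);
    }
}
```

Once a value is written, a node that starts later with different init values in its configuration still sees the stored value.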
> >>
> >>  Restrictions:
> >>          - This mechanism handles events only on an active grid
> >>          - If baselineNodes != gridNodes on activation, this feature is
> >>            disabled
> >>          - If lost partitions are detected, this feature is disabled
> >>          - If the baseline was adjusted manually with baselineNodes !=
> >>            gridNodes, this feature is disabled
> >>
> >>  You can find a draft implementation here [5]. Feel free to ask for more
> >>  details and make suggestions.
> >>
> >>  [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> >>  [2] https://issues.apache.org/jira/browse/IGNITE-8571
> >>  [3] https://issues.apache.org/jira/browse/IGNITE-10640
> >>  [4] https://issues.apache.org/jira/browse/IGNITE-8573
> >>  [5] https://github.com/apache/ignite/pull/5907
> >>
> >>  --
> >>  Best regards,
> >>  Anton Kalashnikov
>
