Re: [DISCUSSION] Deprecation of obsolete rebalancing functionality

Alexei Scherbakov Tue, 18 Feb 2020 01:22:03 -0800

Folks,

It looks like we have reached an agreement: rebalanceDelay and rebalanceOrder
should be deprecated.


Does anyone else have objections?
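
For readers skimming the thread, the rebalanceDelay semantics discussed in the quoted messages (0 starts rebalancing immediately, a positive value delays it, and -1 disables it until it is manually triggered) can be sketched roughly as follows. This is a self-contained model, not Ignite code: the class RebalanceDelaySketch and the method automaticStart are hypothetical names, and treating every negative value like -1 is an assumption of the sketch.

```java
import java.util.OptionalLong;

public class RebalanceDelaySketch {
    /**
     * Hypothetical helper (not an Ignite API): returns the time at which
     * automatic rebalancing would start after a topology change, or empty
     * if rebalancing has to be triggered manually.
     */
    static OptionalLong automaticStart(long topologyChangeMillis, long rebalanceDelayMillis) {
        // Assumption: any negative delay behaves like -1 (manual trigger only).
        if (rebalanceDelayMillis < 0) {
            return OptionalLong.empty();
        }
        // 0 means "start immediately"; a positive value postpones the start.
        return OptionalLong.of(topologyChangeMillis + rebalanceDelayMillis);
    }

    public static void main(String[] args) {
        System.out.println(automaticStart(1_000L, 0L));     // immediate start
        System.out.println(automaticStart(1_000L, 5_000L)); // delayed start
        System.out.println(automaticStart(1_000L, -1L));    // manual trigger only
    }
}
```

With baseline topology, the "do not rebalance on a short-lived topology change" case is covered at the cluster level instead of by a per-cache delay, which is the argument made in this thread for deprecating the property.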


On Thu, 13 Feb 2020 at 14:40, Alexei Scherbakov <
alexey.scherbak...@gmail.com>:

> But in combination with BLT it will work as intended - no rebalancing
> under the cover.
>
> On Thu, 13 Feb 2020 at 14:39, Alexei Scherbakov <
> alexey.scherbak...@gmail.com>:
>
>> Of course, stable topology will be just a hint.
>>
>> Any node can leave at any moment.
>>
>> On Thu, 13 Feb 2020 at 14:35, Alexei Scherbakov <
>> alexey.scherbak...@gmail.com>:
>>
>>> 1. Yes
>>>
>>> 2. This is right but doesn't sound like a bug. The rebalancing will be
>>> finished before releasing syncFut and partitions will contain all necessary
>>> data (but are still in moving state).
>>>
>>> 3. No, the local node doesn't wait for rebalancing on all grid nodes.
>>>
>>> Actually, I think SYNC mode should be dropped as well. Instead, we must
>>> provide a convenient public API to wait for a "stable" topology.
>>>
>>>
>>> On Thu, 13 Feb 2020 at 14:09, Maxim Muzafarov <mmu...@apache.org>:
>>>
>>>> Pavel,
>>>>
>>>> There is still a big open question regarding SYNC rebalance mode. Here are
>>>> my thoughts.
>>>>
>>>> 1. Yes, we must rebalance such caches before ASYNC ones (if the
>>>> rebalanceOrder configuration is removed).
>>>>
>>>> 2. When persistence is enabled and the WAL is disabled (on the first
>>>> rebalance start), I think we should finish syncFuture only on
>>>> checkpoint, as we do when enabling the WAL state for the cache group and
>>>> simultaneously owning all MOVING partitions. But currently, I see that
>>>> syncFuture finishes as soon as there are no remaining partitions left
>>>> [1].
>>>> Is that correct? It seems like a bug.
>>>>
>>>> 3. In my understanding, a new local node can start only when ALL SYNC
>>>> cache groups have been fully rebalanced on ALL nodes, right? But what
>>>> about late affinity assignment here? It seems that SYNC caches will be
>>>> rebalanced locally on the node and the node will start, but other nodes
>>>> will still think this node is not operational (late affinity assignment
>>>> has not occurred yet).
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionDemander.java#L1561
>>>>
>>>> On Thu, 13 Feb 2020 at 12:57, Pavel Pereslegin <xxt...@gmail.com>
>>>> wrote:
>>>> >
>>>> > > +1 to deprecate rebalanceOrder and remove related functionality,
>>>> > I meant "rework related functionality", not "remove".
>>>> >
>>>> > On Thu, 13 Feb 2020 at 12:47, Pavel Pereslegin <xxt...@gmail.com>:
>>>> > >
>>>> > > Hello,
>>>> > >
>>>> > > +1 to deprecate rebalanceOrder and remove related functionality,
>>>> > > should we create a separate ticket for this?
>>>> > >
>>>> > > Btw, as I understand, SYNC mode is only useful for in-memory caches,
>>>> > > because when persistence is enabled (and WAL is disabled during
>>>> > > rebalancing), even "ignite-sys-cache" owns partitions only after all
>>>> > > cache groups are rebalanced. Thus, even the utility cache is still
>>>> > > inoperable after node startup when persistence is enabled. Do we
>>>> > > really need to wait for SYNC caches when a node starts with
>>>> > > persistence enabled, or should we enable WAL for SYNC caches?
>>>> > >
>>>> > > On Thu, 13 Feb 2020 at 11:13, Ivan Rakov <ivan.glu...@gmail.com>:
>>>> > > >
>>>> > > > Hello,
>>>> > > >
>>>> > > > +1 from me for rebalance delay deprecation.
>>>> > > > I can imagine only one actual case for this option: prevent
>>>> > > > excessive load on the cluster in case of temporary short-term
>>>> > > > topology changes (e.g. node is stopped for a while and then
>>>> > > > returned back).
>>>> > > > Now it's handled by baseline auto adjustment in a much more
>>>> > > > correct way: partitions are not reassigned within a maintenance
>>>> > > > interval (unlike with the rebalance delay).
>>>> > > > I also don't think that ability to configure rebalance delay per
>>>> > > > cache is crucial.
>>>> > > >
>>>> > > > > rebalanceOrder is also useless, agreed.
>>>> > > > +1
>>>> > > > Except for one case: we may want to rebalance caches with
>>>> > > > CacheRebalanceMode.SYNC first. But anyway, this behavior doesn't
>>>> > > > require a separate property to be enabled.
>>>> > > >
>>>> > > > On Wed, Feb 12, 2020 at 4:54 PM Alexei Scherbakov <
>>>> > > > alexey.scherbak...@gmail.com> wrote:
>>>> > > >
>>>> > > > > Maxim,
>>>> > > > >
>>>> > > > > rebalanceDelay was introduced before BLT appeared in the
>>>> > > > > product, to solve scenarios which are now solved by BLT.
>>>> > > > >
>>>> > > > > I see no point in keeping it in the product now that BLT has
>>>> > > > > been introduced.
>>>> > > > >
>>>> > > > > I do not think delaying rebalancing per cache group has any
>>>> > > > > meaning. I cannot imagine any reason for it.
>>>> > > > >
>>>> > > > > rebalanceOrder is also useless, agreed.
>>>> > > > >
>>>> > > > >
>>>> > > > >
>>>> > > > >
>>>> > > > > On Wed, 12 Feb 2020 at 16:19, Maxim Muzafarov <mmu...@apache.org>:
>>>> > > > >
>>>> > > > > > Alexey,
>>>> > > > > >
>>>> > > > > > Why do you think delaying historical rebalance (on BLT node
>>>> > > > > > join) for particular cache groups is not a real-world use
>>>> > > > > > case? Perhaps the same topic could be started on the user
>>>> > > > > > list to collect more use cases from real users.
>>>> > > > > >
>>>> > > > > > In general, I support reducing the number of available
>>>> > > > > > rebalance configuration parameters, but we should do it very
>>>> > > > > > carefully.
>>>> > > > > > I can also propose the rebalanceOrder parameter for removal.
>>>> > > > > >
>>>> > > > > > On Wed, 12 Feb 2020 at 15:50, Alexei Scherbakov
>>>> > > > > > <alexey.scherbak...@gmail.com> wrote:
>>>> > > > > > >
>>>> > > > > > > Maxim,
>>>> > > > > > >
>>>> > > > > > > In general, rebalanceDelay is used to delay or disable
>>>> > > > > > > rebalancing when the topology changes.
>>>> > > > > > > Right now we have BLT to avoid unnecessary rebalancing when
>>>> > > > > > > the topology changes.
>>>> > > > > > > If a node leaves the cluster topology, no rebalancing
>>>> > > > > > > happens until the node is explicitly removed from the
>>>> > > > > > > baseline topology.
>>>> > > > > > >
>>>> > > > > > > I would like to know of real-world scenarios which cannot be
>>>> > > > > > > covered by BLT configuration.
>>>> > > > > > >
>>>> > > > > > >
>>>> > > > > > >
>>>> > > > > > > On Wed, 12 Feb 2020 at 15:16, Maxim Muzafarov <mmu...@apache.org>:
>>>> > > > > > >
>>>> > > > > > > > Alexey,
>>>> > > > > > > >
>>>> > > > > > > > > All scenarios where rebalanceDelay has meaning are
>>>> > > > > > > > > handled by baseline topology now.
>>>> > > > > > > >
>>>> > > > > > > > Can you please provide more details here, e.g. the whole
>>>> > > > > > > > list of scenarios where rebalanceDelay is used and how
>>>> > > > > > > > these are handled by baseline topology?
>>>> > > > > > > >
>>>> > > > > > > > Actually, I doubt that it covers exactly all the cases,
>>>> > > > > > > > because rebalanceDelay is a "per cache group property",
>>>> > > > > > > > whereas "baseline" is meaningful for the whole topology.
>>>> > > > > > > >
>>>> > > > > > > > On Wed, 12 Feb 2020 at 12:58, Alexei Scherbakov
>>>> > > > > > > > <alexey.scherbak...@gmail.com> wrote:
>>>> > > > > > > > >
>>>> > > > > > > > > I meant baseline topology.
>>>> > > > > > > > >
>>>> > > > > > > > > On Wed, 12 Feb 2020 at 12:41, Alexei Scherbakov <
>>>> > > > > > > > > alexey.scherbak...@gmail.com>:
>>>> > > > > > > > >
>>>> > > > > > > > > >
>>>> > > > > > > > > > V.Pyatkov
>>>> > > > > > > > > >
>>>> > > > > > > > > > Doesn't rebalance topology solve it?
>>>> > > > > > > > > >
>>>> > > > > > > > > > On Wed, 12 Feb 2020 at 12:31, V.Pyatkov <vldpyat...@gmail.com>:
>>>> > > > > > > > > >
>>>> > > > > > > > > >> Hi,
>>>> > > > > > > > > >>
>>>> > > > > > > > > >> I am sure we can reduce this capability, but not remove
>>>> > > > > > > > > >> it completely. We can use the rebalance delay to disable
>>>> > > > > > > > > >> rebalancing until it is manually triggered:
>>>> > > > > > > > > >>
>>>> > > > > > > > > >> CacheConfiguration#setRebalanceDelay(-1)
>>>> > > > > > > > > >>
>>>> > > > > > > > > >> This may be helpful for clusters that cannot tolerate a
>>>> > > > > > > > > >> performance drop from rebalancing at any time.
>>>> > > > > > > > > >>
>>>> > > > > > > > > >>
>>>> > > > > > > > > >>
>>>> > > > > > > > > >> --
>>>> > > > > > > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>>>> > > > > > > > > >>
>>>> > > > > > > > > >
>>>> > > > > > > > > >
>>>> > > > > > > > > > --
>>>> > > > > > > > > >
>>>> > > > > > > > > > Best regards,
>>>> > > > > > > > > > Alexei Scherbakov
>>>> > > > > > > > > >
>>>> > > > > > > > >
>>>> > > > > > > > >
>>>> > > > > > > > > --
>>>> > > > > > > > >
>>>> > > > > > > > > Best regards,
>>>> > > > > > > > > Alexei Scherbakov
>>>> > > > > > > >
>>>> > > > > > >
>>>> > > > > > >
>>>> > > > > > > --
>>>> > > > > > >
>>>> > > > > > > Best regards,
>>>> > > > > > > Alexei Scherbakov
>>>> > > > > >
>>>> > > > >
>>>> > > > >
>>>> > > > > --
>>>> > > > >
>>>> > > > > Best regards,
>>>> > > > > Alexei Scherbakov
>>>> > > > >
>>>>
>>>
>>>
>>> --
>>>
>>> Best regards,
>>> Alexei Scherbakov
>>>
>>
>>
>> --
>>
>> Best regards,
>> Alexei Scherbakov
>>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>


-- 

Best regards,
Alexei Scherbakov
