Re: Re:Re: [DISCUSS] PIP-255: Assign topic partitions to bundle by round robin

Heesung Sohn Thu, 11 May 2023 12:59:11 -0700

Hi,

As pip-192(load balancer extension) has been added in pulsar-3.0,
could you also clarify how this strategy will be compatible with the load
balancer extension?
From my understanding, this partition assignment strategy can also be
configurable in the load balancer extension. Can you confirm?


Thanks,
Heesung

On Mon, Apr 24, 2023 at 8:43 AM Yunze Xu <[email protected]>
wrote:

> This proposal is easier to understand than before. Overall LGTM. But I
> think these `onBundleXXX` methods could be default so that we can
> implement it with a simple lambda.
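
A minimal sketch of the suggestion above, assuming a hypothetical assigner interface: if the bundle lifecycle callbacks are `default` no-ops, a single abstract method remains and the interface can be implemented with a lambda. The interface name, `selectBundle`, `onBundleSplit`, and `onBundleUnload` are illustrative assumptions, not the names used in the PIP.

```java
// Hypothetical sketch; none of these names come from the PIP or from Pulsar.
class LambdaAssignerDemo {

    @FunctionalInterface
    interface PartitionAssigner {
        // The only abstract method, so a lambda can implement the interface.
        int selectBundle(String topic, int numBundles);

        // Hypothetical lifecycle hooks. They default to no-ops, so
        // implementations override only the ones they care about.
        default void onBundleSplit(String bundle) { }

        default void onBundleUnload(String bundle) { }
    }

    // A hash-based assigner written as a one-line lambda.
    static PartitionAssigner hashing() {
        return (topic, numBundles) -> Math.floorMod(topic.hashCode(), numBundles);
    }

    public static void main(String[] args) {
        PartitionAssigner assigner = hashing();
        assigner.onBundleSplit("example-bundle"); // inherited no-op
        System.out.println(assigner.selectBundle("persistent://public/default/orders", 4));
    }
}
```

Without the `default` bodies, every implementation would need a full class that spells out each callback, even when it has nothing to do in them.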
>
> Thanks,
> Yunze
>
> On Wed, Apr 19, 2023 at 10:22 AM Lin Lin <[email protected]> wrote:
> >
> > We make this configuration item a dynamic configuration.
> > We can change it at the broker level.
> > If it could be changed at the namespace level, then even if the bundle load
> > within some namespace were balanced, it would still be difficult to balance
> > the brokers.
> >
> > On 2023/04/16 16:07:45 lifepuzzlefun wrote:
> > > I think this feature is very helpful for heavy-traffic topics that have a
> > > continuous, stable load on each partition.
> > >
> > >
> > > Is there a way to set the PartitionAssigner plugin through some kind of
> > > namespace policy? I hope this can be set at the namespace level. If that
> > > can be achieved, it would be easier to try this feature in a
> > > production environment. :-)
> > >
> > > At 2023-04-12 11:24:11, "Lin Lin" <[email protected]> wrote:
> > > >As I mentioned in the implementation section of the PIP, we will make the
> > > >partition assignment strategy pluggable.
> > > >
> > > >However, in the same cluster, it is impossible for some Brokers to
> use consistent hashing and some Brokers to use round robin.
> > > >
> > > >On 2023/04/11 07:37:19 Xiangying Meng wrote:
> > > >> Hi Linlin,
> > > >> > This is an incompatible modification, so the entire cluster needs
> to be
> > > >> upgraded, not just a part of the nodes
> > > >>
> > > >> Appreciate your contribution to the new feature in PIP-255.
> > > >>  I have a question regarding the load-balancing aspect of this
> feature.
> > > >>
> > > >> You mentioned that this is an incompatible modification,
> > > >> and the entire cluster needs to be upgraded, not just a part of the
> nodes.
> > > >>  I was wondering why we can only have one load-balancing strategy.
> > > >> Would it be possible to abstract the logic here and make it an
> optional
> > > >> choice?
> > > >> This way, we could have multiple load-balancing strategies,
> > > >> such as hash-based, round-robin, etc., available for users to
> choose from.
> > > >>
> > > >> I'd love to hear your thoughts on this.
> > > >>
> > > >> Best regards,
> > > >> Xiangying
> > > >>
> > > >> On Mon, Apr 10, 2023 at 8:23 PM PengHui Li <[email protected]>
> wrote:
> > > >>
> > > >> > Hi Lin,
> > > >> >
> > > >> > > The load managed by each Bundle is not even. Even if the number
> of
> > > >> > partitions managed
> > > >> >    by each bundle is the same, there is no guarantee that the sum
> of the
> > > >> > loads of these partitions
> > > >> >    will be the same.
> > > >> >
> > > >> > Do we expect that the bundles should have the same loads? The bundle is the
> > > >> > base unit of the
> > > >> > load balancer; we can set the high watermark of the bundle, e.g., the
> > > >> > maximum topics and throughput.
> > > >> > But bundles can have different real loads, and if one bundle exceeds
> > > >> > the high watermark, the bundle
> > > >> > will be split. Users can tune the high watermark to distribute the loads
> > > >> > evenly across brokers.
> > > >> >
> > > >> > For example, there are 4 bundles with loads 1, 3, 2, 4, the maximum load of
> > > >> > a bundle is 5, and there are 2 brokers.
> > > >> > We can assign bundle 0 and bundle 3 to broker-0 and bundle 1 and bundle 2
> > > >> > to broker-1.
> > > >> >
> > > >> > Of course, this is the ideal situation. Suppose bundle 0 has already been
> > > >> > assigned to broker-0 and bundle 1 to broker-1. Now, bundle 2 will go to
> > > >> > broker-0, and bundle 3 will go to broker-1. The loads for the two
> > > >> > brokers are then 3 and 7. Dynamic programming can help to find an optimized
> > > >> > solution with more bundle unloads.
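
The arithmetic in this example can be checked with a short sketch. It assumes, hypothetically, that each unassigned bundle is placed on whichever broker is least loaded at that moment, with ties going to the higher broker index. The class and method names are illustrative and are not part of Pulsar's load manager.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Pulsar's load manager code.
class GreedyBundlePlacement {

    // preassigned[i] is the broker that already owns bundle i, or -1 if the
    // bundle is still unassigned. Unassigned bundles are placed in index order
    // on the least-loaded broker (ties go to the higher broker index).
    // Returns the resulting total load per broker.
    static int[] place(int[] bundleLoads, int[] preassigned, int brokers) {
        int[] brokerLoads = new int[brokers];
        for (int i = 0; i < bundleLoads.length; i++) {
            if (preassigned[i] >= 0) {
                brokerLoads[preassigned[i]] += bundleLoads[i];
            }
        }
        for (int i = 0; i < bundleLoads.length; i++) {
            if (preassigned[i] >= 0) {
                continue;
            }
            int target = 0;
            for (int b = 1; b < brokers; b++) {
                if (brokerLoads[b] <= brokerLoads[target]) {
                    target = b;
                }
            }
            brokerLoads[target] += bundleLoads[i];
        }
        return brokerLoads;
    }

    public static void main(String[] args) {
        int[] loads = {1, 3, 2, 4};
        // Bundles 0 and 1 already placed; bundles 2 and 3 placed greedily.
        System.out.println(Arrays.toString(place(loads, new int[] {0, 1, -1, -1}, 2))); // [3, 7]
        // The ideal placement: bundles 0 and 3 on broker-0, bundles 1 and 2 on broker-1.
        System.out.println(Arrays.toString(place(loads, new int[] {0, 1, 1, 0}, 2)));   // [5, 5]
    }
}
```

Getting from the 3/7 split to the 5/5 split requires moving bundles that are already placed, which is where the extra unloads come from.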
> > > >> >
> > > >> > So, should we design the bundle to have even loads? It is
> difficult to
> > > >> > achieve in reality. And the proposal
> > > >> > said, "Let each bundle carry the same load as possible". Is it
> the correct
> > > >> > direction for the load balancer?
> > > >> >
> > > >> > > Doesn't shed loads very well. The existing default policy
> > > >> > ThresholdShedder has a relatively high usage
> > > >> >    threshold, and various traffic thresholds need to be set. Many
> clusters
> > > >> > with high TPS and small message
> > > >> >    bodies may have high CPU but low traffic; And for many
> small-scale
> > > >> > clusters, the threshold needs to be
> > > >> >    modified according to the actual business.
> > > >> >
> > > >> > Can it be resolved by introducing the entry write/read rate to
> the bundle
> > > >> > stats?
> > > >> >
> > > >> > > The removed Bundle cannot be well distributed to other Brokers.
> The load
> > > >> > information of each Broker
> > > >> >    will be reported at regular intervals, so the judgment of the
> Leader
> > > >> > Broker when allocating Bundles cannot
> > > >> >    be guaranteed to be completely correct. Secondly, if there are
> a large
> > > >> > number of Bundles to be redistributed,
> > > >> >    the Leader may make the low-load Broker a new high-load node
> when the
> > > >> > load information is not up-to-date.
> > > >> >
> > > >> > Can we try to force-sync the load data of the brokers before
> performing the
> > > >> > distribution of a large number of
> > > >> > bundles?
> > > >> >
> > > >> > The Goal section in the proposal doesn't appear to map to the issues
> > > >> > mentioned in the Motivation section.
> > > >> > IMO, the proposal should clearly describe the Goal: which problems will
> > > >> > be resolved by this proposal (all 3 of the issues above, or only some of
> > > >> > them), what the high-level solution is, and what its pros and cons are
> > > >> > compared with the existing solution, without diving into the
> > > >> > implementation section.
> > > >> >
> > > >> > Another consideration is that the default max bundles of a namespace is 128.
> > > >> > I don't think it is common to need 128 partitions for a topic. If the number
> > > >> > of partitions is less than the bundle count, will the new solution basically
> > > >> > be equivalent to the current way?
> > > >> >
> > > >> > If this is not a general solution for common scenarios, I support making
> > > >> > the topic-bundle assigner pluggable without
> > > >> > introducing the implementation to the Pulsar repo. Users can implement
> > > >> > their own assigner based on their business
> > > >> > requirements. Pulsar's general solution may not be good for all scenarios,
> > > >> > but it is better for scalability (bundle split)
> > > >> > and enough for most common scenarios. We can keep improving the general
> > > >> > solution for the general requirements
> > > >> > of the most common scenarios.
> > > >> >
> > > >> > Regards,
> > > >> > Penghui
> > > >> >
> > > >> >
> > > >> > On Wed, Mar 22, 2023 at 9:52 AM Lin Lin <[email protected]>
> wrote:
> > > >> >
> > > >> > >
> > > >> > > > This appears to be the "round-robin topic-to-bundle mapping" option in
> > > >> > > > the `findBundle` function. Is this the only place that needs an update?
> > > >> > > > Can you list what changes are required?
> > > >> > >
> > > >> > > In this PIP, we only discuss topic-to-bundle mapping.
> > > >> > > Required changes:
> > > >> > > 1)
> > > >> > > During lookup, partitions are assigned to bundles:
> > > >> > > Lookup -> NamespaceService#getBrokerServiceUrlAsync ->
> > > >> > > NamespaceService#getBundleAsync ->
> > > >> > > NamespaceBundles#findBundle
> > > >> > > Consistent hashing is currently used to assign partitions to bundles in
> > > >> > > NamespaceBundles#findBundle.
> > > >> > > We should add a configuration item partitionAssignerClassName, so that
> > > >> > > different partition assignment algorithms can be dynamically configured.
> > > >> > > The existing algorithm will be used as the default
> > > >> > > (partitionAssignerClassName=ConsistentHashingPartitionAssigner).
> > > >> > > 2)
> > > >> > > Implement a new partition assignment class, RoundRobinPartitionAssigner.
> > > >> > > The new partition assignment logic will be implemented in this class.
> > > >> > >
> > > >> > > > How do we enable this "round-robin topic-to-bundle mapping
> option" (by
> > > >> > > > namespace policy and broker.conf)?
> > > >> > >
> > > >> > > In broker.conf, a new option called `partitionAssignerClassName`
> > > >> > >
> > > >> > > > Can we apply this option to existing namespaces? (what's the
> admin
> > > >> > > > operation to enable this option)?
> > > >> > >
> > > >> > > The cluster must ensure that all nodes use the same algorithm.
> > > >> > > Broker-level configuration can take effect through a restart or through
> > > >> > > the admin API BrokersBase#updateDynamicConfiguration.
> > > >> > >
> > > >> > > > I assume the "round-robin topic-to-bundle mapping option" works with a
> > > >> > > > single partitioned topic, because other topics might show different load
> > > >> > > > per partition. Is this the intention? (so users need to ensure not to put
> > > >> > > > other topics in the namespace, if this option is configured)
> > > >> > >
> > > >> > > For single-partition topics, the starting bundle is determined
> > > >> > > using a consistent hash.
> > > >> > > Therefore, single-partition topics will spread out across different bundles
> > > >> > > as much as possible.
> > > >> > > For high-load single-partition topics, the current algorithm cannot solve
> > > >> > > this problem.
> > > >> > > This PIP cannot solve this problem either.
> > > >> > > If it is just a low-load single-partition topic, the impact on the entire
> > > >> > > bundle is very small.
> > > >> > > However, in real scenarios, high-load businesses will share the load
> > > >> > > across multiple partitions.
> > > >> > >
> > > >> > > > Some brokers might have more bundles than other brokers. Do
> we have
> > > >> > > > different logic for bundle balancing across brokers? or do we
> rely on
> > > >> > the
> > > >> > > > existing assign/unload/split logic to balance bundles among
> brokers?
> > > >> > >
> > > >> > > This PIP does not involve the mapping between bundles and brokers;
> > > >> > > the existing algorithm works well with this PIP.
> > > >> > > However, we will also contribute our mapping algorithm in a subsequent
> > > >> > > PIP.
> > > >> > > For example: bundles under the same namespace can be assigned to brokers
> > > >> > > in a round-robin manner.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
>
