This proposal only considers using non-persistent topics to sync bundle load
data. The historical load data will still be written to ZooKeeper. I think we
can draft another proposal for persisting historical load data to a system topic.
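
To make the split concrete, here is a minimal in-memory sketch in Python. It is
a toy model under stated assumptions, not Pulsar code: NonPersistentTopic,
Broker, report_load and historical_load are hypothetical names, and a plain
dict stands in for ZooKeeper. It only illustrates that live load reports reach
the brokers subscribed at publish time and are never replayed, while historical
data remains readable from the metadata store at startup.

```python
class NonPersistentTopic:
    """Toy stand-in for a non-persistent topic (hypothetical, not a Pulsar API).

    Messages are delivered only to subscribers connected at publish time
    and are never stored for later replay.
    """

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, key, value):
        for callback in list(self._subscribers):
            callback(key, value)


class Broker:
    """Toy broker that keeps the latest live load report per broker in memory."""

    def __init__(self, name, topic, metadata_store):
        self.name = name
        self.topic = topic
        # Assumption: a plain dict stands in for ZooKeeper here.
        self.metadata_store = metadata_store
        # Latest live load per broker; lost on restart, refilled by new reports.
        self.live_load = {}
        topic.subscribe(self._on_load_report)

    def _on_load_report(self, broker_name, load):
        # Keep only the most recent report for each broker.
        self.live_load[broker_name] = load

    def report_load(self, load):
        # Fire-and-forget: nothing is written to the metadata store.
        self.topic.publish(self.name, load)

    def historical_load(self):
        # Historical data is still read from the metadata store.
        return dict(self.metadata_store)
```

In this model a broker that starts late has an empty live_load until the next
report arrives, but it can still read the historical data from the store.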

Thanks,
Kai
On Mar 18, 2022, 02:54 +0800, Joe F <joefranc...@gmail.com>, wrote:
> IIRC, there is a historical load profile for topics that feeds into
> decisions by the load balancer.
>
> What happens during a cluster startup, with this new proposal?
>
>
>
>
> On Thu, Mar 17, 2022 at 7:50 AM PengHui Li <peng...@apache.org> wrote:
>
> > > But which brokers will own that topic?
> > > In a Pulsar cluster with a high level of isolation of tenants, we must
> > > ensure that:
> > > - at least one broker is allowed to own the topic
> > > - brokers dedicated to tenants do not own the topic
> > > With the current approach the data is on ZooKeeper, and this is shared
> > > among all the brokers
> >
> > We have the "pulsar/system" namespace, which can be used to maintain
> > system topics. If users use broker isolation, it's all transparent.
> >
> > Using a topic, we can also share the data among all brokers.
> > Any broker that wants a copy of the data only needs to create a reader when
> > starting. And we have introduced table view, which will make it easier to
> > cache the load data and perform the load cache update.
> >
> > > Another point:
> > > will users be allowed to produce/consume this topic? How do we deal
> > > with permissions?
> >
> > Good point. We should not allow users' producers/consumers, and only
> > the superuser should be able to access the system topic.
> >
> > Thanks,
> > Penghui
> >
> > On Thu, Mar 17, 2022 at 10:08 PM Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > On Thu, Mar 17, 2022 at 02:42 PengHui Li
> > > <peng...@apache.org> wrote:
> > > >
> > > > > > we do not know
> > > > > > anything about the availability of the owner of the topic.
> > > >
> > > > If the owner broker is not available, other brokers will take over.
> > > >
> > > > > > We could make it simpler and when a broker wants to push its data, it
> > > > > > looks up the REST address of the "leader broker" and then pushes the
> > > > > > data to it, I mean, without involving a "topic"
> > > >
> > > > > Any broker may become the leader broker. In this case, the brokers need
> > > > > to know all the addresses of the brokers in the cluster. With the topic
> > > > > approach, they only need to know the topic name.
> > >
> > > I thought about this a little more.
> > > Using a non-persistent topic makes sense. So I am closer to being
> > > convinced about this move.
> > >
> > > But which brokers will own that topic?
> > > In a Pulsar cluster with a high level of isolation of tenants, we must
> > > ensure that:
> > > - at least one broker is allowed to own the topic
> > > - brokers dedicated to tenants do not own the topic
> > > With the current approach the data is on ZooKeeper, and this is shared
> > > among all the brokers
> > >
> > > Another point:
> > > will users be allowed to produce/consume this topic? How do we deal
> > > with permissions?
> > >
> > >
> > > Enrico
> > >
> > > >
> > > > Penghui
> > > >
> > > > On Thu, Mar 17, 2022 at 12:35 AM Enrico Olivelli <eolive...@gmail.com>
> > > > wrote:
> > > >
> > > > > > But in order to read from a topic you need a broker that is the owner
> > > > > > of the special "temporary topic".
> > > > >
> > > > > > While the metadata service (ZooKeeper) is already a central point and
> > > > > > it is meant to be available (otherwise Pulsar doesn't work), we do not
> > > > > > know anything about the availability of the owner of the topic.
> > > > >
> > > > > Or do you mean to create a special topic that is always owned by the
> > > > > "leader broker" ?
> > > > >
> > > > > > We could make it simpler and when a broker wants to push its data, it
> > > > > > looks up the REST address of the "leader broker" and then pushes the
> > > > > > data to it, I mean, without involving a "topic".
> > > > >
> > > > >
> > > > > Enrico
> > > > >
> > > > >
> > > > >
> > > > > > On Wed, Mar 16, 2022, 12:55 PengHui Li <peng...@apache.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > > The load data doesn't need to be persisted to the storage layer.
> > > > > > > Using a non-persistent topic is more efficient.
> > > > > >
> > > > > > Thanks,
> > > > > > Penghui
> > > > > >
> > > > > > > On Wed, Mar 16, 2022 at 2:14 PM Kai Wang
> > > > > > > <kw...@streamnative.io.invalid> wrote:
> > > > > >
> > > > > > > Hi Pulsar Community,
> > > > > > >
> > > > > > > > Currently, the Pulsar LoadManager uses ZooKeeper to store the
> > > > > > > > local broker data. The LoadReportUpdaterTask reports the local
> > > > > > > > load data to ZooKeeper, and the leader broker collects the load
> > > > > > > > data and stores it in ZooKeeper.
> > > > > > > >
> > > > > > > > When we have a lot of brokers and bundles, this load data puts
> > > > > > > > some pressure on ZooKeeper.
> > > > > > > >
> > > > > > > > Since the load data does not need to be strongly consistent, we
> > > > > > > > can use non-persistent topics to sync the load data. This will
> > > > > > > > also reduce our dependence on ZooKeeper.
> > > > > > >
> > > > > > > If this proposal is acceptable, I will draft a PIP.
> > > > > > >
> > > > > > > Any suggestions are appreciated.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Kai
> > > > > > >
> > > > > >
> > > > >
> > >
> >
