This proposal only considers using non-persistent topics to sync bundle load data; the historical load data will still be written to ZooKeeper. I think we can draft a separate proposal for persisting historical load data to a system topic.
Thanks,
Kai

On Mar 18, 2022, 02:54 +0800, Joe F <joefranc...@gmail.com>, wrote:
> IIRC, there is a historical load profile for each topic that feeds into
> decisions by the load balancer.
>
> What happens during a cluster startup with this new proposal?
>
> On Thu, Mar 17, 2022 at 7:50 AM PengHui Li <peng...@apache.org> wrote:
>
> > > But which brokers will own that topic?
> > > In a Pulsar cluster with a high level of tenant isolation, we must
> > > ensure that:
> > > - at least one broker is allowed to own the topic
> > > - brokers dedicated to tenants do not own the topic
> > > With the current approach the data is on ZooKeeper, and it is shared
> > > among all the brokers.
> >
> > We have the "pulsar/system" namespace, which can be used to maintain
> > system topics. If users consider broker isolation, it's all transparent.
> >
> > Using a topic, we can also share the data among all brokers.
> > Whoever wants a copy of the data only needs to create a reader when
> > starting. We have also introduced table view, which will make it easier
> > to cache the load data and perform the load cache update.
> >
> > > Another point:
> > > will users be allowed to produce/consume this topic? How do we deal
> > > with permissions?
> >
> > Good point. We should not allow users' producers/consumers, and only
> > the super user should be able to access the system topic.
> >
> > Thanks,
> > Penghui
> >
> > On Thu, Mar 17, 2022 at 10:08 PM Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > On Thu, Mar 17, 2022 at 02:42, PengHui Li <peng...@apache.org> wrote:
> > > >
> > > > > we do not know
> > > > > anything about the availability of the owner of the topic.
> > > >
> > > > If the owner broker is not available, other brokers will take over.
> > > >
> > > > > We could make it simpler: when a broker wants to push its data, it
> > > > > looks up the REST address of the "leader broker" and then pushes
> > > > > the data to it, I mean, without involving a "topic"
> > > >
> > > > Any broker may become the leader broker. In this case, the brokers
> > > > need to know all the addresses of the brokers in the cluster. With
> > > > the topic approach, they only need to know the topic name.
> > >
> > > I thought about this a little more.
> > > Using a non-persistent topic makes sense, so I am closer to being
> > > convinced about this move.
> > >
> > > But which brokers will own that topic?
> > > In a Pulsar cluster with a high level of tenant isolation, we must
> > > ensure that:
> > > - at least one broker is allowed to own the topic
> > > - brokers dedicated to tenants do not own the topic
> > > With the current approach the data is on ZooKeeper, and it is shared
> > > among all the brokers.
> > >
> > > Another point:
> > > will users be allowed to produce/consume this topic? How do we deal
> > > with permissions?
> > >
> > > Enrico
> > >
> > > >
> > > > Penghui
> > > >
> > > > On Thu, Mar 17, 2022 at 12:35 AM Enrico Olivelli <eolive...@gmail.com>
> > > > wrote:
> > > >
> > > > > But in order to read from a topic you need a broker that is the
> > > > > owner of the special "temporary topic".
> > > > >
> > > > > While the metadata service (ZooKeeper) is already a central point
> > > > > and it is meant to be available (otherwise Pulsar doesn't work),
> > > > > we do not know anything about the availability of the owner of
> > > > > the topic.
> > > > >
> > > > > Or do you mean to create a special topic that is always owned by the
> > > > > "leader broker"?
> > > > >
> > > > > We could make it simpler: when a broker wants to push its data, it
> > > > > looks up the REST address of the "leader broker" and then pushes
> > > > > the data to it, I mean, without involving a "topic".
> > > > >
> > > > > Enrico
> > > > >
> > > > > On Wed, Mar 16, 2022, 12:55 PengHui Li <peng...@apache.org> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > The load data doesn't need to be persisted to the storage layer,
> > > > > > so using a non-persistent topic is more efficient.
> > > > > >
> > > > > > Thanks,
> > > > > > Penghui
> > > > > >
> > > > > > On Wed, Mar 16, 2022 at 2:14 PM Kai Wang
> > > > > > <kw...@streamnative.io.invalid> wrote:
> > > > > >
> > > > > > > Hi Pulsar Community,
> > > > > > >
> > > > > > > Currently, the Pulsar LoadManager uses ZooKeeper to store the
> > > > > > > local broker data: the LoadReportUpdaterTask reports the local
> > > > > > > load data to ZooKeeper, and the leader broker collects the load
> > > > > > > data and stores it in ZooKeeper.
> > > > > > >
> > > > > > > When we have a lot of brokers and bundles, this load data puts
> > > > > > > some pressure on ZooKeeper.
> > > > > > >
> > > > > > > Since the load data does not need to be strongly consistent, we
> > > > > > > can use non-persistent topics to sync it. This will also reduce
> > > > > > > our dependence on ZooKeeper.
> > > > > > >
> > > > > > > If this proposal is acceptable, I will draft a PIP.
> > > > > > >
> > > > > > > Any suggestions are appreciated.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Kai