On Mon, Jun 26, 2023 at 09:21, Yubiao Feng
<yubiao.f...@streamnative.io.invalid> wrote:
>
> Hi Yan, Asaf
>
> > I want to add only one step to your plan.
> > If you introduce this flag in Y.X, then in Y.(X+1),
> > let's remove this flag
> > and keep the "true" value as the behavior.
>
> I agree with Asaf

+1
Enrico

>
> Thanks
> Yubiao Feng
>
> On Mon, Jun 19, 2023 at 9:57 AM horizonzy <horizo...@apache.org> wrote:
>
> > Background
> >
> > Pulsar has two features:
> >
> >    - The first feature allows users to set group and rack information for
> >    bookies using pulsar-admin bookies set-bookie-rack.
> >
> >    Here, users assign bookie1 to bookie5 to the default group and bookie6 to
> >    bookie10 to the share group. They don't care about rack information;
> >    they only care about which group each bookie belongs to.
> >
> >    default={bookie1:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie2:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie3:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie4:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie5:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null)}
> >
> >    _shared_={bookie6:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie7:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie8:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie9:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null), bookie10:3181=BookieInfoImpl(rack=default-rack,
> >    hostname=null)}
> >
> >    - The second feature allows users to set the priority of traffic for a
> >    namespace, where traffic is directed to the primary group first and
> >    then to the secondary group. Users can set this priority using
> >    pulsar-admin ns-isolation-policy set --namespaces public/default
> >    --primary "group" --secondary "group".
> >
> >    Here, users set the primary group of the /public/default namespace to
> >    "share" using a command.
> >
> >    {
> >    "bookkeeperAffinityGroupPrimary" : "share"
> >    }
> >
> >    After this work is completed, all traffic under the /public/default
> >    namespace will be directed to bookie6-10 in the "share" group.
> >
> > Drawbacks
> >
> > After a period of time, users added some new bookies [bk11, bk12, bk13,
> > bk14, bk15] to the bookie cluster and found that some traffic under the
> > /public/default namespace was directed to the newly added machines. After
> > investigation, we eventually found that this was a defect in the working
> > mechanism of bookkeeperAffinityGroupPrimary.
> >
> > *bookkeeperAffinityGroupPrimary work mechanism*
> >
> > All bookies in the cluster: bk1-bk15.
> >
> > Here are the steps the broker follows to pick bookies.
> >
> >    1. Get the bookie rack info config: default: [bk1, bk2, bk3, bk4, bk5];
> >    share: [bk6, bk7, bk8, bk9, bk10]
> >    2. Exclude the bookies which are not in the
> >    bookkeeperAffinityGroupPrimary group (share).
> >    3. Exclude the default group bookies [bk1, bk2, bk3, bk4, bk5].
> >    4. Pick bookies from the remaining bookies [bk6, bk7, bk8, bk9, bk10,
> >    bk11, bk12, bk13, bk14, bk15]
> >
> > Therefore, some traffic may go to bk11-bk15, which is not what the users
> > expect. The reason is that the new bookies, bk11 to bk15, did not have rack
> > information set and were not part of any group.
> >
> > We provided a workaround: users set the rack information for bk11 to
> > bk15 in advance using the command pulsar-admin bookies set-bookie-rack
> > before starting them. After the users adopted this workaround, the traffic
> > worked as expected.
> >
> > For users, this may be a bit inconvenient, as they need to set rack
> > information in advance before bringing new bookies online. In scenarios
> > where there are strict limitations on traffic, if the bookie operation and
> > maintenance personnel overlook this step, it could cause problems.
> >
> > Improvement
> >
> > I would like to introduce a new configuration strict for
> > bookkeeperAffinityGroupPrimary.
> > The default value for this configuration is
> > false, which means that for old users upgrading to the new version, the
> > logic will remain the same and bookies without rack information will not be
> > constrained.
> >
> > If users manually set strict to true using the command pulsar-admin
> > ns-isolation-policy set --namespaces public/default --primary "group"
> > --secondary "group" --strict true, when the broker selects a bookie, it
> > will only choose from the bookies in the primary group. If there are not
> > enough bookies in the primary group, it will choose from the bookies in the
> > secondary group. If there are not enough bookies in either group, an
> > exception will be thrown. Bookies without rack information set using
> > pulsar-admin bookies set-bookie-rack will not be selected.
> >
> > Compatibility
> >
> > When users upgrade from the old version to the new version, the working
> > mechanism of bookkeeperAffinityGroupPrimary remains the same as before.
> > When users upgrade to the new version and set strict to true using the
> > command pulsar-admin ns-isolation-policy set --namespaces public/default
> > --primary "group" --secondary "group" --strict true, and then roll back to
> > the old version, the broker should be able to correctly parse the
> > ns-isolation-policy configuration.
> >
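
To make the difference between the current behavior and the proposed strict
mode concrete, here is a small Python model of the selection steps described
in the proposal. It is only a sketch, not Pulsar or BookKeeper code: the
function name pick_candidates, its parameters, and the RuntimeError are
hypothetical, and the non-strict branch models only the primary-group
exclusion steps listed under "Drawbacks" (it ignores the secondary group).

```python
from typing import Dict, List, Optional, Set


def pick_candidates(
    all_bookies: List[str],
    groups: Dict[str, Set[str]],
    primary: str,
    secondary: Optional[str],
    ensemble_size: int,
    strict: bool = False,
) -> List[str]:
    """Return the bookies a namespace may write to (illustrative model only)."""
    primary_members = groups.get(primary, set())

    if not strict:
        # Behavior described in the thread: exclude every bookie registered
        # in a group other than the primary one. Bookies with no rack info
        # belong to no group, so they are never excluded.
        excluded: Set[str] = set()
        for name, members in groups.items():
            if name != primary:
                excluded |= members
        return [b for b in all_bookies if b not in excluded]

    # Proposed strict behavior: only explicitly grouped bookies are eligible.
    candidates = [b for b in all_bookies if b in primary_members]
    if len(candidates) < ensemble_size and secondary is not None:
        secondary_members = groups.get(secondary, set())
        candidates += [
            b for b in all_bookies
            if b in secondary_members and b not in primary_members
        ]
    if len(candidates) < ensemble_size:
        # Hypothetical error type; the proposal only says "an exception".
        raise RuntimeError("not enough bookies in primary and secondary groups")
    return candidates


# Cluster from the example: bk1-bk5 in "default", bk6-bk10 in "share",
# bk11-bk15 newly added without rack information.
bookies = [f"bk{i}" for i in range(1, 16)]
groups = {
    "default": {f"bk{i}" for i in range(1, 6)},
    "share": {f"bk{i}" for i in range(6, 11)},
}

legacy = pick_candidates(bookies, groups, "share", None, 3)
strict = pick_candidates(bookies, groups, "share", None, 3, strict=True)
```

In this model, legacy contains bk6 through bk15 (the ungrouped bookies leak
in), while strict contains only bk6 through bk10.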