> This feature allows users to sync two separate clusters which are having independent global-zookeeper (metadata-store) instances. So, this feature will not be limited to sync a few fields of policies but think in terms of auto creation and syncing namespaces/tenants to entirely different metadata-stores where they can not talk to each other.

Yes, we are on the same page. If this proposal is for syncing tenant/namespace/topic between multiple metadata stores, it looks good to me. If the proposal also wants to sync the namespace policy between multiple metadata stores, I think we should leverage the existing __change_events topic under each namespace.

> Again, #12136 talks about applying policies to topics but this PIP addresses a separate problem where it makes it possible to integrate independent/isolated metadata-stores with each other.

Yes, it's not the same problem, but it is related. A namespace can have geo-replication disabled while it is enabled at the topic level. In that case, if we have multiple metadata stores and the topic is a partitioned topic, don't we also need a way to sync the partitioned-topic metadata to the remote cluster's metadata store?

Thanks,
Penghui

On Wed, Mar 23, 2022 at 3:39 AM Rajan Dhabalia <rdhaba...@apache.org> wrote:

> >> Do we need to provide the ability for users to decide to replicate the ACLs and replication cluster or not?
>
> This feature allows users to sync two separate clusters which are having independent global-zookeeper (metadata-store) instances. So, this feature will not be limited to sync a few fields of policies but think in terms of auto creation and syncing namespaces/tenants to entirely different metadata-stores where they can not talk to each other.
>
> >> BTW, we already supported topic level replication cluster configuration[5], looks like in this case, [5] https://github.com/apache/pulsar/pull/12136
>
> Again, #12136 talks about applying policies to topics but this PIP addresses a separate problem where it makes it possible to integrate independent/isolated metadata-stores with each other.
>
> Thanks,
> Rajan
>
> On Mon, Mar 21, 2022 at 6:29 PM PengHui Li <peng...@apache.org> wrote:
>
> > Thanks for the explanation.
> >
> > > yes, local policies don't need to be replicated to other clusters and it will only replicate global policies which are shared across multiple clusters such as tenant/namespace's identity-creation, ACLs, replication clusters, etc.
> >
> > As described in this blog [1], section "Aggregation Replication". Do we need to provide the ability for users to decide to replicate the ACLs and replication cluster or not?
> >
> > Currently, if users want to achieve "Aggregation Replication", they need multiple configuration stores. So they need to maintain the namespaces and partitioned topics in each cluster. When a new namespace is created in one cluster, they need to create it in the other clusters too if they want to replicate data to those clusters.
> >
> > After this proposal, they don't need to create a namespace for other clusters. Pulsar will help to replicate the configuration store changes to the replicated cluster: if the newly created namespace has replication clusters A, B, and C in cluster A, the namespace will be replicated to B and C.
> >
> > But for the ACLs and replication clusters, should it be controlled by users? e.g. only replicate the namespace to B and C, but not the replication clusters and ACLs. So that we can achieve "Aggregation Replication" with this proposal.
> >
> > > Topic that will be used to share policies across clusters is configurable and it can be named anything. However, we should keep it a separate topic as it requires unique schema and special handling to synchronize policies across the clusters.
> >
> > Yes, looks like currently we already have a mechanism to replicate policies. We have a system topic under the namespace, "__change_events", which only has topic policy changes for now. It can replicate anything under a namespace. We have defined "EventType"[2] in PulsarEvent (the structure used in "__change_events").
> > And we already have an implementation for selective PulsarEvent replication[3], and schema replication[4].
> >
> > So it looks like we can use "__change_events" to replicate namespace policies, and use a new topic which belongs to a system namespace to replicate tenant/namespace's identity-creation and partitioned topic creation?
> >
> > BTW, we already support topic-level replication cluster configuration[5]. It looks like, in this case, the partitioned topic is first created in one cluster without replication clusters; after the replication clusters change, Pulsar will replicate the partitioned topic to the remote cluster. The same mechanism is required for non-partitioned topics (users might have disabled topic auto-creation).
> >
> > [1] https://www.splunk.com/en_us/blog/devops/geo-replication-in-apache-pulsar-part-2-patterns-and-practices.html
> > [2] https://github.com/apache/pulsar/blob/4dcb166e0bfcce7fc85fd8d59a25b881f6f9c6fa/pulsar-common/src/main/java/org/apache/pulsar/common/events/PulsarEvent.java#L36
> > [3] https://github.com/apache/pulsar/wiki/PIP-92%3A-Topic-policy-across-multiple-clusters
> > [4] https://github.com/apache/pulsar/wiki/PIP-88%3A-Replicate-schemas-across-multiple
> > [5] https://github.com/apache/pulsar/pull/12136
> >
> > Regards,
> > Penghui
> >
> > On Tue, Mar 22, 2022 at 6:37 AM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> >
> > > >> If it contains namespace policy replication, There are some policies no need to replicate to another cluster
> > > yes, local policies don't need to be replicated to other clusters and it will only replicate global policies which are shared across multiple clusters such as tenant/namespace's identity-creation, ACLs, replication clusters, etc.
> > >
> > > >> The new partitioned topic also needs to be replicated to the remote cluster?
> > > Yes.
> > >
> > > Topic that will be used to share policies across clusters is configurable and it can be named anything. However, we should keep it a separate topic as it requires unique schema and special handling to synchronize policies across the clusters.
> > >
> > > Thanks,
> > > Rajan
> > >
> > > On Fri, Mar 18, 2022 at 9:12 PM PengHui Li <peng...@apache.org> wrote:
> > >
> > > > Hi Rajan,
> > > >
> > > > Thanks for the great proposal.
> > > >
> > > > Will all the namespace policies be replicated to the remote cluster? I noticed the PIP title mentions policies, but from the `MetadataChangeEvent` it looks like no namespace policies are defined. If it contains namespace policy replication, there are some policies that do not need to be replicated to another cluster, for example, the rate limiter and the max producers/consumers limiter. In https://github.com/apache/pulsar/wiki/PIP-92%3A-Topic-policy-across-multiple-clusters, a --global option was introduced to provide the ability to apply a policy globally or locally.
> > > >
> > > > The new partitioned topic also needs to be replicated to the remote cluster?
> > > >
> > > > Currently, we already have a PulsarEvent struct to define the Pulsar system events. Looks like we can use a unified event definition by PulsarEvent.
> > > >
> > > > Others look good to me.
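
To make the global/local split above concrete, here is a minimal sketch of how a publisher might pick which parts of a namespace change are sent to remote clusters. It is not taken from the PIP: the category names, `GLOBAL_CATEGORIES`, and `fields_to_replicate` are hypothetical and exist only for illustration. The idea is that local policies (rate limits and similar) are never published, and users can opt out of individual global categories such as ACLs.

```python
from typing import Dict, Iterable

# Hypothetical category names, used only for this sketch. The thread lists
# identity-creation, ACLs and replication clusters as global policies.
GLOBAL_CATEGORIES = frozenset({"identity", "acls", "replication_clusters"})


def fields_to_replicate(change: Dict[str, object],
                        enabled: Iterable[str]) -> Dict[str, object]:
    """Return only the parts of a change that should be published remotely.

    A field is kept when it is a global category AND the user enabled it.
    Anything else (for example a local rate limit) is dropped.
    """
    allowed = GLOBAL_CATEGORIES & set(enabled)
    return {name: value for name, value in change.items() if name in allowed}
```

With only `"identity"` enabled, a change that also carries ACLs and a dispatch rate would replicate just the namespace's existence, which is roughly the "Aggregation Replication" case raised in this thread.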
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > > > On Sat, Mar 19, 2022 at 1:32 AM Joe F <joefranc...@gmail.com> wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Thu, Mar 17, 2022 at 12:07 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I would like to start VOTE on PIP-136: https://github.com/apache/pulsar/issues/13728
> > > > > >
> > > > > > Thanks,
> > > > > > Rajan
> > > > > >
> > > > > > On Tue, Feb 8, 2022 at 4:58 PM Rajan Dhabalia <dhabalia...@gmail.com> wrote:
> > > > > >
> > > > > > > >> How do we designate the host broker? Is it manual? How does it work when the host broker is removed from the cluster?
> > > > > > > No, it will not be manual but as I explained earlier a broker which has a failover consumer to consume remote events will be the publisher for metadata update. If that broker is removed then a new failover consumer/broker will be selected for the same.
> > > > > > >
> > > > > > > >> I look forward to seeing more about this design for conflict resolution.
> > > > > > > Sure, I have updated PIP to handle such race condition: https://github.com/apache/pulsar/issues/13728
> > > > > > >
> > > > > > > >> (1) scenarios where the Pulsar cluster operators and tenant admins are different entities and tenants can be malicious, or more probably, write bad code that will produce malicious outcomes.
> > > > > > > I agree, Pulsar should have provision to prevent such scenarios where changes from one tenant in a cluster can impact other clusters. This PIP considers the tenant/admin will be the same at both the ends but that can not be true in all cases.
> > > > > > > We can add an enhancement later, or we can create a separate PIP to start a discussion of the possible solutions.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Rajan
> > > > > > >
> > > > > > > On Thu, Feb 3, 2022 at 9:59 AM Joe F <joefranc...@gmail.com> wrote:
> > > > > > >
> > > > > > > > > On my first reading, it wasn't clear if there was only one topic required for this feature. I now see that the topic is not tied to a specific tenant or namespace. As such, we can avoid complicated authorization questions by putting the required event topic(s) into a "system" tenant and namespace
> > > > > > > >
> > > > > > > > We should consider complicated questions. We can say why we chose not to address it, or why it does not apply for a particular situation.
> > > > > > > >
> > > > > > > > Many namespace policies are administered by tenants. As such, any tenant can load this topic. Is it possible for one abusive tenant to make your system topic dysfunctional?
> > > > > > > >
> > > > > > > > Pulsar committers should think about
> > > > > > > > (1) scenarios where the Pulsar cluster operators and tenant admins are different entities and tenants can be malicious, or more probably, write bad code that will produce malicious outcomes.
> > > > > > > > (2) whether the changes introduce additional SPOFs into the cluster.
> > > > > > > >
> > > > > > > > I don't think this PIP has those issues, but as a matter of practice, I would like to see backend/system PIPs consider these questions and explicitly state the conclusions with rationale.
> > > > > > > >
> > > > > > > > Joe
> > > > > > > >
> > > > > > > > On Wed, Feb 2, 2022 at 9:27 PM Michael Marshall <mmarsh...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Thanks for your responses.
> > > > > > > > >
> > > > > > > > > > I don't see a need of protobuf for this particular usecase
> > > > > > > > >
> > > > > > > > > If no one else feels strongly on this point, I am good with using a POJO.
> > > > > > > > >
> > > > > > > > > > It doesn't matter if it's system-topic or not because it's configurable and the admin of the system can decide and configure it according to the required persistent policy.
> > > > > > > > >
> > > > > > > > > On my first reading, it wasn't clear if there was only one topic required for this feature. I now see that the topic is not tied to a specific tenant or namespace. As such, we can avoid complicated authorization questions by putting the required event topic(s) into a "system" tenant and namespace, by default. The `pulsar/system` tenant and namespace seem appropriate to me.
> > > > > > > > >
> > > > > > > > > > I would keep the system topic separate because this topic serves a specific purpose with specific schema, replication policy and retention policy.
> > > > > > > > >
> > > > > > > > > I think we need a more formal definition for system topics.
> > > > > > > > > This topic is exactly the kind of topic I would call a system topic: its intended producers and consumers are Pulsar components. However, because this feature can live on a topic in a system namespace, we can avoid the classification discussion for this PIP.
> > > > > > > > >
> > > > > > > > > > Source region will have a broker which will create a failover consumer on that topic and a broker with an active consumer will watch the metadata changes and publish the changes to the event topic.
> > > > > > > > >
> > > > > > > > > How do we designate the host broker? Is it manual? How does it work when the host broker is removed from the cluster?
> > > > > > > > >
> > > > > > > > > If we collocate the active consumer with the broker hosting the event topic, can we skip creating the failover consumer?
> > > > > > > > >
> > > > > > > > > > PIP briefly talks about it but I will update the PIP with more explanation.
> > > > > > > > >
> > > > > > > > > I look forward to seeing more about this design for conflict resolution.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Michael
> > > > > > > > >
> > > > > > > > > On Tue, Feb 1, 2022 at 3:01 AM Rajan Dhabalia <dhabalia...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Please find my response inline.
> > > > > > > > > >
> > > > > > > > > > On Mon, Jan 31, 2022 at 9:17 PM Michael Marshall <mmarsh...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > I think this is a very appropriate direction to take Pulsar's geo-replication.
> > > > > > > > > > > Your proposal is essentially to make the inter-cluster configuration event driven. This increases fault tolerance and better decouples clusters.
> > > > > > > > > > >
> > > > > > > > > > > Thank you for your detailed proposal. After reading through it, I have some questions :)
> > > > > > > > > > >
> > > > > > > > > > > 1. What do you think about using protobuf to define the event protocol? I know we already have a topic policy event stream defined with Java POJOs, but since this feature is specifically designed for egressing cloud providers, ensuring compact data transfer would keep egress costs down. Additionally, protobuf can help make it clear that the schema is strict, should evolve thoughtfully, and should be designed to work between clusters of different versions.
> > > > > > > > > >
> > > > > > > > > > >> I don't see a need of protobuf for this particular usecase because of two reasons:
> > > > > > > > > > >> a. policy changes don't generate huge traffic which could be 1 rps
> > > > > > > > > > >> b. and it doesn't need performance optimization.
> > > > > > > > > > >> It should be similar to storing policy in text instead of protobuf, which doesn't impact footprint size or performance due to the limited number of update operations and relatively less complexity. I agree that protobuf could be another option but in this case it's not needed.
> > > > > > > > > > >> Also, POJO can also support schema and versioning.
> > > > > > > > > >
> > > > > > > > > > > 2. In your view, which tenant/namespace will host `metadataSyncEventTopic`? Will there be several of these topics or is it just hosted in a system tenant/namespace? This question gets back to my questions about system topics on this mailing list last week [0]. I view this topic as a system topic, so we'd need to make sure that it has the right authorization rules and that it won't be affected by calls like "clearNamespaceBacklog".
> > > > > > > > > >
> > > > > > > > > > >> It doesn't matter if it's system-topic or not because it's configurable and the admin of the system can decide and configure it according to the required persistent policy. I would keep the system topic separate because this topic serves a specific purpose with specific schema, replication policy and retention policy.
> > > > > > > > > >
> > > > > > > > > > > 3. Which broker will host the metadata update publisher? I assume we want the producer to be collocated with the bundle that hosts the event topic. How will this be coordinated?
> > > > > > > > > >
> > > > > > > > > > >> It's already explained in the PIP in section: "Event publisher and handler"
> > > > > > > > > > >> Every isolated cluster deployed on a separate cloud platform will have a source region and part of replicated clusters for the event topic. The Source region will have a broker which will create a failover consumer on that topic and a broker with an active consumer will watch the metadata changes and publish the changes to the event topic.
> > > > > > > > > >
> > > > > > > > > > > 4. Why isn't a topic a `ResourceType`? Is this because the topic level policies already have this feature? If so, is there a way to integrate this feature with the existing topic policy feature?
> > > > > > > > > >
> > > > > > > > > > >> Yes, ResourceType can be extended to a topic as well.
> > > > > > > > > >
> > > > > > > > > > > 5. By decentralizing the metadata store, it looks like there is a chance for conflicts due to concurrent updates. How do we handle those conflicts?
> > > > > > > > > >
> > > > > > > > > > >> PIP briefly talks about it but I will update the PIP with more explanation. MetadataChangeEvent contains source-cluster and updated time.
> > > > > > > > > > >> Also, resources Tenant/Namespace will also contain lastUpdatedTime, which will help destination clusters to handle stale/duplicate events and race conditions. Also, snapshot-sync, an additional task, helps all clusters to be synced with each other eventually.
> > > > > > > > > >
> > > > > > > > > > > I'll also note that I previously proposed a system event topic here [1] and it was proposed again here [2]. Those features were for different use cases, but ultimately looked very similar. In my view, a stream of system events is a very natural feature to expect in a streaming technology. I wonder if there is a way to generalize this feature to fulfill local cluster consumers and geo-replication consumers. Even if this PIP only implements the geo-replication portion of the feature, it'd be good to design it in an extensible fashion.
> > > > > > > > > >
> > > > > > > > > > >> I think answer (2) addresses this concern as well.
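
For readers following the answer to question 5, here is a minimal sketch of how a destination cluster could use the event's updated time together with the resource's stored lastUpdatedTime to drop stale and duplicate events. It is an illustration under stated assumptions, not the PIP's implementation: the `MetadataChangeEvent` fields shown are simplified, `DestinationStore` is a hypothetical in-memory stand-in for the metadata store, and breaking ties on the source cluster name is an assumption made only so that concurrent updates with equal timestamps resolve the same way everywhere.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class MetadataChangeEvent:
    """Simplified, hypothetical shape of the event discussed in the thread."""
    resource_type: str   # e.g. "TENANT" or "NAMESPACE"
    resource_name: str
    source_cluster: str
    updated_time: int    # epoch millis set by the source cluster
    payload: dict


class DestinationStore:
    """In-memory stand-in for a destination cluster's metadata store."""

    def __init__(self) -> None:
        # (type, name) -> (last_updated_time, source_cluster, payload)
        self._resources: Dict[Tuple[str, str], Tuple[int, str, dict]] = {}

    def apply(self, event: MetadataChangeEvent) -> bool:
        """Apply the event unless it is stale or a duplicate.

        Returns True when the store was updated.
        """
        key = (event.resource_type, event.resource_name)
        current = self._resources.get(key)
        incoming = (event.updated_time, event.source_cluster)
        # Assumption: equal timestamps are ordered by cluster name.
        if current is not None and incoming <= (current[0], current[1]):
            return False
        self._resources[key] = (
            event.updated_time, event.source_cluster, event.payload)
        return True

    def get(self, resource_type: str, resource_name: str) -> Optional[dict]:
        entry = self._resources.get((resource_type, resource_name))
        return None if entry is None else entry[2]
```

Because `apply` is idempotent and order-insensitive for a given resource, replaying the same events (for example during the snapshot-sync task mentioned above) would leave the store unchanged.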
> > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Michael
> > > > > > > > > > >
> > > > > > > > > > > [0] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
> > > > > > > > > > > [1] https://lists.apache.org/thread/h4cbvwjdomktsq2jo66x5qpvhdrqk871
> > > > > > > > > > > [2] https://lists.apache.org/thread/0xkg0gpsobp0dbgb6tp9xq097lpm65bx
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Jan 30, 2022 at 10:33 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > I would like to start a discussion about PIP-136: Sync Pulsar policies across multiple clouds.
> > > > > > > > > > > >
> > > > > > > > > > > > PIP documentation: https://github.com/apache/pulsar/issues/13728
> > > > > > > > > > > >
> > > > > > > > > > > > *Motivation*
> > > > > > > > > > > > Apache Pulsar is a cloud-native, distributed messaging framework which natively provides geo-replication. Many organizations deploy pulsar instances on-prem and on multiple different cloud providers and at the same time they would like to enable replication between multiple clusters deployed in different cloud providers.
> > > > > > > > > > > > Pulsar already provides various proxy options (Pulsar proxy/ enterprise proxy solutions on SNI) to fulfill security requirements when brokers are deployed on different security zones connected with each other. However, sometimes it's not possible to share metadata-store (global zookeeper) between pulsar clusters deployed on separate cloud provider platforms, and synchronizing configuration metadata (policies) can be a critical path to share tenant/namespace/topic policies between clusters and administrate pulsar policies uniformly across all clusters. Therefore, we need a mechanism to sync configuration metadata between clusters deployed on the different cloud platforms.
> > > > > > > > > > > >
> > > > > > > > > > > > *Sync Pulsar policies across multiple clouds*
> > > > > > > > > > > > https://github.com/apache/pulsar/issues/13728
> > > > > > > > > > > > Prototype git-hub-link <https://github.com/rdhabalia/pulsar/commit/e59803b942918076ce6376b50b35ca827a49bcf6>
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Rajan