I think this is a very appropriate direction to take Pulsar's geo-replication. Your proposal essentially makes the inter-cluster configuration event-driven, which increases fault tolerance and better decouples the clusters.
Thank you for your detailed proposal. After reading through it, I have some questions :)

1. What do you think about using protobuf to define the event protocol? I know we already have a topic policy event stream defined with Java POJOs, but since this feature is specifically designed for replication across cloud providers, where egress traffic is billed, compact data transfer would keep egress costs down. Additionally, protobuf helps make it clear that the schema is strict, should evolve thoughtfully, and should be designed to work between clusters running different versions.

2. In your view, which tenant/namespace will host `metadataSyncEventTopic`? Will there be several of these topics, or is it hosted in a single system tenant/namespace? This gets back to my questions about system topics on this mailing list last week [0]. I view this topic as a system topic, so we'd need to make sure that it has the right authorization rules and that it won't be affected by calls like "clearNamespaceBacklog".

3. Which broker will host the metadata update publisher? I assume we want the producer to be collocated with the bundle that hosts the event topic. How will this be coordinated?

4. Why isn't a topic a `ResourceType`? Is this because topic-level policies already have this feature? If so, is there a way to integrate this feature with the existing topic policy feature?

5. By decentralizing the metadata store, it looks like there is a chance for conflicts due to concurrent updates. How do we handle those conflicts?

I'll also note that I previously proposed a system event topic here [1], and it was proposed again here [2]. Those features were for different use cases, but ultimately they looked very similar. In my view, a stream of system events is a very natural feature to expect in a streaming technology. I wonder if there is a way to generalize this feature to serve both local cluster consumers and geo-replication consumers.
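To make question 1 concrete, here is a rough sketch of what a protobuf-defined sync event might look like. All names and fields here are purely illustrative assumptions on my part, not part of the PIP:

```protobuf
syntax = "proto3";

// Illustrative sketch only -- field names, types, and numbering are
// assumptions, not a concrete proposal.
enum ResourceType {
  TENANT = 0;
  NAMESPACE = 1;
}

enum EventType {
  UPDATE = 0;
  DELETE = 1;
}

message MetadataSyncEvent {
  ResourceType resource_type = 1;
  EventType event_type = 2;
  string resource_path = 3;    // e.g. "my-tenant/my-namespace"
  string source_cluster = 4;   // cluster that originated the change
  bytes policy_payload = 5;    // serialized policy data
  uint64 update_time_ms = 6;   // could feed into conflict resolution (question 5)
}
```

A strict, explicitly numbered schema like this would also force us to think about compatibility whenever a field is added or retired, which matters when clusters on different Pulsar versions consume the same event stream.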
Even if this PIP only implements the geo-replication portion of the feature, it'd be good to design it in an extensible fashion.

Thanks,
Michael

[0] https://lists.apache.org/thread/pj4n4wzm3do8nkc52l7g7obh0sktzm17
[1] https://lists.apache.org/thread/h4cbvwjdomktsq2jo66x5qpvhdrqk871
[2] https://lists.apache.org/thread/0xkg0gpsobp0dbgb6tp9xq097lpm65bx

On Sun, Jan 30, 2022 at 10:33 PM Rajan Dhabalia <rdhaba...@apache.org> wrote:
>
> Hi,
>
> I would like to start a discussion about PIP-136: Sync Pulsar policies
> across multiple clouds.
>
> PIP documentation: https://github.com/apache/pulsar/issues/13728
>
> *Motivation*
> Apache Pulsar is a cloud-native, distributed messaging framework which
> natively provides geo-replication. Many organizations deploy pulsar
> instances on-prem and on multiple different cloud providers, and at the
> same time they would like to enable replication between multiple clusters
> deployed in different cloud providers. Pulsar already provides various
> proxy options (Pulsar proxy / enterprise proxy solutions on SNI) to
> fulfill security requirements when brokers are deployed in different
> security zones connected with each other. However, sometimes it's not
> possible to share the metadata store (global ZooKeeper) between pulsar
> clusters deployed on separate cloud provider platforms, and synchronizing
> configuration metadata (policies) can be a critical path to share
> tenant/namespace/topic policies between clusters and administrate pulsar
> policies uniformly across all clusters. Therefore, we need a mechanism to
> sync configuration metadata between clusters deployed on different cloud
> platforms.
>
> *Sync Pulsar policies across multiple clouds*
> https://github.com/apache/pulsar/issues/13728
> Prototype: <https://github.com/rdhabalia/pulsar/commit/e59803b942918076ce6376b50b35ca827a49bcf6>
>
> Thanks,
> Rajan