Hi Xiangying, I think we can rename this PIP to: *Introduce `allowed-clusters` and `topic-policy-synchronized-clusters` to fully support replication on message and topic level* Currently, we can set replication clusters on the message and topic level, but the replication clusters should be a subset of the namespace replication clusters. which means : If we set namespace replication clusters: cluster1, cluster2, cluster3, at most, these three or two clusters can be set on message or topic. If the user wanna set cluster4 or others, the replication can't work as expected. It's easy to reproduce by this test:
> @Test public void testEnableReplicationInTopicLevel() throws Exception { // 1. Create namespace and topic String namespace = > BrokerTestUtil.newUniqueName("pulsar/testEnableReplicationInTopicLevel"); String topic1 = "persistent://" + namespace + "/topic-1"; admin1.namespaces().createNamespace(namespace); admin1.topics().createNonPartitionedTopic(topic1); // 2. Configure replication clusters for the topic. admin1.topics().setReplicationClusters(topic1, List.of("r1", "r2")); // 3. Check if the replicator connected successfully. Awaitility.await().atMost(5, TimeUnit.MINUTES).untilAsserted(() -> { List<String> keys = pulsar1.getBrokerService() .getTopic(topic1, false).get().get() .getReplicators().keys(); assertEquals(keys.size(), 1); assertTrue(pulsar1.getBrokerService() .getTopic(topic1, false).get().get() .getReplicators().get(keys.get(0)).isConnected()); }); } To fully support the replication, we find out an easy way to solve it. Introduce `allowed-clusters` on namespace policies, which Xiangying explains above. How could this work and solve the issue? The same example : If we set namespace replication clusters: cluster1, cluster2, cluster3, and set topic1 replication clusters: cluster2, cluster4. set topic2 replication clusters: cluster1, cluster4. We must set `allowed-clusters` with cluster1, cluster2, cluster3, and cluster4. The broker side will validate the topic or message replication clusters from the `allowed-cluster.` In this way, we can simplify more codes and logic here. For *`topic-policy-synchronized-clusters` *we also add examples in the PIP. Hope the explanation could help @Rajan @Girish Regards Jiwei Guo (Tboy) On Thu, Dec 7, 2023 at 10:29 PM Xiangying Meng <xiangy...@apache.org> wrote: > Hi Girish, > > I'm very pleased that we have reached some consensus now. Pulsar already > supports geo-replication at the topic level, but the existing > implementation of this topic level replication does not match our > expectations. > > At the moment, I can think of three directions to solve this problem: > > 1. Treat this issue as a bug and fix it so that Pulsar can truly support > replication at the topic level. > 2. Limit the replication topic policy, so that the replication clusters at > the topic level must be included in the replication clusters configured at > the namespace level. In this case, the topic level replication would serve > as a supplement to the namespace replication, rather than a true topic > level policy. > 3. Remove topic level replication. > > I lean towards the first option, as it would make Pulsar's replication > configuration more flexible and would not break the previous behavior > logic. > > >Yes, that's my viewpoint. In case that's not your view point, then in your > >use cases do you ever have more than one namespace inside a tenant? > >With every property coming at topic level, it makes no sense for the > >namespace hierarchy to exist anymore. > > I didn't propose this from the perspective of a user, but from the > perspective of a Pulsar maintainer. The replication cluster at the topic > level cannot function independently like other topic policies, and I > attempted to fix it after finding the reason. > > From the user's perspective, I could modify my system to put topics with > the same replication strategy under the same namespace. From the > maintainer's perspective, if a feature can help users use Pulsar more > flexibly and conveniently without introducing risks, then this feature > should be implemented. Perhaps business systems do not want to maintain too > many namespaces, as they would need to configure multiple namespace > policies or it might make their business logic complex. The other > configurations for topics under this namespace might be consistent, with > only a few topics needing to enable replication. In this case, topic level > replication becomes valuable. Therefore, I lean towards the first option, > to solve this problem and make it a truly expected topic policy. > > On Thu, Dec 7, 2023 at 12:45 PM Girish Sharma <scrapmachi...@gmail.com> > wrote: > > > Hello Xiangying, > > > > > > On Thu, Dec 7, 2023 at 6:32 AM Xiangying Meng <xiangy...@apache.org> > > wrote: > > > > > Hi Girish, > > > > > > What you are actually opposing is the implementation of true > topic-level > > > geo-replication. You believe that topics should be divided into > different > > > namespaces based on replication. Following this line of thought, what > we > > > should do is restrict the current topic-level replication settings, not > > > allowing the replication clusters set at the topic level to exceed the > > > range of replication clusters set in the namespace. > > > > > > > Yes, that's my viewpoint. In case that's not your view point, then in > your > > use cases do you ever have more than one namespace inside a tenant? > > With every property coming at topic level, it makes no sense for the > > namespace hierarchy to exist anymore. > > > > > > > > > > One point that confuses me is that we provide a setting for topic-level > > > replication clusters, but it can only be used to amend the namespace > > > settings and cannot work independently. Isn't this also a poor design > for > > > Pulsar? > > > > > > > This feature was originally added in pulsar without a PIP. And the PR [0] > > also doesn't have much context around why it was needed and why it is > being > > added.. So I can't comment on why this was added.. > > But my understanding is that even in a situation when the topics are > > divided into proper namespaces based on use cases and suddenly there is > an > > exceptional need for one of the existing topics to have lesser > replication, > > then instead of following a long exercise of moving that topic to a new > > namespace, you can use this feature. > > > > [0] - https://github.com/apache/pulsar/pull/12136 > > > > > > > > > > > > On Thu, Dec 7, 2023 at 2:28 AM Girish Sharma <scrapmachi...@gmail.com> > > > wrote: > > > > > > > Hello, replies inline. > > > > > > > > On Wed, Dec 6, 2023 at 5:28 PM Xiangying Meng <xiangy...@apache.org> > > > > wrote: > > > > > > > > > Hi Girish, > > > > > > > > > > Thank you for your explanation. Because Joe's email referenced the > > > > current > > > > > implementation of Pulsar, I misunderstood him to be saying that > this > > > > > current implementation is not good. > > > > > > > > > > A possible use case is where there is one or a small number of > topics > > > in > > > > > the namespace that store important messages, which need to be > > > replicated > > > > to > > > > > other clusters. Meanwhile, other topics only need to store data in > > the > > > > > local cluster. > > > > > > > > > > > > > Is it not possible to simply have the other topics in a namespace > which > > > > allows for that other cluster, and the local topics remain in the > > > namespace > > > > with local cluster needs. Seems to me like a proper use case of two > > > > different namespaces as the use case is different in both cases. > > > > > > > > > > > > > > > > > > > > > > For example, only topic1 needs replication, while topic2 to > topic100 > > do > > > > > not. According to the current implementation, we need to set > > > replication > > > > > clusters at the namespace level (e.g. cluster1 and cluster2), and > > then > > > > set > > > > > the topic-level replication clusters (cluster1) for topic2 to > > topic100 > > > to > > > > > exclude them. It's hard to say that this is a good design. > > > > > > > > > > > > > No, all you need is to put topic1 in namespace1 and topic2 to > topic100 > > in > > > > namespace2 . This is exactly what me and Joe were saying is a bad > > design > > > > choice that you are clubbing all 100 topics in same namespace. > > > > > > > > > > > > > > > > > > > > > > Best regards. > > > > > > > > > > On Wed, Dec 6, 2023 at 12:49 PM Joe F <joefranc...@gmail.com> > wrote: > > > > > > > > > > > Girish, > > > > > > > > > > > > Thank you for making my point much better than I did .. > > > > > > > > > > > > -Joe > > > > > > > > > > > > On Tue, Dec 5, 2023 at 1:45 AM Girish Sharma < > > > scrapmachi...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > Hello Xiangying, > > > > > > > > > > > > > > I believe what Joe here is referring to as "application design" > > is > > > > not > > > > > > the > > > > > > > design of pulsar or namespace level replication but the design > of > > > > your > > > > > > > application and the dependency that you've put on topic level > > > > > > replication. > > > > > > > > > > > > > > In general, I am aligned with Joe from an application design > > > > > standpoint. > > > > > > A > > > > > > > namespace is supposed to represent a single application use > case, > > > > topic > > > > > > > level override of replication clusters helps in cases where > there > > > > are a > > > > > > few > > > > > > > exceptional topics which do not need replication in all of the > > > > > namespace > > > > > > > clusters. This helps in saving network bandwidth, storage, CPU, > > RAM > > > > etc > > > > > > > > > > > > > > But the reason why you've raised this PIP is to bring down the > > > actual > > > > > > > replication semantics at a topic level. Yes, namespace level > > still > > > > > exists > > > > > > > as per your PIP as well, but is basically left only to be a > > > "default > > > > in > > > > > > > case topic level is missing". > > > > > > > This brings me to a very basic question - What's the use case > > that > > > > you > > > > > > are > > > > > > > trying to solve that needs these changes? Because, then what's > > > > stopping > > > > > > us > > > > > > > from bringing every construct that's at a namespace level > > > (bundling, > > > > > > > hardware affinity, etc) down to a topic level? > > > > > > > > > > > > > > Regards > > > > > > > > > > > > > > On Tue, Dec 5, 2023 at 2:52 PM Xiangying Meng < > > > xiangy...@apache.org> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Joe, > > > > > > > > > > > > > > > > You're correct. The initial design of the replication policy > > > leaves > > > > > > room > > > > > > > > for improvement. To address this, we aim to refine the > cluster > > > > > settings > > > > > > > at > > > > > > > > the namespace level in a way that won't impact the existing > > > system. > > > > > The > > > > > > > > replication clusters should solely be used to establish full > > mesh > > > > > > > > replication for that specific namespace, without having any > > other > > > > > > > > definitions or functionalities. > > > > > > > > > > > > > > > > BR, > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 4, 2023 at 1:52 PM Joe F <joefranc...@gmail.com> > > > > wrote: > > > > > > > > > > > > > > > > > >if users want to change the replication policy for > > > > > > > > > topic-n and do not change the replication policy of other > > > topics, > > > > > > they > > > > > > > > need > > > > > > > > > to change all the topic policy under this namespace. > > > > > > > > > > > > > > > > > > This PIP unfortunately flows from attempting to solve bad > > > > > > application > > > > > > > > > design > > > > > > > > > > > > > > > > > > A namespace is supposed to represent an application, and > the > > > > > > namespace > > > > > > > > > policy is an umbrella for a similar set of policies that > > > applies > > > > > to > > > > > > > all > > > > > > > > > topics. The exceptions would be if a topic had a need for > a > > > > > deficit, > > > > > > > The > > > > > > > > > case of one topic in the namespace sticking out of the > > > namespace > > > > > > policy > > > > > > > > > umbrella is bad application design in my opinion > > > > > > > > > > > > > > > > > > -Joe. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Dec 3, 2023 at 6:00 PM Xiangying Meng < > > > > > xiangy...@apache.org> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Rajan and Girish, > > > > > > > > > > Thanks for your reply. About the question you mentioned, > > > there > > > > is > > > > > > > some > > > > > > > > > > information I want to share with you. > > > > > > > > > > >If anyone wants to setup different replication clusters > > then > > > > > > either > > > > > > > > > > >those topics can be created under different namespaces > or > > > > > defined > > > > > > at > > > > > > > > > topic > > > > > > > > > > >level policy. > > > > > > > > > > > > > > > > > > > > >And users can anyway go and update the namespace's > cluster > > > > list > > > > > to > > > > > > > add > > > > > > > > > the > > > > > > > > > > >missing cluster. > > > > > > > > > > Because the replication clusters also mean the clusters > > where > > > > the > > > > > > > topic > > > > > > > > > can > > > > > > > > > > be created or loaded, the topic-level replication > clusters > > > can > > > > > only > > > > > > > be > > > > > > > > > the > > > > > > > > > > subset of namespace-level replication clusters. > > > > > > > > > > Just as Girish mentioned, the users can update the > > > namespace's > > > > > > > cluster > > > > > > > > > list > > > > > > > > > > to add the missing cluster. > > > > > > > > > > But there is a problem because the replication clusters > as > > > the > > > > > > > > namespace > > > > > > > > > > level will create a full mesh replication for that > > namespace > > > > > across > > > > > > > the > > > > > > > > > > clusters defined in > > > > > > > > > > replication-clusters if users want to change the > > replication > > > > > policy > > > > > > > for > > > > > > > > > > topic-n and do not change the replication policy of other > > > > topics, > > > > > > > they > > > > > > > > > need > > > > > > > > > > to change all the topic policy under this namespace. > > > > > > > > > > > > > > > > > > > > > Pulsar is being used by many legacy systems and > changing > > > its > > > > > > > > > > >semantics for specific usecases without considering > > > > consequences > > > > > > are > > > > > > > > > > >creating a lot of pain and incompatibility problems for > > > other > > > > > > > existing > > > > > > > > > > >systems and let's avoid doing it as we are struggling > with > > > > such > > > > > > > > changes > > > > > > > > > > and > > > > > > > > > > >breaking compatibility or changing semantics are just > not > > > > > > > acceptable. > > > > > > > > > > > > > > > > > > > > This proposal will not introduce an incompatibility > > problem, > > > > > > because > > > > > > > > the > > > > > > > > > > default value of the namespace policy of allowed-clusters > > and > > > > > > > > > > topic-policy-synchronized-clusters are the > > > > replication-clusters. > > > > > > > > > > > > > > > > > > > > >Allowed clusters defined at tenant level > > > > > > > > > > >will restrict tenants to create namespaces in > > > regions/clusters > > > > > > where > > > > > > > > > they > > > > > > > > > > >are not allowed. > > > > > > > > > > >As Rajan also mentioned, allowed-clusters field has a > > > > different > > > > > > > > > > meaning/purpose. > > > > > > > > > > > > > > > > > > > > Allowed clusters defined at the tenant level will > restrict > > > > > tenants > > > > > > > from > > > > > > > > > > creating namespaces in regions/clusters where they are > not > > > > > allowed. > > > > > > > > > > Similarly, the allowed clusters defined at the namespace > > > level > > > > > will > > > > > > > > > > restrict the namespace from creating topics in > > > regions/clusters > > > > > > where > > > > > > > > > they > > > > > > > > > > are not allowed. > > > > > > > > > > What's wrong with this? > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > On Fri, Dec 1, 2023 at 2:35 PM Girish Sharma < > > > > > > > scrapmachi...@gmail.com> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Xiangying, > > > > > > > > > > > > > > > > > > > > > > Shouldn't the solution to the issue mentioned in #21564 > > [0] > > > > > > mostly > > > > > > > be > > > > > > > > > > > around validating that topic level replication clusters > > are > > > > > > subset > > > > > > > of > > > > > > > > > > > namespace level replication clusters? > > > > > > > > > > > It would be a completely compatible change as even > today > > > the > > > > > case > > > > > > > > > where a > > > > > > > > > > > topic has a cluster not defined in namespaces's > > > > > > > replication-clusters > > > > > > > > > > > doesn't really work. > > > > > > > > > > > And users can anyway go and update the namespace's > > cluster > > > > list > > > > > > to > > > > > > > > add > > > > > > > > > > the > > > > > > > > > > > missing cluster. > > > > > > > > > > > > > > > > > > > > > > As Rajan also mentioned, allowed-clusters field has a > > > > different > > > > > > > > > > > meaning/purpose. > > > > > > > > > > > Regards > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 30, 2023 at 10:56 AM Xiangying Meng < > > > > > > > > xiangy...@apache.org> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi, Pulsar Community > > > > > > > > > > > > > > > > > > > > > > > > I drafted a proposal to make the configuration of > > > clusters > > > > at > > > > > > the > > > > > > > > > > > namespace > > > > > > > > > > > > level clearer. This helps solve the problem of > > > > > geo-replication > > > > > > > not > > > > > > > > > > > working > > > > > > > > > > > > correctly at the topic level. > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/pulsar/pull/21648 > > > > > > > > > > > > > > > > > > > > > > > > I'm looking forward to hearing from you. > > > > > > > > > > > > > > > > > > > > > > > > BR > > > > > > > > > > > > Xiangying > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > Girish Sharma > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Girish Sharma > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Girish Sharma > > > > > > > > > > > > > -- > > Girish Sharma > > >