Dezhi, thank you for sharing the proposal! It is great to see Tencent has started contributing this great feature back to Pulsar! It will unlock a lot of new capabilities for Pulsar.
I have moved the proposal to https://github.com/apache/pulsar/wiki/PIP-63:-Readonly-Topic-Ownership-Support

- Sijie

On Thu, May 7, 2020 at 5:23 AM dezhi liu <liudezhi2...@gmail.com> wrote:

Hi all,

Here is a proposal (PIP) for ReadOnly Topic Ownership Support.

------------

# PIP-63: ReadOnly Topic Ownership Support

* Author: Penghui LI, Jia Zhai, Sijie Guo, Dezhi Liu

## Motivation

People usually use Pulsar as an event bus or event center to unify all their message data or event data. The same set of event data is usually shared across multiple applications. Problems occur when the number of subscriptions on the same topic increases:

- The bandwidth of a broker limits the number of subscriptions for a single topic.
- Subscriptions compete for network bandwidth on brokers. Different subscriptions might have different levels of severity.
- When synchronizing cross-city message reading, cross-city access needs to be minimized.

This proposal adds readonly topic ownership support. With readonly ownership, users can set up one or a few separate broker clusters for readonly access, to segregate consumption traffic by service severity. This would also allow Pulsar to support a large number of subscriptions.

## Changes

There are a few key changes for supporting readonly topic ownership:

- how the readonly topic owner reads data
- how the readonly topic owner keeps metadata in sync
- how the readonly topic owner handles acknowledgments

The first two problems have been well addressed in DistributedLog. We can just add similar features to the managed ledger.

### How the readonly topic owner reads data

In order for a readonly topic owner to keep reading data in a streaming way, the managed ledger should be able to refresh its LAC (LastAddConfirmed). The easiest change is to call `readLastAddConfirmedAsync` when a cursor requests entries beyond the existing LAC. A more advanced approach is to switch the regular read-entries requests to bookkeeper's long poll read requests. However, long poll read requests are not supported in the bookkeeper v2 protocol.

Required Changes (a sketch of the first change follows the list):

- Refresh LastAddConfirmed when a managed cursor requests entries beyond the known LAC.
- Enable `explicitLac` at the managed ledger, so the writable topic owner periodically advances the LAC, which makes sure the readonly owner can catch up with the latest data.
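To make the first change concrete, here is a minimal sketch of refreshing the LAC before serving a read that goes past the LAC currently known to the readonly owner, using bookkeeper's `ReadHandle` API. The `ReadOnlyLacRefresher` class, its `knownLac` field, and the `readBeyondLac` method are hypothetical names for illustration; the real change would live inside the managed ledger and cursor implementation.

```java
import java.util.concurrent.CompletableFuture;

import org.apache.bookkeeper.client.api.LedgerEntries;
import org.apache.bookkeeper.client.api.ReadHandle;

/**
 * Illustrative sketch only: refresh the LAC from the bookies before serving a
 * cursor read that goes past the LAC currently known to the readonly owner.
 * Class and method names are hypothetical, not existing Pulsar code.
 */
class ReadOnlyLacRefresher {

    private volatile long knownLac = -1L;

    CompletableFuture<LedgerEntries> readBeyondLac(ReadHandle ledger,
                                                   long firstEntry,
                                                   long lastEntry) {
        if (lastEntry <= knownLac) {
            // Requested range is already known to be confirmed; read it directly.
            return ledger.readAsync(firstEntry, lastEntry);
        }
        // Ask the bookies for the latest LastAddConfirmed before reading further.
        return ledger.readLastAddConfirmedAsync().thenCompose(lac -> {
            knownLac = Math.max(knownLac, lac);
            long safeLast = Math.min(lastEntry, knownLac);
            if (safeLast < firstEntry) {
                // Nothing new is confirmed yet; the caller can retry later
                // (or switch to long poll reads, as noted above).
                return CompletableFuture.<LedgerEntries>completedFuture(null);
            }
            return ledger.readAsync(firstEntry, safeLast);
        });
    }
}
```

The second required change complements this: with `explicitLac` enabled, the writable owner periodically advertises its latest LAC to the bookies rather than only piggybacking it on the next write, so the refresh above can observe the most recently written entries even when the write rate is low.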
### How the readonly topic owner keeps metadata in sync

Ledgers are rolled at a given interval. The readonly topic owner needs a way to learn that a ledger has been rolled. There are a couple of options, which fall into two approaches: notification vs. polling.

*Notification*

A) Use a zookeeper watcher. The readonly topic owner sets a watcher on the managed ledger's metadata, so it is notified when a ledger is rolled.

B) Similar to A), introduce a "notification" request between the readonly topic owner and the writable topic owner. The writable topic owner notifies the readonly topic owner of metadata changes.

*Polling*

C) The readonly broker polls zookeeper to see if there is new metadata, *only* when the LAC in the last ledger has not advanced for a given interval. The readonly broker checks zookeeper to see whether a new ledger has been rolled.

D) The readonly broker polls for new metadata by reading events from a system topic of the writable broker cluster; the writable broker adds ledger-metadata change events to that system topic whenever the managed ledger metadata is updated.

Solution C) will be the simplest solution to start with.

### How the readonly topic owner handles acknowledgments

Currently Pulsar deploys a centralized solution for managing cursors and uses cursors for managing data retention. This PIP does not change that solution. Instead, the readonly topic owner only maintains a cursor cache; all actual cursor updates are sent back to the writable topic owner.

This requires introducing a set of "cursor" related RPCs between the writable topic owner and readonly topic owners:

- Read `Cursor` of a Subscription

The readonly topic owner will handle the following requests using these new cursor RPCs (a sketch of the cursor cache follows the list):

- Subscribe: forward the subscribe request to the writable topic owner. Upon successful subscription, the readonly topic owner caches the corresponding cursor.
- Unsubscribe: remove the cursor from the cursor cache, and forward the unsubscribe request to the writable topic owner.
- Consume: when a consumer is connected, the readonly owner `read`s the cursor from the writable topic owner and caches it locally.
- Ack: forward the ack request to the writable topic owner, and update the cursor locally in the cache.
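As an illustration of the cursor cache described above, here is a minimal sketch. The `WritableOwnerRpc` interface and the `ReadOnlyCursorCache` class are hypothetical stand-ins for the new cursor RPCs this PIP proposes, not existing Pulsar APIs; a real implementation would use Pulsar's position types rather than bare entry ids.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the readonly owner's cursor cache. All cursor
 * mutations are forwarded to the writable topic owner, which remains the
 * single source of truth for data retention.
 */
class ReadOnlyCursorCache {

    /** Placeholder for the proposed cursor RPCs to the writable topic owner. */
    interface WritableOwnerRpc {
        CompletableFuture<Long> readCursor(String subscription);          // returns last acked entry id
        CompletableFuture<Void> ack(String subscription, long entryId);   // forwards the acknowledgment
    }

    private final WritableOwnerRpc writableOwner;
    private final Map<String, Long> cursors = new ConcurrentHashMap<>();

    ReadOnlyCursorCache(WritableOwnerRpc writableOwner) {
        this.writableOwner = writableOwner;
    }

    /** On consumer connect: read the cursor from the writable owner and cache it. */
    CompletableFuture<Long> connect(String subscription) {
        return writableOwner.readCursor(subscription)
                .thenApply(pos -> {
                    cursors.put(subscription, pos);
                    return pos;
                });
    }

    /** On ack: forward to the writable owner, then update the local cache. */
    CompletableFuture<Void> acknowledge(String subscription, long entryId) {
        return writableOwner.ack(subscription, entryId)
                .thenRun(() -> cursors.merge(subscription, entryId, Math::max));
    }
}
```

Subscribe and unsubscribe would follow the same pattern: forward the request to the writable owner, then add or remove the cached cursor accordingly.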
## Compatibility, Deprecation and Migration Plan

Most of the changes are internal changes to the managed ledger, and this is a new feature that does not change Pulsar's wire protocol or public API, so there is no backward compatibility issue.

It is a newly added feature, so there is nothing to deprecate or migrate.

## Test Plan

- Unit tests for each individual change
- Integration tests for the end-to-end pipeline
- Chaos testing to ensure correctness
- Load testing to ensure performance

## Rejected Alternatives

### Use Geo Replication to replicate data between clusters

The simplest alternative solution would be to use Pulsar's built-in geo-replication mechanism to replicate data from one cluster to the other.

#### Two completely separated clusters

The idea is pretty straightforward: you create two separate clusters, one for your online services (`ClusterA`) and the other for your analytical workloads (`ClusterB`). `ClusterA` serves both write (produce) and read (consume) traffic, while `ClusterB` serves readonly (consume) traffic. Both `ClusterA` and `ClusterB` have their own zookeeper cluster, bookkeeper cluster, and brokers. To make sure a topic's data can be replicated between `ClusterA` and `ClusterB`, we need to make sure `ClusterA` and `ClusterB` share the same configuration storage. There are two approaches to do so:

a) A completely separate zookeeper cluster as configuration storage.

In this approach, everything is completely separated, so you can treat the two clusters as two different regions and follow the instructions in [Pulsar geo-replication · Apache Pulsar](http://pulsar.apache.org/docs/en/administration-geo/) to set up data replication between them.

b) `ClusterB` and `ClusterA` share the same configuration storage.

The approach in a) requires setting up a separate zookeeper cluster as configuration storage. But since `ClusterA` and `ClusterB` already have their own zookeeper clusters, you may not want to set up yet another one. You can let both `ClusterA` and `ClusterB` use `ClusterA`'s zookeeper cluster as the configuration store, by using zookeeper's chroot mechanism to put the configuration data in a separate root in `ClusterA`'s zookeeper cluster.

For example:

- Command to initialize `ClusterA`'s metadata

```
$ bin/pulsar initialize-cluster-metadata \
  --cluster ClusterA \
  --zookeeper zookeeper.cluster-a.example.com:2181 \
  --configuration-store zookeeper.cluster-a.example.com:2181/configuration-store \
  --web-service-url http://broker.cluster-a.example.com:8080/ \
  --broker-service-url pulsar://broker.cluster-a.example.com:6650/
```

- Command to initialize `ClusterB`'s metadata

```
$ bin/pulsar initialize-cluster-metadata \
  --cluster ClusterB \
  --zookeeper zookeeper.cluster-b.example.com:2181 \
  --configuration-store zookeeper.cluster-a.example.com:2181/configuration-store \
  --web-service-url http://broker.cluster-b.example.com:8080/ \
  --broker-service-url pulsar://broker.cluster-b.example.com:6650/
```

#### Shared bookkeeper and zookeeper cluster, but separated brokers

Sometimes it is unaffordable to have two completely separated clusters. You might want to share the existing infrastructure, such as the data storage (bookkeeper) and the metadata storage (zookeeper). Similar to solution b) described above, you can use a zookeeper chroot to achieve that.

Let's assume there is only one zookeeper cluster and one bookkeeper cluster. The zookeeper cluster is `zookeeper.shared.example.com:2181`. You have two clusters of brokers: one is `broker-a.example.com`, and the other is `broker-b.example.com`. When you create the clusters, you can use `zookeeper.shared.example.com:2181/configuration-store` as the shared configuration storage, `zookeeper.shared.example.com:2181/cluster-a` as `ClusterA`'s local metadata storage, and `zookeeper.shared.example.com:2181/cluster-b` as `ClusterB`'s local metadata storage.

This would allow you to have two "broker-separated" clusters sharing the same storage cluster (both zookeeper and bookkeeper).

No matter how the physical clusters are set up, there is a downside to using geo-replication for isolating the online workloads from the analytics workloads: data has to be replicated at least twice. If you have configured Pulsar topics to store data in 3 replicas, you will end up with at least 6 copies of the data. So "geo-replication" might not be ideal for addressing this use case.

------------


Thanks,

Dezhi