On Jun 21, 2022, at 1:00 AM, Haiting Jiang <jianghait...@apache.org> wrote:
>
> Hi Pulsar community:
>
> I have opened a PIP to discuss "Shadow Topic, an alternative way to support
> readonly topic ownership."
>
> Proposal Link: https://github.com/apache/pulsar/issues/16153
>
> ---
>
> ## Motivation
>
> The motivation is the same as PIP-63[1], with a new broadcast use case of
> supporting 100K subscriptions in a single topic.
> 1. The bandwidth of a broker limits the number of subscriptions for a single
>    topic.
> 2. Subscriptions are competing for the network bandwidth on brokers. Different
>    subscriptions might have different levels of severity.
> 3. When synchronizing cross-city message reading, cross-city access needs to
>    be minimized.
> 4. [New] Broadcast with 100K subscriptions. There is a limit on the number of
>    subscriptions for a single topic. Hongjie from NTT Lab tested that with
>    40K subscriptions in a single topic, the client needs about 20 min to
>    start all client connections, and at a producer rate of 1 msg/s, the
>    average end-to-end latency is about 2.9 s. For 100K subscriptions, the
>    connection start time and E2E latency are beyond consideration.

Have you tested the performance of two topics, each with 40k subscriptions, at the same time in the same cluster? I think that might simulate the notion of shadow topics in action and show whether much performance is actually gained by this notion of splitting.

It seems to me that a better approach would be to have multiple local Pulsar clusters and balance the subscriptions between those.

I’m concerned that this shadow topic approach is adding new complexity to Pulsar without a clear understanding of all of the impacts.

Thanks,
Dave

>
> However, it's too complicated to implement with the original PIP-63 proposal:
> the changed code is already over 3K lines, see PR#11960[2], and there are
> still some problems left:
> 1. The LAC in a readonly topic is updated in a polling pattern, which
>    increases the load on the bookies.
> 2. The message data of a readonly topic won't be cached in the broker. This
>    increases the network usage between broker and bookie when more than one
>    subscriber is tail-reading.
> 3. All subscriptions are managed in the original writable topic, so the
>    maximum supported number of subscriptions is not scalable.
>
> This PIP tries to come up with a simpler solution to support readonly topic
> ownership and solve the problems the previous PR left. The main idea of this
> solution is to reuse the feature of geo-replication, but instead of
> duplicating storage, it shares underlying bookie ledgers between different
> topics.
>
> ## Goal
>
> The goal is to introduce **Shadow Topic** as a new type of topic to support
> readonly topic ownership. Just as its name implies, a shadow topic is the
> shadow of some normal persistent topic (let's call it the source topic here).
> The source topic and the shadow topic must have the same number of partitions
> or both be non-partitioned. Multiple shadow topics can be created from a
> source topic.
>
> A shadow topic shares the underlying bookie ledgers of its source topic.
> Users can't produce any messages to a shadow topic directly, and a shadow
> topic doesn't create any new ledgers for messages; all messages in a shadow
> topic come from its source topic.
>
> A shadow topic has its own subscriptions and doesn't share them with its
> source topic. This means the shadow topic has its own cursor ledger to store
> persistent mark-delete info for each persistent subscription.
>
> The message sync procedure of a shadow topic is supported by shadow
> replication, which is very similar to geo-replication, with these differences:
> 1. Geo-replication only works between topics with the same name in different
>    broker clusters. Shadow topics have no naming limitation, and they can be
>    in the same cluster.
> 2. Geo-replication duplicates data storage, but shadow topics don't.
> 3. Geo-replication is bidirectional (clusters replicate data to each other),
>    but shadow replication only has a one-way data flow.
>
>
> ## API Changes
>
> 1. PulsarApi.proto.
>
> A shadow topic needs to know the original message id of each replicated
> message in order to update the new ledger and LAC. So we need to add a
> `shadow_message_id` in CommandSend for the replicator.
>
> ```
> message CommandSend {
>     // ...
>     // message id for shadow topic
>     optional MessageIdData shadow_message_id = 9;
> }
> ```
>
> 2. Admin API for creating a shadow topic from a source topic
> ```
> admin.topics().createShadowTopic(source-topic-name, shadow-topic-name)
> ```
>
> ## Implementation
>
> A picture showing the relations between key components is added in GitHub
> issue [3].
>
> There are two key changes for the implementation.
> 1. How to replicate messages to shadow topics.
> 2. How a shadow topic manages shared ledger info.
>
> ### 1. How to replicate messages to shadow topics.
>
> This part is mostly implemented by `ShadowReplicator`, which extends
> `PersistentReplicator` introduced in geo-replication. The shadow topic list
> is added as a new topic policy of the source topic. The source topic manages
> the lifecycle of all the replicators.
> The key is to add `shadow_message_id` when producing messages to shadow
> topics.
>
> ### 2. How a shadow topic manages shared ledger info.
>
> This part is mostly implemented by `ShadowManagedLedger`, which extends the
> current `ManagedLedgerImpl` with two key overridden methods.
>
> 1. `initialize(..)`
>    a. Fetch the ManagedLedgerInfo of the source topic instead of the current
>       shadow topic. The source topic name is stored in the topic policy of
>       the shadow topic.
>    b. Open the last ledger and read the explicit LAC from the bookie, instead
>       of creating a new ledger. Reading the LAC here requires that the source
>       topic enables the explicit LAC feature by setting
>       `bookkeeperExplicitLacIntervalInMills` to a non-zero value in
>       broker.conf.
>    c. Do not start checkLedgerRollTask, which tries to roll over the ledger
>       periodically.
>
> 2. `internalAsyncAddEntry()`
>    Instead of writing entry data to the bookie, it only updates ledger
>    metadata, like `currentLedger` and `lastConfirmedEntry`, and puts the
>    replicated message into `EntryCache`.
>
> Besides, some other problems need to be taken care of.
> - Any ledger metadata updates need to be synced to the shadow topic, including
>   ledger offloading and ledger deletion. The shadow topic needs to watch the
>   ledger info updates in the metadata store and apply them in time.
> - The locally cached LAC of `LedgerHandle` won't be updated in time, so we
>   need to refresh the LAC when a managed cursor requests entries beyond the
>   known LAC.
>
> ## Rejected Alternatives
>
> See PIP-63[1].
>
> ## References
> [1] https://github.com/apache/pulsar/wiki/PIP-63%3A-Readonly-Topic-Ownership-Support
> [2] https://github.com/apache/pulsar/pull/11960
> [3] https://github.com/apache/pulsar/issues/16153
>
>
> BR,
> Haiting Jiang
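
For anyone following the thread, the `internalAsyncAddEntry()` behaviour described in the quoted proposal can be pictured with a small Java sketch. It is a toy model under stated assumptions, not Pulsar code: the class `ShadowLedgerSketch`, its methods, and the string-keyed map standing in for `EntryCache` are all hypothetical, and it assumes replicated entries arrive in order.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model of the shadow-topic add path; every name here is hypothetical.
public class ShadowLedgerSketch {
    private long currentLedgerId = -1L;      // ledger id of the newest replicated entry
    private long lastConfirmedEntryId = -1L; // entry id of the newest replicated entry
    // Stand-in for the broker entry cache, keyed by "ledgerId:entryId".
    private final Map<String, byte[]> entryCache = new HashMap<>();

    // Called once per replicated message. ledgerId/entryId play the role of the
    // source message id carried in shadow_message_id. Nothing is written to
    // storage: only position metadata and the in-memory cache change.
    public void addReplicatedEntry(long ledgerId, long entryId, byte[] payload) {
        currentLedgerId = ledgerId;
        lastConfirmedEntryId = entryId;
        entryCache.put(ledgerId + ":" + entryId, payload);
    }

    // Tail reads are served from the cache; returns null on a cache miss.
    public byte[] readFromCache(long ledgerId, long entryId) {
        return entryCache.get(ledgerId + ":" + entryId);
    }

    public long getCurrentLedgerId() {
        return currentLedgerId;
    }

    public long getLastConfirmedEntryId() {
        return lastConfirmedEntryId;
    }

    public static void main(String[] args) {
        ShadowLedgerSketch shadow = new ShadowLedgerSketch();
        shadow.addReplicatedEntry(7L, 0L, "m0".getBytes(StandardCharsets.UTF_8));
        // The source topic rolled over to a new ledger; the shadow just follows.
        shadow.addReplicatedEntry(8L, 0L, "m1".getBytes(StandardCharsets.UTF_8));
        System.out.println(shadow.getCurrentLedgerId() + ":" + shadow.getLastConfirmedEntryId());
    }
}
```

The sketch leaves out everything the proposal lists as open problems, such as watching ledger metadata updates and refreshing a stale LAC.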