On Jun 21, 2022, at 1:00 AM, Haiting Jiang <[email protected]> wrote:
>
> Hi Pulsar community:
>
> I opened a PIP to discuss "Shadow Topic, an alternative way to support readonly
> topic ownership."
>
> Proposal Link: https://github.com/apache/pulsar/issues/16153
>
> ---
>
> ## Motivation
>
> The motivation is the same as PIP-63[1], with a new broadcast use case of
> supporting 100K subscriptions in a single topic.
> 1. The bandwidth of a broker limits the number of subscriptions for a single
> topic.
> 2. Subscriptions are competing for the network bandwidth on brokers. Different
> subscriptions might have different levels of severity.
> 3. When synchronizing cross-city message reading, cross-city access needs to
> be minimized.
> 4. [New] Broadcast with 100K subscriptions. There is a limit on the number of
> subscriptions a single topic can support. Hongjie from NTT Lab tested that
> with 40K subscriptions on a single topic, the clients need about 20 min to
> establish all connections, and at a producer rate of 1 msg/s, the average
> end-to-end latency is about 2.9 s. With 100K subscriptions, the connection
> time and E2E latency are too high to be acceptable.
Have you tested performance of two topics each with 40k subscriptions at the
same time in the same cluster?
I think that would simulate shadow topics in action and show whether much
performance is actually gained by this kind of splitting.
It seems to me that a better approach would be to have multiple local Pulsar
clusters and balance the subscriptions between those.
I’m concerned that this shadow topic approach is adding new complexity to
Pulsar without a clear understanding of all of the impacts.
Thanks,
Dave
>
> However, the original PIP-63 proposal is too complicated to implement: the
> changed code already exceeds 3K lines (see PR#11960[2]), and there are still
> some problems left:
> 1. The LAC in the readonly topic is updated by polling, which increases the
> load on bookies.
> 2. The message data of the readonly topic won't be cached in the broker, which
> increases the network usage between broker and bookie when more than one
> subscriber is tail-reading.
> 3. All the subscriptions are managed in the original writable topic, so the
> maximum supported number of subscriptions is not scalable.
>
> This PIP tries to come up with a simpler solution to support readonly topic
> ownership and solve the problems the previous PR left. The main idea of this
> solution is to reuse the feature of geo-replication, but instead of
> duplicating storage, it shares underlying bookie ledgers between different
> topics.
>
> ## Goal
>
> The goal is to introduce **Shadow Topic** as a new type of topic to support
> readonly topic ownership. Just as its name implies, a shadow topic is the
> shadow of some normal persistent topic (let's call it the source topic here).
> The source topic and the shadow topic must have the same number of partitions
> or both be non-partitioned. Multiple shadow topics can be created from one
> source topic.
>
> A shadow topic shares the underlying bookie ledgers of its source topic. Users
> can't produce messages to a shadow topic directly, and a shadow topic doesn't
> create any new ledgers for messages; all messages in a shadow topic come from
> its source topic.
>
> A shadow topic has its own subscriptions and doesn't share them with its
> source topic. This means the shadow topic has its own cursor ledgers to store
> persistent mark-delete info for each persistent subscription.
>
> The message sync procedure of a shadow topic is supported by shadow
> replication, which is very similar to geo-replication, with these differences:
> 1. Geo-replication only works between topics with the same name in different
> broker clusters. Shadow topics have no naming limitation, and they can be
> in the same cluster.
> 2. Geo-replication duplicates data storage, but shadow topics don't.
> 3. Geo-replication replicates data in both directions, but shadow
> replication has only a one-way data flow.
>
>
> ## API Changes
>
> 1. PulsarApi.proto.
>
> A shadow topic needs to know the original message id of each replicated
> message in order to update the new ledger and LAC. So we need to add a
> `shadow_message_id` to CommandSend for the replicator.
>
> ```
> message CommandSend {
>     // ...
>     // message id for shadow topic
>     optional MessageIdData shadow_message_id = 9;
> }
> ```
>
> 2. Admin API for creating a shadow topic from a source topic
> ```
> admin.topics().createShadowTopic(sourceTopicName, shadowTopicName)
> ```
>
> ## Implementation
>
> A picture showing the relations between the key components is in the GitHub
> issue [3].
>
> There are two key changes for implementation.
> 1. How to replicate messages to shadow topics.
> 2. How a shadow topic manages shared ledger info.
>
> ### 1. How to replicate messages to shadow topics.
>
> This part is mostly implemented by `ShadowReplicator`, which extends
> `PersistentReplicator` introduced in geo-replication. The shadow topic list
> is added as a new topic policy of the source topic. The source topic manages
> the lifecycle of all the replicators. The key is to add `shadow_message_id`
> when producing messages to shadow topics.
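
The replication step described above could be sketched roughly as follows. This is a toy model, not the proposed `ShadowReplicator`: `ShadowSend` and `ShadowReplicationSketch` are hypothetical names, and the list of shadow topics is assumed to have already been read from the source topic's policy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a CommandSend that carries shadow_message_id. */
final class ShadowSend {
    final String targetTopic;
    final long shadowLedgerId; // ledger id of the entry in the source topic
    final long shadowEntryId;  // entry id of the entry in the source topic
    final byte[] payload;

    ShadowSend(String targetTopic, long shadowLedgerId, long shadowEntryId, byte[] payload) {
        this.targetTopic = targetTopic;
        this.shadowLedgerId = shadowLedgerId;
        this.shadowEntryId = shadowEntryId;
        this.payload = payload;
    }
}

/** Toy source-topic side: one outgoing send per configured shadow topic. */
final class ShadowReplicationSketch {
    private final List<String> shadowTopics; // assumed to come from the topic policy

    ShadowReplicationSketch(List<String> shadowTopics) {
        this.shadowTopics = new ArrayList<>(shadowTopics);
    }

    /** Tags the entry with its source position and fans it out to every shadow topic. */
    List<ShadowSend> replicate(long ledgerId, long entryId, byte[] payload) {
        List<ShadowSend> sends = new ArrayList<>();
        for (String topic : shadowTopics) {
            sends.add(new ShadowSend(topic, ledgerId, entryId, payload));
        }
        return sends;
    }
}
```

The point of the sketch is only that every replicated message carries the position it already has in the source topic, so the receiving side never has to assign a new one.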
>
> ### 2. How a shadow topic manages shared ledger info.
>
> This part is mostly implemented by `ShadowManagedLedger`, which extends the
> current `ManagedLedgerImpl` and overrides two key methods.
>
> 1. `initialize(..)`
> a. Fetch the ManagedLedgerInfo of the source topic instead of the current
> shadow topic. The source topic name is stored in the topic policy of the
> shadow topic.
> b. Open the last ledger and read the explicit LAC from bookie, instead of
> creating a new ledger. Reading the LAC here requires that the source topic
> enable the explicit LAC feature by setting
> `bookkeeperExplicitLacIntervalInMills` to a non-zero value in broker.conf.
> c. Do not start checkLedgerRollTask, which periodically tries to roll over
> the ledger.
>
> 2. `internalAsyncAddEntry()`: Instead of writing entry data to bookie, it only
> updates the ledger metadata, such as `currentLedger` and `lastConfirmedEntry`,
> and puts the replicated message into `EntryCache`.
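
As a rough illustration of the overridden add path, the sketch below models a shadow-side ledger that never writes to bookie and only tracks metadata plus a cache. `SourcePosition` and `ShadowLedgerSketch` are hypothetical names, the `TreeMap` stands in for `EntryCache`, and ignoring replayed positions is an assumption of this sketch rather than something the proposal specifies.

```java
import java.util.TreeMap;

/** Hypothetical (ledgerId, entryId) pair taken from shadow_message_id. */
final class SourcePosition implements Comparable<SourcePosition> {
    final long ledgerId;
    final long entryId;

    SourcePosition(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    @Override
    public int compareTo(SourcePosition other) {
        int byLedger = Long.compare(ledgerId, other.ledgerId);
        return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
    }
}

/** Toy shadow-side add path: no bookie write, only metadata and cache updates. */
final class ShadowLedgerSketch {
    private long currentLedgerId = -1;
    private SourcePosition lastConfirmedEntry = null;
    private final TreeMap<SourcePosition, byte[]> entryCache = new TreeMap<>();

    /** Handles one replicated message whose position was assigned by the source topic. */
    void addReplicatedEntry(SourcePosition pos, byte[] payload) {
        if (lastConfirmedEntry != null && pos.compareTo(lastConfirmedEntry) <= 0) {
            return; // assumption: replayed or stale positions are dropped
        }
        currentLedgerId = pos.ledgerId; // follows the source when it rolls over
        lastConfirmedEntry = pos;
        entryCache.put(pos, payload);   // tail readers are served from here
    }

    long currentLedgerId() { return currentLedgerId; }
    SourcePosition lastConfirmedEntry() { return lastConfirmedEntry; }
    int cachedEntries() { return entryCache.size(); }
}
```

Feeding positions (7,0), (7,1), (8,0) into `addReplicatedEntry` moves `currentLedgerId` from 7 to 8 without the shadow side ever creating a ledger of its own.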
>
> Besides, some other problems need to be taken care of.
> - Any ledger metadata updates need to be synced to the shadow topic, including
> ledger offloading and ledger deletion. The shadow topic needs to watch the
> ledger info updates in the metadata store and apply them in time.
> - The locally cached LAC of `LedgerHandle` won't be updated in time, so we
> need to refresh the LAC when a managed cursor requests entries beyond the
> known LAC.
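
The on-demand LAC refresh in the last bullet might look something like the sketch below. It is a minimal model under stated assumptions: `LacRefreshSketch` is a hypothetical name, and the `LongSupplier` stands in for whatever call actually reads the LAC from bookie.

```java
import java.util.function.LongSupplier;

/** Toy model: refresh the cached LAC only when a read goes past it. */
final class LacRefreshSketch {
    private long knownLac;
    private final LongSupplier readLacFromBookie; // hypothetical stand-in for a bookie LAC read

    LacRefreshSketch(long initialLac, LongSupplier readLacFromBookie) {
        this.knownLac = initialLac;
        this.readLacFromBookie = readLacFromBookie;
    }

    /** Returns true if entryId is readable, refreshing the LAC at most once per call. */
    boolean canRead(long entryId) {
        if (entryId <= knownLac) {
            return true; // served from the cached LAC, no bookie round trip
        }
        // The request is beyond the known LAC: ask the bookie and never move backwards.
        knownLac = Math.max(knownLac, readLacFromBookie.getAsLong());
        return entryId <= knownLac;
    }

    long knownLac() { return knownLac; }
}
```

Compared with the polling approach criticized in the motivation, this only contacts the bookie when a cursor actually asks for an entry beyond the cached value.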
>
> ## Rejected Alternatives
>
> See PIP-63[1].
>
> ## References
> [1]
> https://github.com/apache/pulsar/wiki/PIP-63%3A-Readonly-Topic-Ownership-Support
> [2] https://github.com/apache/pulsar/pull/11960
> [3] https://github.com/apache/pulsar/issues/16153
>
>
> BR,
> Haiting Jiang