Bumping this thread; I've also added a couple of concrete notes below the quoted message (the current cache tuning settings and a rough sketch of the consumer-grouping idea). Looking forward to some feedback. Thanks! -Lari
On 2024/03/18 10:32:03 Lari Hotari wrote:
> Hi all,
>
> I'd like to start a discussion about improving the Pulsar broker cache design.
>
> In the Pulsar broker, there are two main scenarios for dispatching messages to consumers: tailing (hot) reads and catch-up (cold) reads. In both scenarios, consumers can be fast or slow, which further shapes the load. The Pulsar broker contains a cache for handling tailing reads. This cache was extended to handle catch-up reads with PR #12258 (plus follow-up PRs) in Pulsar 2.11. This cache is referred to as the "broker cache" or, more specifically, the "managed ledger cache".
>
> The issue "Slow backlog draining with high fanout topics" (https://github.com/apache/pulsar/issues/12257) describes the scenario that motivated extending the cache to catch-up reads (PR #12258, in Pulsar 2.11):
>
> "When there are topics with a high publish rate and high fan-out, with a large number of subscriptions or a large number of replicators, and a backlog is built, it becomes really difficult to drain the backlog for such topics while they are serving a high publish rate. This problem can be reproduced with multiple topics (100), each with 10K writes, 6 backlog replicators, and multiple subscriptions. It becomes impossible to drain backlogs even if most of the cursors are draining a similar backlog because a large number of cold-reads from the bookie makes the overall backlog draining slower. Even the bookkeeper cache is not able to keep up by caching entries and eventually has to do cold-reads."
>
> The problem described above can also occur in other cases. Under heavy load, high fan-out catch-up reads can increase the load on the system and cause a cascading failure, resulting in partial outages and backlogs that take hours to recover from. In many cases, this could be avoided with a better broker cache implementation.
>
> Optimizations in the broker cache are crucial for improving Pulsar performance. Unlike Kafka, the Pulsar broker doesn't have access to local files and cannot leverage the Linux page cache for efficient caching. The cache implementation in the Pulsar broker should therefore be intelligent enough to avoid unnecessary load on the system. However, there have been very few improvements to, and little focus on, the Pulsar broker cache.
>
> For most production configurations, understanding how to tune the broker cache is necessary. However, we don't have much documentation about this in the Apache Pulsar project. I noticed this in 2022 when I was preparing my talk "Performance tuning for Apache Pulsar Kubernetes deployments" for ApacheCon 2022 (slides [1], talk [2]). The situation in the project documentation hasn't improved since then. There are better performance-related tutorials than my talk available, for example, "Apache Pulsar Troubleshooting Performance Issues" by Hang Chen [3] and "Apache Pulsar Troubleshooting Backlog Issues" by Penghui Li [4]. It would be great if this type of content were summarized in the Apache Pulsar project documentation, because the relevant information can be hard to find when it is needed.
>
> The main intention of this email is not to improve documentation, though. It is to highlight that the current implementation could be significantly improved. The broker cache should be designed so that it minimizes cache misses and maximizes cache hits within the available broker cache memory.
> I believe this can be achieved by developing an algorithm that incorporates an optimization model for making optimal caching decisions.
>
> In addition to a better caching model and algorithm, there are opportunities to leverage rate limiting to improve caching for both tailing reads and catch-up reads. For instance, instead of rate limiting individual consumers, consumers could be dynamically grouped so that the speed at which the group as a whole moves forward is rate limited; this maximizes cache hits and prevents speed differences between consumers from pushing slower consumers out of the cached range. When consumers are catching up to the tail, it doesn't always make sense to impose a rate limit on an individual consumer if the entries to read are already cached, since rate limiting would only grow the cache and increase the chance that the consumer falls out of the cached range.
>
> Would anyone else be interested in contributing to the design and improvement of the Pulsar broker cache? It would also be useful to hear about real user experiences with the current problems in high fan-out scenarios in Pulsar.
>
> - Lari
>
> 1 - https://www.apachecon.com/acna2022/slides/03_Hotari_Lari_Performance_tuning_Pulsar.pdf
> 2 - https://www.youtube.com/watch?v=WkdfILAx-4c
> 3 - https://www.youtube.com/watch?v=8_4bVctj2_E
> 4 - https://www.youtube.com/watch?v=17jQIOVeu4s
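First, to make the tuning point in the quoted message a bit more concrete: these are the broker.conf settings that currently control the managed ledger cache. The names are taken from recent Pulsar releases and the one-line descriptions are only a summary; defaults and exact semantics vary between versions, so please double-check against the broker.conf shipped with your release.

  managedLedgerCacheSizeMB - total cache size (defaults to a fraction of the available direct memory)
  managedLedgerCacheEvictionWatermark - level the cache is trimmed down to when eviction triggers
  managedLedgerCacheEvictionTimeThresholdMillis - entries cached for longer than this are evicted
  cacheEvictionByMarkDeletedPosition - evict based on the mark-delete position instead of the slowest reader's position
  managedLedgerCacheCopyEntries - copy entry payloads into the cache instead of retaining the original buffers
  managedLedgerMinimumBacklogCursorsForCaching,
  managedLedgerMinimumBacklogEntriesForCaching,
  managedLedgerMaxBacklogBetweenCursorsForCaching - thresholds for the catch-up read caching added by PR #12258

This is roughly the set of knobs that project documentation on cache tuning would need to cover.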
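Second, a rough sketch of the consumer-grouping idea mentioned above. This is only an illustration, not Pulsar code: the Cursor record, the groupByCachedRange method, and the maxSpread parameter are hypothetical names. A real implementation would operate on managed ledger cursors and the actual cached entry range, and each resulting group would share one dispatch rate limit so that the group's slowest member stays inside the cache.

import java.util.*;

public class CachedRangeGrouping {

    /** Minimal stand-in for a subscription cursor's next read position (entry id). */
    record Cursor(String name, long readPosition) {}

    /**
     * Groups cursors so that the spread of read positions inside a group never
     * exceeds maxSpread (e.g. the number of entries the broker cache can hold).
     * Cursors in the same group could then share a single rate-limited read stream.
     */
    static List<List<Cursor>> groupByCachedRange(List<Cursor> cursors, long maxSpread) {
        List<Cursor> sorted = new ArrayList<>(cursors);
        sorted.sort(Comparator.comparingLong(Cursor::readPosition));

        List<List<Cursor>> groups = new ArrayList<>();
        List<Cursor> current = new ArrayList<>();
        long groupStart = -1;
        for (Cursor c : sorted) {
            if (current.isEmpty() || c.readPosition() - groupStart <= maxSpread) {
                if (current.isEmpty()) {
                    groupStart = c.readPosition();
                }
                current.add(c);
            } else {
                // Spread exceeds the cached range: close the group and start a new one.
                groups.add(current);
                current = new ArrayList<>(List.of(c));
                groupStart = c.readPosition();
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Cursor> cursors = List.of(
                new Cursor("sub-a", 100), new Cursor("sub-b", 140),
                new Cursor("sub-c", 5000), new Cursor("sub-d", 5200));
        // With a cached range of 1000 entries, sub-a/sub-b and sub-c/sub-d
        // form two groups, each of which could be served from cached entries.
        groupByCachedRange(cursors, 1000).forEach(System.out::println);
    }
}

The point of the sketch is only the grouping step: pacing the fastest cursors in a group (rather than each consumer individually) keeps the whole group inside the cached range, which is where the cache-hit improvement would come from.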