Bumping this thread; I've also added a couple of concrete notes below the quoted message (the current cache tuning settings and a rough sketch of the consumer-grouping idea). Looking forward to some feedback. Thanks! -Lari
On 2024/03/18 10:32:03 Lari Hotari wrote:
> Hi all,
>
> I'd like to start a discussion about improving the Pulsar broker cache design.
>
> In the Pulsar broker, there are two main scenarios for dispatching messages to consumers: tailing (hot) reads and catch-up (cold) reads. In both scenarios, consumers can be fast or slow, which further shapes the load. The Pulsar broker contains a cache for handling tailing reads. This cache was extended to handle catch-up reads with PR #12258 (plus follow-up PRs) in Pulsar 2.11. This cache is referred to as the "broker cache" or, more specifically, the "managed ledger cache".
>
> The issue "Slow backlog draining with high fanout topics" (https://github.com/apache/pulsar/issues/12257) describes the scenario that motivated extending the cache to catch-up reads (PR #12258, in Pulsar 2.11):
>
> "When there are topics with a high publish rate and high fan-out, with a large number of subscriptions or a large number of replicators, and a backlog is built, it becomes really difficult to drain the backlog for such topics while they are serving a high publish rate. This problem can be reproduced with multiple topics (100), each with 10K writes, 6 backlog replicators, and multiple subscriptions. It becomes impossible to drain backlogs even if most of the cursors are draining a similar backlog because a large number of cold-reads from the bookie makes the overall backlog draining slower. Even the bookkeeper cache is not able to keep up by caching entries and eventually has to do cold-reads."
>
> The problem described above can also occur in other cases. Under heavy load, high fan-out catch-up reads can increase the load on the system and cause a cascading failure, resulting in partial outages and backlogs that take hours to recover from. In many cases, this could be avoided with a better broker cache implementation.
>
> Optimizations in the broker cache are crucial for improving Pulsar performance. Unlike Kafka, the Pulsar broker doesn't have access to local files and cannot leverage the Linux page cache for efficient caching. The cache implementation in the Pulsar broker should therefore be intelligent enough to avoid unnecessary load on the system. However, there have been very few improvements to, and little focus on, the Pulsar broker cache.
>
> For most production configurations, understanding how to tune the broker cache is necessary. However, we don't have much documentation about this in the Apache Pulsar project. I noticed this in 2022 when I was preparing my talk "Performance tuning for Apache Pulsar Kubernetes deployments" for ApacheCon 2022 (slides [1], talk [2]). The situation in the project documentation hasn't improved since then. There are better performance-related tutorials than my talk available, for example, "Apache Pulsar Troubleshooting Performance Issues" by Hang Chen [3] and "Apache Pulsar Troubleshooting Backlog Issues" by Penghui Li [4]. It would be great if this type of content were summarized in the Apache Pulsar project documentation, because the relevant information can be hard to find when it is needed.
>
> The main intention of this email is not to improve documentation, though. It is to highlight that the current implementation could be significantly improved. The broker cache should be designed so that it minimizes cache misses and maximizes cache hits within the available broker cache memory.
> I believe this can be achieved by developing an algorithm that incorporates an optimization model for making optimal caching decisions.
>
> In addition to a better caching model and algorithm, there are opportunities to leverage rate limiting to improve caching for both tailing reads and catch-up reads. For instance, instead of rate limiting individual consumers, consumers could be dynamically grouped so that the speed at which the group as a whole moves forward is rate limited; this maximizes cache hits and prevents speed differences between consumers from pushing slower consumers out of the cached range. When consumers are catching up to the tail, it doesn't always make sense to impose a rate limit on an individual consumer if the entries to read are already cached, since rate limiting would only grow the cache and increase the chance that the consumer falls out of the cached range.
>
> Would anyone else be interested in contributing to the design and improvement of the Pulsar broker cache? It would also be useful to hear about real user experiences with the current problems in high fan-out scenarios in Pulsar.
>
> - Lari
>
> 1 - https://www.apachecon.com/acna2022/slides/03_Hotari_Lari_Performance_tuning_Pulsar.pdf
> 2 - https://www.youtube.com/watch?v=WkdfILAx-4c
> 3 - https://www.youtube.com/watch?v=8_4bVctj2_E
> 4 - https://www.youtube.com/watch?v=17jQIOVeu4s
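First, to make the tuning point in the quoted message a bit more concrete: these are the broker.conf settings that currently control the managed ledger cache. The names are taken from recent Pulsar releases and the one-line descriptions are only a summary; defaults and exact semantics vary between versions, so please double-check against the broker.conf shipped with your release.

  managedLedgerCacheSizeMB - total cache size (defaults to a fraction of the available direct memory)
  managedLedgerCacheEvictionWatermark - level the cache is trimmed down to when eviction triggers
  managedLedgerCacheEvictionTimeThresholdMillis - entries cached for longer than this are evicted
  cacheEvictionByMarkDeletedPosition - evict based on the mark-delete position instead of the slowest reader's position
  managedLedgerCacheCopyEntries - copy entry payloads into the cache instead of retaining the original buffers
  managedLedgerMinimumBacklogCursorsForCaching,
  managedLedgerMinimumBacklogEntriesForCaching,
  managedLedgerMaxBacklogBetweenCursorsForCaching - thresholds for the catch-up read caching added by PR #12258

This is roughly the set of knobs that project documentation on cache tuning would need to cover.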
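Second, a rough sketch of the consumer-grouping idea mentioned above. This is only an illustration, not Pulsar code: the Cursor record, the groupByCachedRange method, and the maxSpread parameter are hypothetical names. A real implementation would operate on managed ledger cursors and the actual cached entry range, and each resulting group would share one dispatch rate limit so that the group's slowest member stays inside the cache.

import java.util.*;

public class CachedRangeGrouping {

    /** Minimal stand-in for a subscription cursor's next read position (entry id). */
    record Cursor(String name, long readPosition) {}

    /**
     * Groups cursors so that the spread of read positions inside a group never
     * exceeds maxSpread (e.g. the number of entries the broker cache can hold).
     * Cursors in the same group could then share a single rate-limited read stream.
     */
    static List<List<Cursor>> groupByCachedRange(List<Cursor> cursors, long maxSpread) {
        List<Cursor> sorted = new ArrayList<>(cursors);
        sorted.sort(Comparator.comparingLong(Cursor::readPosition));

        List<List<Cursor>> groups = new ArrayList<>();
        List<Cursor> current = new ArrayList<>();
        long groupStart = -1;
        for (Cursor c : sorted) {
            if (current.isEmpty() || c.readPosition() - groupStart <= maxSpread) {
                if (current.isEmpty()) {
                    groupStart = c.readPosition();
                }
                current.add(c);
            } else {
                // Spread exceeds the cached range: close the group and start a new one.
                groups.add(current);
                current = new ArrayList<>(List.of(c));
                groupStart = c.readPosition();
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Cursor> cursors = List.of(
                new Cursor("sub-a", 100), new Cursor("sub-b", 140),
                new Cursor("sub-c", 5000), new Cursor("sub-d", 5200));
        // With a cached range of 1000 entries, sub-a/sub-b and sub-c/sub-d
        // form two groups, each of which could be served from cached entries.
        groupByCachedRange(cursors, 1000).forEach(System.out::println);
    }
}

The point of the sketch is only the grouping step: pacing the fastest cursors in a group (rather than each consumer individually) keeps the whole group inside the cached range, which is where the cache-hit improvement would come from.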