Dear Pulsar Community,

I'd like to propose PIP-430, which addresses performance and efficiency issues in the Pulsar broker's entry cache eviction mechanisms and introduces a more efficient caching strategy.

The current broker entry cache implementation has several production-impacting issues. The size-based eviction doesn't guarantee removal of globally oldest entries, leading to suboptimal cache utilization. More critically, the timestamp-based eviction iterates through all ManagedLedgers every 10ms by default, causing high CPU utilization in brokers with many topics. Mixed read patterns like tailing, catch-up, and Key_Shared replays break eviction assumptions, resulting in unnecessary BookKeeper and S3 reads that increase operational costs.

PIP-430 introduces two main improvements.

First, a centralized eviction mechanism using a global RangeCacheRemovalQueue that tracks all cached entries in insertion order. This replaces the expensive per-ledger iteration with a single periodic task and ensures true oldest-first eviction globally. The implementation PR for this part is https://github.com/apache/pulsar/pull/24363.

Second, a new "expected read count" cache strategy where entries track how many active cursors are anticipated to read them. This allows the cache to intelligently retain entries that have higher utility, especially in high fan-out catch-up read scenarios and Key_Shared subscriptions.

The benefits include reduced CPU overhead, improved cache hit rates through better eviction decisions, and proper handling of diverse read patterns. The new strategy is configurable via cacheEvictionByExpectedReadCount (default: true) and maintains full backward compatibility with no client-facing API changes.

This addresses long-standing performance issues that particularly affect production deployments with high topic counts or diverse consumption patterns. The refactored architecture also provides a solid foundation for future cache optimizations.

The full proposal can be found at: https://github.com/apache/pulsar/pull/24444
Rendered PIP document: https://github.com/lhotari/pulsar/blob/lh-pip-430/pip/pip-430.md

I welcome your feedback and discussion on this proposal.
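
To make the two ideas easier to discuss, here is a rough Java sketch of a single insertion-ordered removal queue combined with a per-entry expected read count. It is only an illustration and not the code from the PRs: the names EvictionQueueSketch, Entry, evictOlderThan and evictToSize are hypothetical, and the rule that size-based eviction skips entries with pending expected reads while age-based eviction ignores the count is an assumption of this sketch rather than something the proposal is quoted as specifying.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical illustration only; not the classes used in the PIP-430 pull requests. */
final class EvictionQueueSketch {

    /** One cached entry. All field names are made up for this sketch. */
    static final class Entry {
        final String topic;
        final long sizeBytes;
        final long insertedAtMillis;
        int expectedReadCount; // cursors still anticipated to read this entry

        Entry(String topic, long sizeBytes, long insertedAtMillis, int expectedReadCount) {
            this.topic = topic;
            this.sizeBytes = sizeBytes;
            this.insertedAtMillis = insertedAtMillis;
            this.expectedReadCount = expectedReadCount;
        }
    }

    // One queue for the whole broker, in insertion order across all topics.
    private final Deque<Entry> queue = new ArrayDeque<>();
    private long totalBytes;

    void add(Entry e) {
        queue.addLast(e);
        totalBytes += e.sizeBytes;
    }

    long totalBytes() {
        return totalBytes;
    }

    int size() {
        return queue.size();
    }

    /**
     * Age-based eviction: pop from the head until the first entry that is new enough.
     * Because the queue is insertion ordered, no per-ledger iteration is needed.
     */
    int evictOlderThan(long cutoffMillis) {
        int evicted = 0;
        while (!queue.isEmpty() && queue.peekFirst().insertedAtMillis < cutoffMillis) {
            totalBytes -= queue.pollFirst().sizeBytes;
            evicted++;
        }
        return evicted;
    }

    /**
     * Size-based eviction: walk oldest-first and remove entries until the target is met.
     * Assumption of this sketch: entries with pending expected reads are skipped.
     */
    int evictToSize(long targetBytes) {
        int evicted = 0;
        Iterator<Entry> it = queue.iterator();
        while (totalBytes > targetBytes && it.hasNext()) {
            Entry e = it.next();
            if (e.expectedReadCount > 0) {
                continue; // still useful to some cursor, keep it for now
            }
            it.remove();
            totalBytes -= e.sizeBytes;
            evicted++;
        }
        return evicted;
    }
}
```

In this sketch, adding three 100-byte entries where only the oldest has a positive expected read count and then calling evictToSize(200) removes the second entry rather than the oldest one. A later evictOlderThan call with a cutoff past the oldest entry's timestamp still removes it, whatever its count.
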
Please share your thoughts, concerns, or suggestions.

-Lari