good work, lgtm

On Tue, Jun 24, 2025 at 02:44, Lari Hotari <lhot...@apache.org> wrote:

> Dear Pulsar Community,
>
> I'd like to propose PIP-430, which addresses performance and
> efficiency issues in Pulsar broker's entry cache eviction mechanisms
> and introduces a more efficient caching strategy.
>
> The current broker entry cache implementation has several
> production-impacting issues. The size-based eviction doesn't guarantee
> removal of globally oldest entries, leading to suboptimal cache
> utilization. More critically, the timestamp-based eviction iterates
> through all ManagedLedgers every 10ms by default, causing high CPU
> utilization in brokers with many topics. Mixed read patterns like
> tailing, catch-up, and Key_Shared replays break eviction assumptions,
> resulting in unnecessary BookKeeper and S3 reads that increase
> operational costs.
>
> PIP-430 introduces two main improvements.
> First, a centralized eviction mechanism using a global
> RangeCacheRemovalQueue that tracks all cached entries in insertion
> order. This replaces the expensive per-ledger iteration with a single
> periodic task and ensures true oldest-first eviction globally. The
> implementation PR for this part is
> https://github.com/apache/pulsar/pull/24363.
> Second, a new "expected read count" cache strategy where entries track
> how many active cursors are anticipated to read them. This allows the
> cache to intelligently retain entries that have higher utility,
> especially in high fan-out catch-up read scenarios and Key_Shared
> subscriptions.
>
> The benefits include reduced CPU overhead, improved cache hit rates
> through better eviction decisions, and proper handling of diverse read
> patterns. The new strategy is configurable via
> cacheEvictionByExpectedReadCount (default: true) and maintains full
> backward compatibility with no client-facing API changes.
>
> This addresses long-standing performance issues that particularly
> affect production deployments with high topic counts or diverse
> consumption patterns. The refactored architecture also provides a
> solid foundation for future cache optimizations.
>
> The full proposal can be found at:
> https://github.com/apache/pulsar/pull/24444
> Rendered PIP document:
> https://github.com/lhotari/pulsar/blob/lh-pip-430/pip/pip-430.md
>
> I welcome your feedback and discussion on this proposal. Please share
> your thoughts, concerns, or suggestions.
>
> -Lari
>
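
For readers skimming the thread, here is a rough sketch of the centralized eviction idea described in the proposal: one global queue holding cached entries in insertion order, so both timestamp-based and size-based eviction only ever look at the head of the queue instead of iterating over every ManagedLedger. This is not the code from the PR. The class name GlobalEvictionSketch, the CachedEntry fields, and the method names are all hypothetical, and it assumes entries are added with non-decreasing timestamps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a global, insertion-ordered removal queue.
// None of these names come from the Pulsar code base.
class GlobalEvictionSketch {

    // Minimal stand-in for a cached entry (illustrative fields only).
    static final class CachedEntry {
        final String ledger;
        final long entryId;
        final long sizeBytes;
        final long insertedAtMillis;

        CachedEntry(String ledger, long entryId, long sizeBytes, long insertedAtMillis) {
            this.ledger = ledger;
            this.entryId = entryId;
            this.sizeBytes = sizeBytes;
            this.insertedAtMillis = insertedAtMillis;
        }
    }

    // Entries from all ledgers share one queue; the head is the oldest entry.
    private final ArrayDeque<CachedEntry> queue = new ArrayDeque<>();
    private long totalBytes;

    void add(CachedEntry entry) {
        queue.addLast(entry);
        totalBytes += entry.sizeBytes;
    }

    long cachedBytes() {
        return totalBytes;
    }

    // Timestamp-based eviction: pop from the head until the head is new enough.
    List<CachedEntry> evictOlderThan(long cutoffMillis) {
        List<CachedEntry> evicted = new ArrayList<>();
        while (!queue.isEmpty() && queue.peekFirst().insertedAtMillis < cutoffMillis) {
            evicted.add(removeHead());
        }
        return evicted;
    }

    // Size-based eviction: pop the globally oldest entries until under the limit.
    List<CachedEntry> evictToSize(long maxBytes) {
        List<CachedEntry> evicted = new ArrayList<>();
        while (totalBytes > maxBytes && !queue.isEmpty()) {
            evicted.add(removeHead());
        }
        return evicted;
    }

    private CachedEntry removeHead() {
        CachedEntry entry = queue.pollFirst();
        totalBytes -= entry.sizeBytes;
        return entry;
    }
}
```

With this shape, a single periodic task could call evictOlderThan and evictToSize, and the cost of each run would depend on how many entries are actually evicted rather than on the number of topics. The "expected read count" strategy is left out of the sketch on purpose; see the PIP document for how it is actually defined.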
