Here's the early draft implementation and proof-of-concept of PIP-379: https://github.com/apache/pulsar/pull/23352
Thanks for all the positive feedback about PIP-379. It has been very helpful in improving the initial idea. However, I haven't yet had time to address the review comments. I'll update the PIP document based on the feedback and this proof-of-concept implementation as soon as possible.

The proof-of-concept is intended to evolve into the actual implementation of PIP-379. So far I've been putting the different pieces in place without adding much test coverage for the new "units" that have been added.

There were multiple questions about the data structures for storing the "draining hashes." These are now clearly visible in the proof-of-concept. The "DrainingHashesTracker" embeds the data structure for tracking the sticky key hashes that are "draining" and block delivery:
https://github.com/lhotari/pulsar/blob/lh-pip-379-implementation/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/DrainingHashesTracker.java

The memory-optimized Int2ObjectOpenHashMap implementation is from the Apache licensed project fastutil, https://fastutil.di.unimi.it/. Memory usage has been one of the main reasons why hash-based tracking wasn't implemented before. This PoC shows that the PIP-379 approach works, and we could later run benchmarks to check that the memory calculations are also valid.

One detail in the PR is that a change was made to ConsistentHashingStickyKeyConsumerSelector to reduce the hash space from 0..Integer.MAX_VALUE-1 to 0..2^15-1. This reduction will also reduce memory consumption without negatively impacting the behavior.

The PR also includes changes from another PR, a bug fix to ConsistentHashingStickyKeyConsumerSelector: https://github.com/apache/pulsar/pull/23327. Once 23327 is merged, those changes won't show up in the diff anymore.

In this PoC, I have completely removed PIP-282 before adding the PoC implementation for PIP-379.
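
Going back to the draining hashes data structure: the bookkeeping involved can be illustrated with a minimal sketch. This is not the code from the PR. The class and method names are hypothetical, consumers are identified by plain strings, and java.util.HashMap stands in for fastutil's Int2ObjectOpenHashMap so that the example has no dependencies. It assumes that a hash keeps "draining" until every message that was in flight to the previous consumer has been acknowledged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "draining hashes" tracking; not the class from the PR.
final class DrainingHashesSketch {

    // State kept per draining hash: the consumer that still holds
    // unacknowledged messages, and how many of them are pending.
    static final class DrainingEntry {
        final String consumer;
        int pendingAcks;

        DrainingEntry(String consumer) {
            this.consumer = consumer;
        }
    }

    // Stand-in for a primitive-keyed map such as Int2ObjectOpenHashMap.
    private final Map<Integer, DrainingEntry> drainingHashes = new HashMap<>();

    // Record one in-flight message for a hash that has been reassigned
    // away from previousConsumer.
    void addPending(String previousConsumer, int stickyKeyHash) {
        DrainingEntry entry = drainingHashes.computeIfAbsent(
                stickyKeyHash, h -> new DrainingEntry(previousConsumer));
        entry.pendingAcks++;
    }

    // Record an acknowledgement; the hash stops draining once nothing is pending.
    void acknowledged(int stickyKeyHash) {
        DrainingEntry entry = drainingHashes.get(stickyKeyHash);
        if (entry != null && --entry.pendingAcks == 0) {
            drainingHashes.remove(stickyKeyHash);
        }
    }

    // Delivery is blocked for every consumer except the one being drained.
    boolean shouldBlock(String consumer, int stickyKeyHash) {
        DrainingEntry entry = drainingHashes.get(stickyKeyHash);
        return entry != null && !entry.consumer.equals(consumer);
    }

    int size() {
        return drainingHashes.size();
    }

    public static void main(String[] args) {
        DrainingHashesSketch tracker = new DrainingHashesSketch();
        tracker.addPending("consumer-1", 42);
        System.out.println(tracker.shouldBlock("consumer-2", 42));
        tracker.acknowledged(42);
        System.out.println(tracker.shouldBlock("consumer-2", 42));
    }
}
```

In this sketch the map only holds entries for hashes that are currently draining, so its size is bounded by the number of hashes with in-flight messages at the time of a reassignment rather than by the size of the hash space.
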
In the final version, I'm planning to add a configuration setting to be able to use the recently joined consumers logic before PIP-282 so that users upgrading from 3.x could fall back to it if there are issues with the PIP-379 implementation. I don't see any reason why that would be the case, but it's better to be safe than sorry when working with Key_Shared implementation. :)

-Lari

On 2024/09/14 14:19:59 Lari Hotari wrote:
> Dear Pulsar Community,
>
> I'd like to propose a new improvement for Pulsar's Key_Shared
> subscription mode, outlined in PIP-379. This proposal aims to address
> several issues with the current implementation and introduce a more
> efficient mechanism for managing message ordering.
>
> Problem:
> The current Key_Shared implementation faces challenges including:
> 1. Complex management of "recently joined consumers"
> 2. Incomplete fulfillment of ordering guarantees
> 3. Unnecessary message blocking
> 4. Poor observability
>
> PIP-379 introduces a "draining hashes" concept to efficiently manage
> message ordering by tracking affected hashes when consumer assignments
> change. The high-level solution is drafted in the PIP document.
>
> Benefits:
> 1. Improved message ordering guarantees
> 2. Reduced unnecessary message blocking
> 3. Better scalability and performance
> 4. Enhanced observability
>
> This proposal would replace the existing "recently joined consumers"
> mechanism, addressing its limitations while providing a more robust
> solution.
>
> The full proposal can be found at: https://github.com/apache/pulsar/pull/23309
> The direct link to the rendered version of the markdown file is:
> https://github.com/lhotari/pulsar/blob/lh-pip-379/pip/pip-379.md
>
> I welcome your feedback and discussion on this proposal. Please share
> your thoughts, concerns, or suggestions.
>
> -Lari