Thank you for your thoughtful response and for raising these important questions, Rajan. I appreciate your concerns about rushing PRs and the potential for regression bugs. Let me address your points and explain why this particular PR is crucial for the upcoming release.
1. Urgency of the PR: The current issues in Key_Shared subscriptions could be considered serious problems themselves. PIP-379 aims to address these long-standing issues that have been causing significant problems for users. The PIP document clearly outlines the reasons why this implementation is needed, and it has been approved through our community process. 2. Inclusion in the ongoing release: While I understand your concern about potential regressions, it's important to note that the current behavior is already problematic. PIP-379 offers a solution to these issues, potentially improving stability and reliability for many users. 3. Addressing regression concerns: To mitigate the risk of regressions, we're implementing a feature toggle that will allow users to switch back to the current behavior if necessary. This provides a safety net for users who might encounter issues with the new implementation. This will be handled in a separate PR. The PR review comment [1] addresses this. 4. Review process: I kindly request that you review the implementation PR itself. This will allow us to make progress on this matter constructively, focusing on the technical aspects and potential improvements rather than getting sidetracked into unproductive discussions. 5. Long-term benefits: While there's always a risk with significant changes, the potential benefits of PIP-379 in terms of improved message ordering and system reliability outweigh the short-term risks, especially given the safeguards we're putting in place. I understand and share your concern about releasing stable versions. However, I believe that addressing these critical issues now will lead to a more stable and reliable Pulsar in the long run. The community has already approved this PIP, indicating its importance and potential impact. Let's focus our efforts on reviewing the implementation, identifying any potential issues, and working together to ensure that this change improves Pulsar for all users. Your expertise and input in this review process would be invaluable. I look forward to reviews on https://github.com/apache/pulsar/pull/23352 by EOD today. Thanks! -Lari On 2024/10/05 00:05:13 Rajan Dhabalia wrote: > Hi Lari, > > Thank you for your work but I have a simple question about why we should > rush for specific PR and why it's necessary to keep a particular PR part of > ongoing release. I know multiple PRs that tried to address critical bugs on > this similar issue and they were again blocked without solid reasons and I > still have a few concerns about this approach but I don't want to block > this proposal as this is addressing a long list of issues that can be > useful for key-shared sub, But if we rush and release a new Pulsar release > soon with these changes then there is a high chance of regression bugs in > the release. > > I know we should actively work to address such critical issues because I > know many other critical PRs that are solving fundamental and performance > problems are just sitting for years because they were blocked due to > unknown/personal reasons by reviewers, But I also wish to have this > critical PRs in the master branch for testing before releasing a new > Pulsar-Release because I am not sure if it's worth to release in rush at > the cost of unstable release. > > Thanks, > Rajan > > On Fri, Oct 4, 2024 at 3:21 PM Lari Hotari <lhot...@apache.org> wrote: > > > This is a friendly reminder about the urgent need for review of PR #23352, > > the first part of implementing "PIP-379: Key_Shared Draining Hashes for > > Improved Message Ordering". > > > > PR: https://github.com/apache/pulsar/pull/23352 > > Target: 4.0.0-preview.2 release, scheduled for Monday Oct 7th. > > > > Your prompt review would be greatly appreciated, ideally by end of day > > today. This will allow time for addressing feedback over the weekend. > > Thank you for your attention to this matter. > > > > -Lari > > > > On 2024/10/02 21:43:53 Lari Hotari wrote: > > > Dear Pulsar Community, > > > > > > I am excited to announce that the first PR implementing "PIP-379: > > > Key_Shared Draining Hashes for Improved Message Ordering" is now ready > > > for review. This PR contains the core implementation as described in > > > the PIP-379 document. PIP-379 is currently in voting. > > > > > > Since there is a goal to include this implementation in the > > > 4.0.0-preview.2 release of 4.0.0, there is urgency to get the review > > > started so that we can complete it by EOD on Monday, October 7th. I > > > hope to receive review feedback as soon as possible so that I will > > > have time to address the review comments. > > > > > > PR for review: https://github.com/apache/pulsar/pull/23352 > > > Please read the PIP-379 document to get the context. The link is in > > > the PR description. > > > > > > Some notable details: > > > > > > 1. The implementation focuses on maintaining the key invariant for > > > Key_Shared subscriptions: "Messages with the same key are delivered > > > and allowed to be in an unacknowledged state to only one consumer at a > > > time." > > > > > > 2. More tests will be added to cover the new logic in classes such as > > > DrainingHashesTracker, HashRanges (utility class), and PendingAcksMap. > > > These tests will be added to the current PR asap. > > > > > > 3. Observability features will be implemented in a subsequent PR after > > > this one is completed. > > > > > > 4. The vote for PIP-379 is currently open at: > > > https://lists.apache.org/thread/z1kgo34qfkkvdnn3l007bdvjr3qqf4rw > > > > > > Important implementation notes: > > > > > > - While working on this solution, I discovered a potential memory leak > > > in the current pending ack implementations. I do not think that the > > > current implementation in Pulsar contains a solution for removing > > > pending acks if the mark delete position moves forward as a result of > > > cursor reset (or any other reason where it moves forward without > > > individual acks). The PIP-379 implementation addresses this issue in > > > the PendingAcksMap.removeAllUpTo method, which gets called from the > > > PersistentDispatcherMultipleConsumers.readMoreEntries method. > > > > > > - Significant improvements have been made to ensure the consistency of > > > pending acks, which is crucial for maintaining the Key_Shared > > > subscription invariant. > > > > > > I kindly request the community to review the first implementation PR: > > > https://github.com/apache/pulsar/pull/23352. Your feedback and > > > suggestions are crucial for ensuring the quality and correctness of > > > this important improvement to Pulsar's Key_Shared subscriptions. > > > > > > Thank you for your time and feedback, > > > > > > Lari > > > > > >