Re: [DISCUSS] PIP-381: Handle large PositionInfo state

Andrey Yegorov Tue, 24 Sep 2024 17:28:20 -0700

Penghui,

Thank you for the thorough and detailed response.


I have added the feature toggle per Lari's comment on the implementation PR.

Regarding compatibility: if a user somehow has 10MB of cursor data (in
our testing, a large serialized position info resulted in a single entry
larger than 1MB, which BK rejected), then upgrades and rolls back, the data
will be read the same way as before, because the footer with the
chunking info will not be present. This is described in the read path.
If the user upgrades, enables the feature, and creates a cursor
with chunked PositionInfo, the older version won't be able to read the data
after rollback. This is why the feature toggle was added.
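
To illustrate the two rollback cases, here is a minimal Java sketch. It is
not the PIP-381 implementation: the footer layout (a 4-byte marker plus a
chunk count), the class and method names, and the chunk size are all
assumptions made for the example. It only shows the idea that a reader
falls back to the old single-entry path when no footer is present, and that
a reader unaware of the footer cannot recover chunked data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedReadSketch {
    // Hypothetical footer marker; the real footer format is defined by the PIP, not here.
    static final byte[] MAGIC = "CHNK".getBytes(StandardCharsets.US_ASCII);

    /** Toggle off: one entry, as before. Toggle on: chunks followed by a footer entry. */
    static List<byte[]> write(byte[] state, int maxChunk, boolean chunkingEnabled) {
        List<byte[]> entries = new ArrayList<>();
        if (!chunkingEnabled) {
            entries.add(state);
            return entries;
        }
        for (int off = 0; off < state.length; off += maxChunk) {
            entries.add(Arrays.copyOfRange(state, off, Math.min(state.length, off + maxChunk)));
        }
        int chunks = entries.size();
        entries.add(ByteBuffer.allocate(MAGIC.length + 4).put(MAGIC).putInt(chunks).array());
        return entries;
    }

    /** New reader: no footer means legacy data, so return the last entry unchanged. */
    static byte[] read(List<byte[]> entries) {
        byte[] last = entries.get(entries.size() - 1);
        if (!isFooter(last)) {
            return last;
        }
        int chunks = ByteBuffer.wrap(last, MAGIC.length, 4).getInt();
        int first = entries.size() - 1 - chunks;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = first; i < entries.size() - 1; i++) {
            out.write(entries.get(i), 0, entries.get(i).length);
        }
        return out.toByteArray();
    }

    /** Old reader (after rollback): knows nothing about footers, takes the last entry as-is. */
    static byte[] readOldVersion(List<byte[]> entries) {
        return entries.get(entries.size() - 1);
    }

    static boolean isFooter(byte[] entry) {
        return entry.length == MAGIC.length + 4
                && Arrays.equals(Arrays.copyOf(entry, MAGIC.length), MAGIC);
    }
}
```

In this sketch, data written with the toggle off is read identically by
read and readOldVersion, while data written with the toggle on is only
recoverable through read.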

I agree that the vast majority of users won't have to deal
with managedLedgerMaxUnackedRangesToPersist in the range of tens of
millions and above, but there are edge cases where this is needed.


On Tue, Sep 24, 2024 at 3:17 PM PengHui Li <peng...@apache.org> wrote:

> Thanks for driving the proposal.
>
> I would like to share the related context that happened many years ago
>
> - https://lists.apache.org/thread/y0r9kk0968ydpxtf16x6ql3x6kwy7dc1
> - https://lists.apache.org/thread/hfv18cg0yckt5cqd0fc66rp7tth036kf
>
> We have two major approaches:
>
> 1. Minimize the persistent size of cursor data:
> • Example: PR:9292 and cursor data compression, possibly with a compressed
> bitset implementation (RoaringBitmap).
>
> 2. Split the ack cursor data into multiple chunks:
> • Example: PIP-81, PIP-381.
>
> LinLin and I previously worked on PIP-81. Personally, I am not a big fan of
> this solution.
> While working on PIP-81 and cursor data compression, we found that
> compression works well in most cases,
> even when there are millions or tens of millions of ack ranges. I recall we
> shared data on this before, though I can’t seem to find it now.
>
> From a user perspective, most users are satisfied with the current
> solution, and only a few need compression enabled.
> The simplicity of the solution is vital for community users, which was the
> main reason we gave up on PIP-81 earlier.
> Pulsar is already complex, so having a pluggable solution for the long term
> would be more beneficial.
> This way, most users get a clear, simple version, while others needing
> enhanced solutions can create their plugins, managing the complexity
> themselves.
>
> I’m not going to block this proposal, but a few points need clarification:
>
> • Feature Toggle: Add a flag that allows users to enable this feature
> (keeping it disabled by default until there is higher demand).
> Managed ledger and cursor complexities are well-known, so a smooth opt-in
> process is crucial for users to adopt new features gradually.
>
> • Compatibility Concerns: Since the persistent data structure will change,
> we need to address rollback scenarios.
> For instance, if a user has 10MB of cursor data, upgrades to a new version
> with the PIP changes, and then needs to roll back to the older version,
> will that user lose their 10MB cursor data? What steps are required for a
> rollback to ensure data consistency?
>
> Regards,
> Penghui
>
> On Tue, Sep 24, 2024 at 1:42 AM Lari Hotari <lhot...@apache.org> wrote:
>
> > On Tue, 24 Sept 2024 at 05:01, Rajan Dhabalia <rdhaba...@apache.org>
> > wrote:
> > > However, there are multiple other PRs related to key-shared sub, stats,
> > > cursor performance, and other PRs are still blocked by others and people
> > > just block it because they think they don't have this usecase. It's so
> > > unfortunate that people easily merge implementations which only handle
> > > small-scale usecases but the usecases for which Pulsar was built are
> > > being blocked or take a long time to merge. It's just that I don't have
> > > that energy to keep following up for useful and important changes for
> > > Pulsar. And this is one of these examples as well. I have also started
> > > discussion about improving the PIP process because it has become painful
> > > in many cases.
> >
> > It's not that individuals want to block changes for no reason. It
> > seems that the main reason for blocking changes is the fear of
> > regressions. Some areas of the Pulsar codebase aren't well covered in
> > our test suites. For example, we don't have performance tests as part
> > of the Apache Pulsar repositories. We have a lot of tests, but most of
> > them are written in a way that tests the code as the author expects it
> > to work. There are very few tests that evaluate features from the
> > end-user API perspective or as system tests.
> >
> > Writing new tests is slow, and the developer experience is poor with
> > the current test infrastructure. Adding more tests to the main build
> > would slow down Pulsar CI even more. This isn't a new problem; it's
> > been around for many years. I'd love to see more proposals and active
> > contributions to improve the "safety nets" of Apache Pulsar so that we
> > wouldn't fear change. I'm not saying that this is only a testing
> > problem. Testability impacts architecture too. Balancing all different
> > aspects of the system isn't easy, and it requires effort and
> > dedication. We don't currently have enough contributors who are
> > investing their time in enabling others to contribute effectively. I
> > hope that we can improve together and address the problems we have
> > that cause the fear of change. When that is addressed, there would be
> > more confidence in accepting new PIPs and changes even when the
> > reviewer doesn't have the use case or when they aren't familiar with
> > the problem that the PIP is targeting to solve.
> >
> > -Lari
> >
>


-- 
Andrey Yegorov
