Re: [DISCUSS] PIP-381: Handle large PositionInfo state

Heesung Sohn Wed, 02 Oct 2024 23:10:36 -0700

> we might be able to solve storing > 10M-100M unack
> messages but the question is if a broker really has that many unack
> messages then will broker run with such huge memory pressure and will it
> really serve large scale usecases for which Pulsar was built? I am sure, it
> might be useful for small usecases and clients can use it if needed but it
> might not be useful for most of the usecases.


Could we use a shared AckedMessageRanges LRU Cache across many topics?
It seems inefficient to pre-load all ranges.

Thanks,
Heesung
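
The shared-cache idea above could look roughly like the following. This is
only a minimal sketch: the class name, the String cache key, and the
entry-count bound are hypothetical and are not part of PIP-381 or of any
existing Pulsar API. It assumes the acked ranges of one cursor are held as a
long[] bitmap and loaded on demand rather than pre-loaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of one LRU cache shared by many topics/cursors.
// Names and the key format are assumptions made for illustration only.
final class SharedAckRangeCacheSketch {
    private final LinkedHashMap<String, long[]> cache;

    SharedAckRangeCacheSketch(final int maxEntries) {
        // accessOrder=true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, long[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, long[]> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > maxEntries;
            }
        };
    }

    // key is a hypothetical "topic/cursor/ledgerId" style identifier.
    synchronized void put(String key, long[] ackedBits) {
        cache.put(key, ackedBits);
    }

    // A hit also marks the entry as most recently used.
    synchronized long[] get(String key) {
        return cache.get(key);
    }

    // Does not change the LRU order.
    synchronized boolean contains(String key) {
        return cache.containsKey(key);
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With a bound of 2, putting entries for cursors a and b, reading a, and then
putting c evicts b, because b is then the least recently used entry. A real
cache would more likely be bounded by bytes than by entry count.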



On Wed, Oct 2, 2024 at 11:36 AM Rajan Dhabalia <rdhaba...@apache.org> wrote:
>
> >> Is this the correct understanding of how PR 9292 efficiently stores
> individual acks?
>
> Yes, that's correct. Each Position object is serialized as a single bit,
> which is probably the smallest serialized size we could achieve with any
> ser/des approach.
>
> In the past, there were two fundamental challenges we were facing to serve
> a large set of unack messages: (a) memory pressure and long GC pauses due
> to the large number of Position objects, and (b) serializing that many
> objects to store in a bookie ledger for topic recovery.
>
> (a) was handled by #3819, which replaced each Position object with a bit
> and so allows a broker to run with a large number of unack messages for a
> topic. But this also has a limit in large scale multi-tenant systems: when
> a broker is serving a large number of topics, holding several million
> unack messages per topic can create memory pressure on the broker.
> Therefore, even if we solve (b) and can store billions of unack messages
> for topic recovery, the broker might not run stably beyond several million
> unack messages.
> So, we don't have to solve (b) for more than 1M-10M unack messages,
> because keeping > 10M unack messages can impact broker stability in a
> large scale multi-tenant env. #9292 covers this acceptable range of unack
> messages, within which the broker can also run stably.
>
> Talking about PIP-381, we might be able to solve storing > 10M-100M unack
> messages, but the question is: if a broker really has that many unack
> messages, can it run under such huge memory pressure, and will it really
> serve the large scale usecases for which Pulsar was built? I am sure it
> might be useful for small usecases, and clients can use it if needed, but
> it might not be useful for most of the usecases.
>
> Thanks,
> Rajan
>
>
> On Tue, Oct 1, 2024 at 11:42 PM Lari Hotari <lhot...@apache.org> wrote:
>
> > On 2024/09/27 19:18:03 Rajan Dhabalia wrote:
> > > Well, again, PR#9292 already had an agreement to merge earlier and had
> > > been reviewed, but it was blocked for no reason. Recently it was
> > > acknowledged that PR#9292 fulfills the purpose of most of the usecases
> > > and that we should move forward with the approach. That was the reason I
> > > again spent time rebasing, as the rebasing effort was already mentioned
> > > in another thread, and we made it ready to make sure we can use this
> > > feature for most of the usecases. I don't see any concern or issue that
> > > prevents moving forward with PR#9292 now, unless anyone comes with a
> > > personal reason. PR#9292 is straightforward, and it would be great if
> > > reviewers could review it again so we can make this feature available
> > > soon.
> >
> > I'm late to reviewing PR 9292, and I reviewed it after it was merged.
> > The change in the merged PR 9292 is very useful if I understand it
> > correctly.
> > Thanks for the great work, Rajan.
> >
> > Since individual acknowledgments get encoded as a long[] array, it will
> > compress the information significantly. A single long entry in the array
> > will hold 64 bits and therefore 64 individual acknowledgments. In theory, 1
> > million (1024 * 1024) individual bits can be held in 128kB of memory
> > (1024kB / 8) since every bit will encode one acknowledgment.
> >
> > Is this the correct understanding of how PR 9292 efficiently stores
> > individual acks?
> >
> > -Lari
> >
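
The 128kB estimate in the quoted message can be checked with a short
calculation. The sketch below is only an illustration under stated
assumptions: it uses java.util.BitSet as a stand-in for a bit-per-entry
structure, and the class and method names are hypothetical. It is not the
code from PR 9292.

```java
import java.util.BitSet;

// Hypothetical illustration of one bit per individual acknowledgment.
final class AckBitmapSketch {
    // Bytes needed for one bit per entry, rounded up to whole 64-bit words.
    static long bytesForEntries(long entries) {
        long words = (entries + 63) / 64;
        return words * Long.BYTES;
    }

    // Marks the given entry offsets as acknowledged and returns the
    // long[] form, where each long carries 64 acknowledgments.
    static long[] encode(int... ackedOffsets) {
        BitSet bits = new BitSet();
        for (int offset : ackedOffsets) {
            bits.set(offset);
        }
        return bits.toLongArray();
    }

    public static void main(String[] args) {
        // 1024 * 1024 entries -> 16384 longs -> 131072 bytes (128 * 1024).
        System.out.println(bytesForEntries(1024L * 1024L));
        // Offsets 0, 1 and 63 share the first long; offset 64 starts the second.
        System.out.println(encode(0, 1, 63, 64).length);
    }
}
```

This covers only the raw bit storage; object headers and any per-ledger
bookkeeping around the array would add to the total.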
