+1

Haiting

On Tue, Jan 17, 2023 at 5:50 PM Yubiao Feng
<yubiao.f...@streamnative.io.invalid> wrote:
>
> Hi Asaf
>
> > It's worth noting that the estimated backlog size of a subscription is
> estimated since it doesn't consider any acknowledged messages between the
> mark delete position and the last message. It simply assumes all messages
> between the mark delete position and the last message have not been
> acknowledged.
>
> Yes, it's not the exact value of the backlog. There are two reasons for the
> loss of accuracy:
> - Whether the Entry size is closer to the `averageSize`.
> - The number of messages after the mark deleted position has been
> acknowledged.
>
> Thanks
> Yubiao Feng
>
> On Tue, Jan 17, 2023 at 3:31 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>
> > Small question regarding this:
> >
> > The code for calculation is:
> >
> > long estimateBacklogFromPosition(PositionImpl pos) {
> >     synchronized (this) {
> >         long sizeBeforePosLedger =
> > ledgers.headMap(pos.getLedgerId()).values()
> >                 .stream().mapToLong(LedgerInfo::getSize).sum();
> >         LedgerInfo ledgerInfo = ledgers.get(pos.getLedgerId());
> >         long sizeAfter = getTotalSize() - sizeBeforePosLedger;
> >         if (ledgerInfo == null) {
> >             return sizeAfter;
> >         } else if (pos.getLedgerId() == currentLedger.getId()) {
> >             return sizeAfter - consumedLedgerSize(currentLedgerSize,
> > currentLedgerEntries, pos.getEntryId());
> >         } else {
> >             return sizeAfter -
> > consumedLedgerSize(ledgerInfo.getSize(), ledgerInfo.getEntries(),
> > pos.getEntryId());
> >         }
> >     }
> > }
> >
> > and
> >
> > private long consumedLedgerSize(long ledgerSize, long ledgerEntries,
> > long consumedEntries) {
> >     if (ledgerEntries <= 0) {
> >         return 0;
> >     }
> >     if (ledgerEntries <= (consumedEntries + 1)) {
> >         return ledgerSize;
> >     } else {
> >         long averageSize = ledgerSize / ledgerEntries;
> >         return consumedEntries >= 0 ? (consumedEntries + 1) * averageSize
> > : 0;
> >     }
> > }
> >
> >
> >
> > It's worth noting that the estimated backlog size of a subscription is
> > estimated since it doesn't consider any acknowledged messages between the
> > mark delete position and the last message. It simply assumes all messages
> > between the mark delete position and the last message have not been
> > acknowledged.
> >
> > Good idea - +1
> >
> > On Tue, Jan 17, 2023 at 4:12 AM PengHui Li <codelipeng...@gmail.com>
> > wrote:
> >
> > > +1
> > >
> > > Penghui
> > >
> > > > On Jan 16, 2023, at 23:36, Yubiao Feng <yubiao.f...@streamnative.io
> > .INVALID>
> > > wrote:
> > > >
> > > > Hi community
> > > >
> > > > I am starting a DISCUSS for making the default value of the parameter
> > > > "--get-subscription-backlog-size" of admin API "topics stats" true.
> > > >
> > > > In the PR https://github.com/apache/pulsar/pull/9302, the property
> > > backlog
> > > > size of each subscription returned in the response of the API topics
> > > stats,
> > > > by default this property is always equal to 0 in response, and this
> > will
> > > > confuse users. Since the calculation of backlog size is done in broker
> > > > memory, there is no significant overhead(the process is described in
> > the
> > > > following section), so I think the correct values should be displayed
> > by
> > > > default.
> > > >
> > > > ### The following two APIs should be affected:
> > > >
> > > > In Pulsar admin API
> > > > ```
> > > > pulsar-admin topics stats persistent://test-tenant/ns1/tp1
> > > > --get-subscription-backlog-size
> > > > pulsar-admin topics stats persistent://test-tenant/ns1/tp1 -sbs
> > > > ```
> > > > the default value of parameter `--get-subscription-backlog-size` will
> > be
> > > > `true`
> > > >
> > > > In Pulsar Rest API
> > > > ```
> > > > curl GET "http://127.0.0.1:8080/test-tenant/ns1/tp1/stats
> > > > "?subscriptionBacklogSize=true
> > > > ```
> > > > the default value of parameter `subscriptionBacklogSize ` will be
> > `true`
> > > >
> > > >
> > > > ### The following is the process of calculating backlog size:
> > > > - Divide `PersistentTopc.ledgers` into two parts according to the
> > > ledgerId
> > > > of the mark delete position of the cursor. The second part is ledgers
> > > > indicating the messages still need to be consumed, aka
> > > backlogSizeInLedgers.
> > > > - Find the LedgerInfo whose ledgerId is the same as the ledgerId of the
> > > > mark delete position of the cursor, and we can also divide the ledger
> > > into
> > > > two parts, the second part is entries indicating the messages still
> > need
> > > to
> > > > be consumed, multiply the average size of each entry in metrics by the
> > > > number of still need to be consumed entries we can get the backlog size
> > > in
> > > > this ledger. aka backlogSizeInEntries.
> > > > - `backlogSizeInLe dgers` + `backlogSizeInEntries`
> > > >
> > > > Thanks
> > > > Yubiao Feng
> > >
> > >
> >

Reply via email to