+1

Regards
Jiwei Guo (Tboy)

On Sat, Jan 28, 2023 at 1:28 AM Yubiao Feng
<yubiao.f...@streamnative.io.invalid> wrote:
>
> Hi Michael Marshall
>
> > My primary concern with making this the new default is that the cost of
> calculating the size is in the synchronization on the managed ledger, which
> interrupts writes and reads through that managed ledger and potentially
> others in some cases.
>
> Yes, it affects producing, but it doesn't affect consuming. When
> calculating the backlogSize of each subscription, it fights for an
> exclusive lock(see:
> https://github.com/apache/pulsar/blob/644be5fdd6d080accffd72229b2e21e35a27d722/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L1270),
> this lock is also used when sending messages. If a topic has N
> subscriptions, then the time of impact is multiplied by N. I will write a
> PR to reduce the amplification of influence on producing caused by multiple
> subscriptions.
>
> > we should consider making the JSON result easier to interpret.
>
> Yes, I have set the attribute `backlogSize` in response to be `-1` if
> `--get-subscription-backlog-size` is false to distinguish between the two
> cases:
> - There is no backlog, and the parameter `--get-subscription-backlog-size`
> is true. After changes, the attribute `backlogSize` in response will be `0`.
> - The parameter `--get-subscription-backlog-size` is false. After changes,
> the attribute `backlogSize` in response will be `-1`.
>
> On Sat, Jan 21, 2023 at 3:56 AM Michael Marshall <mmarsh...@apache.org>
> wrote:
>
> > My primary concern with making this the new default is that the cost
> > of calculating the size is in the synchronization on the managed
> > ledger, which interrupts writes and reads through that managed ledger
> > and potentially others in some cases.
> >
> > If the primary motivation for this PR is to avoid confusing users (and
> > I agree that the 0 value confuses users), we should consider making
> > the JSON result easier to interpret. It'd be just as easy to only
> > include the subscription backlog size when a client requests it.
> >
> > What do you think about changing the result?
> >
> > Thanks,
> > Michael
> >
> > On Thu, Jan 19, 2023 at 4:03 AM Haiting Jiang <jianghait...@gmail.com>
> > wrote:
> > >
> > > +1
> > >
> > > Haiting
> > >
> > > On Tue, Jan 17, 2023 at 5:50 PM Yubiao Feng
> > > <yubiao.f...@streamnative.io.invalid> wrote:
> > > >
> > > > Hi Asaf
> > > >
> > > > > It's worth noting that the estimated backlog size of a subscription
> > is
> > > > estimated since it doesn't consider any acknowledged messages between
> > the
> > > > mark delete position and the last message. It simply assumes all
> > messages
> > > > between the mark delete position and the last message have not been
> > > > acknowledged.
> > > >
> > > > Yes, it's not the exact value of the backlog. There are two reasons
> > for the
> > > > loss of accuracy:
> > > > - Whether the Entry size is closer to the `averageSize`.
> > > > - The number of messages after the mark deleted position has been
> > > > acknowledged.
> > > >
> > > > Thanks
> > > > Yubiao Feng
> > > >
> > > > On Tue, Jan 17, 2023 at 3:31 PM Asaf Mesika <asaf.mes...@gmail.com>
> > wrote:
> > > >
> > > > > Small question regarding this:
> > > > >
> > > > > The code for calculation is:
> > > > >
> > > > > long estimateBacklogFromPosition(PositionImpl pos) {
> > > > >     synchronized (this) {
> > > > >         long sizeBeforePosLedger =
> > > > > ledgers.headMap(pos.getLedgerId()).values()
> > > > >                 .stream().mapToLong(LedgerInfo::getSize).sum();
> > > > >         LedgerInfo ledgerInfo = ledgers.get(pos.getLedgerId());
> > > > >         long sizeAfter = getTotalSize() - sizeBeforePosLedger;
> > > > >         if (ledgerInfo == null) {
> > > > >             return sizeAfter;
> > > > >         } else if (pos.getLedgerId() == currentLedger.getId()) {
> > > > >             return sizeAfter - consumedLedgerSize(currentLedgerSize,
> > > > > currentLedgerEntries, pos.getEntryId());
> > > > >         } else {
> > > > >             return sizeAfter -
> > > > > consumedLedgerSize(ledgerInfo.getSize(), ledgerInfo.getEntries(),
> > > > > pos.getEntryId());
> > > > >         }
> > > > >     }
> > > > > }
> > > > >
> > > > > and
> > > > >
> > > > > private long consumedLedgerSize(long ledgerSize, long ledgerEntries,
> > > > > long consumedEntries) {
> > > > >     if (ledgerEntries <= 0) {
> > > > >         return 0;
> > > > >     }
> > > > >     if (ledgerEntries <= (consumedEntries + 1)) {
> > > > >         return ledgerSize;
> > > > >     } else {
> > > > >         long averageSize = ledgerSize / ledgerEntries;
> > > > >         return consumedEntries >= 0 ? (consumedEntries + 1) *
> > averageSize
> > > > > : 0;
> > > > >     }
> > > > > }
> > > > >
> > > > >
> > > > >
> > > > > It's worth noting that the estimated backlog size of a subscription
> > is
> > > > > estimated since it doesn't consider any acknowledged messages
> > between the
> > > > > mark delete position and the last message. It simply assumes all
> > messages
> > > > > between the mark delete position and the last message have not been
> > > > > acknowledged.
> > > > >
> > > > > Good idea - +1
> > > > >
> > > > > On Tue, Jan 17, 2023 at 4:12 AM PengHui Li <codelipeng...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Penghui
> > > > > >
> > > > > > > On Jan 16, 2023, at 23:36, Yubiao Feng <
> > yubiao.f...@streamnative.io
> > > > > .INVALID>
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi community
> > > > > > >
> > > > > > > I am starting a DISCUSS for making the default value of the
> > parameter
> > > > > > > "--get-subscription-backlog-size" of admin API "topics stats"
> > true.
> > > > > > >
> > > > > > > In the PR https://github.com/apache/pulsar/pull/9302, the
> > property
> > > > > > backlog
> > > > > > > size of each subscription returned in the response of the API
> > topics
> > > > > > stats,
> > > > > > > by default this property is always equal to 0 in response, and
> > this
> > > > > will
> > > > > > > confuse users. Since the calculation of backlog size is done in
> > broker
> > > > > > > memory, there is no significant overhead(the process is
> > described in
> > > > > the
> > > > > > > following section), so I think the correct values should be
> > displayed
> > > > > by
> > > > > > > default.
> > > > > > >
> > > > > > > ### The following two APIs should be affected:
> > > > > > >
> > > > > > > In Pulsar admin API
> > > > > > > ```
> > > > > > > pulsar-admin topics stats persistent://test-tenant/ns1/tp1
> > > > > > > --get-subscription-backlog-size
> > > > > > > pulsar-admin topics stats persistent://test-tenant/ns1/tp1 -sbs
> > > > > > > ```
> > > > > > > the default value of parameter `--get-subscription-backlog-size`
> > will
> > > > > be
> > > > > > > `true`
> > > > > > >
> > > > > > > In Pulsar Rest API
> > > > > > > ```
> > > > > > > curl GET "http://127.0.0.1:8080/test-tenant/ns1/tp1/stats
> > > > > > > "?subscriptionBacklogSize=true
> > > > > > > ```
> > > > > > > the default value of parameter `subscriptionBacklogSize ` will be
> > > > > `true`
> > > > > > >
> > > > > > >
> > > > > > > ### The following is the process of calculating backlog size:
> > > > > > > - Divide `PersistentTopc.ledgers` into two parts according to the
> > > > > > ledgerId
> > > > > > > of the mark delete position of the cursor. The second part is
> > ledgers
> > > > > > > indicating the messages still need to be consumed, aka
> > > > > > backlogSizeInLedgers.
> > > > > > > - Find the LedgerInfo whose ledgerId is the same as the ledgerId
> > of the
> > > > > > > mark delete position of the cursor, and we can also divide the
> > ledger
> > > > > > into
> > > > > > > two parts, the second part is entries indicating the messages
> > still
> > > > > need
> > > > > > to
> > > > > > > be consumed, multiply the average size of each entry in metrics
> > by the
> > > > > > > number of still need to be consumed entries we can get the
> > backlog size
> > > > > > in
> > > > > > > this ledger. aka backlogSizeInEntries.
> > > > > > > - `backlogSizeInLe dgers` + `backlogSizeInEntries`
> > > > > > >
> > > > > > > Thanks
> > > > > > > Yubiao Feng
> > > > > >
> > > > > >
> > > > >
> >

Reply via email to