[ https://issues.apache.org/jira/browse/PULSAR-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Penghui Li updated PULSAR-10:
-----------------------------
    Labels: Pulsar gsoc gsoc2021 mentor  (was: gsoc gsoc2021 mentor)

> Improve the message backlogs for the topic
> ------------------------------------------
>
>                 Key: PULSAR-10
>                 URL: https://issues.apache.org/jira/browse/PULSAR-10
>             Project: Pulsar
>          Issue Type: Improvement
>            Reporter: Penghui Li
>            Priority: Major
>              Labels: Pulsar, gsoc, gsoc2021, mentor
>
> In Pulsar, the client usually sends several messages in a batch. On the
> broker side, the broker receives a batch and writes the batch message to
> the storage layer.
>
> The message backlog tracks how many messages remain to be handled for a
> subscription. Unfortunately, the current backlog is counted in batches, not
> messages. This confuses users: they have pushed 1000 messages to the topic,
> but checking the backlog from the subscription side returns a value lower
> than 1000, such as 100 batches. The message-based backlog is not available
> because it is expensive to calculate the number of messages in each batch.
>
> PIP-70
> [https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata]
> introduced a broker-level entry metadata that can support a message index
> for a topic (or message offset of a topic). This provides the ability to
> calculate the number of messages between one message index and another. So
> we can leverage PIP-70 to improve the message backlog implementation so
> that it can report a message-based backlog.
>
> For the Exclusive or Failover subscription, this is easy to implement by
> calculating the messages between the mark delete position and the LAC
> position. But for the Shared and Key_Shared subscriptions, individual
> acknowledgment brings some complexity.
> We can cache the individual acknowledgment count in broker memory, so the
> message backlog for the Shared and Key_Shared subscriptions can be
> calculated as
> `backlogOfTheMarkdeletePosition` - `IndividualAckCount`.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
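
As a rough illustration of the calculation described in the issue, the sketch below works through the arithmetic with plain integers. It is not Pulsar code: the function names and parameters are hypothetical, and it assumes that each entry's PIP-70 index is the index of the last message in that entry, and that the cached individual ack count is measured in messages and only covers positions after the mark delete position.

```python
def message_backlog(mark_delete_index: int, lac_index: int) -> int:
    """Backlog for Exclusive/Failover: messages after mark-delete up to LAC.

    Both arguments are hypothetical message indexes (assumed here to be the
    index of the last message in the entry at that position).
    """
    if lac_index < mark_delete_index:
        raise ValueError("LAC index must not be behind the mark-delete index")
    return lac_index - mark_delete_index


def shared_message_backlog(mark_delete_index: int, lac_index: int,
                           individual_ack_count: int) -> int:
    """Backlog for Shared/Key_Shared.

    Computes backlogOfTheMarkdeletePosition - IndividualAckCount, where the
    count is assumed to cover only messages acknowledged individually after
    the mark-delete position.
    """
    backlog = message_backlog(mark_delete_index, lac_index)
    if not 0 <= individual_ack_count <= backlog:
        raise ValueError("individual ack count is out of range")
    return backlog - individual_ack_count


# 1000 messages (indexes 0..999) sent as 100 batches of 10.
# The first batch (last message index 9) has been acknowledged cumulatively.
exclusive = message_backlog(mark_delete_index=9, lac_index=999)
# In addition, 40 later messages were acknowledged individually.
shared = shared_message_backlog(mark_delete_index=9, lac_index=999,
                                individual_ack_count=40)
print(exclusive, shared)
```

Under these assumptions the cached count would also need to be reduced whenever the mark delete position advances past individually acknowledged messages; that bookkeeping is not shown here.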