Penghui Li created PULSAR-10:
--------------------------------
Summary: Improve the message backlogs
Key: PULSAR-10
URL: https://issues.apache.org/jira/browse/PULSAR-10
Project: Pulsar
Issue Type: Improvement
Reporter: Penghui Li
In Pulsar, the client usually sends several messages in a batch. On the
broker side, the broker receives a batch and writes the batch message to the
storage layer.
The message backlog tracks how many messages still have to be handled for a
subscription. Unfortunately, the current backlog is counted in batches,
not messages. This confuses users: after pushing 1000 messages to a topic,
checking the backlog from the subscription side can return a value lower
than 1000, such as 100 batches. The message-based backlog is not available
today because it is expensive to calculate the number of messages in each
batch.
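The mismatch can be illustrated with a small sketch. This is not broker code; the class and method names are hypothetical, and the batch size of 10 is only an assumption chosen to match the 1000 messages / 100 batches example above.

```java
import java.util.Collections;
import java.util.List;

final class BatchBacklogExample {
    // Batch-based backlog: every stored batch counts once,
    // regardless of how many messages it contains.
    static long batchBacklog(List<Integer> batchSizes) {
        return batchSizes.size();
    }

    // Message-based backlog: the sum of the messages in every batch.
    static long messageBacklog(List<Integer> batchSizes) {
        long total = 0;
        for (int size : batchSizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed workload: 100 batches of 10 messages each.
        List<Integer> batches = Collections.nCopies(100, 10);
        System.out.println("batch-based backlog:   " + batchBacklog(batches));
        System.out.println("message-based backlog: " + messageBacklog(batches));
    }
}
```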
PIP-70
[https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-raw-Message-metadata]
introduced broker-level entry metadata that can support a message index for a
topic (or a message offset for a topic). This makes it possible to calculate
the number of messages between one message index and another. So we can
leverage PIP-70 to improve the message backlog implementation so that it
returns the message-based backlog.
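A minimal sketch of the index arithmetic, under assumptions: the `Entry` class below is a hypothetical stand-in for a stored entry, and `lastMessageIndex` is assumed to be the topic-wide index of the last message in that entry. Neither name is taken from the Pulsar code base.

```java
final class IndexBacklogSketch {
    // Hypothetical stand-in for an entry that carries broker entry metadata.
    static final class Entry {
        final long entryId;
        // Assumption: topic-wide index of the last message in this entry.
        final long lastMessageIndex;

        Entry(long entryId, long lastMessageIndex) {
            this.entryId = entryId;
            this.lastMessageIndex = lastMessageIndex;
        }
    }

    // Number of messages stored after `from` up to and including `to`,
    // computed without opening any batch.
    static long messagesBetween(Entry from, Entry to) {
        return to.lastMessageIndex - from.lastMessageIndex;
    }

    public static void main(String[] args) {
        // Assumed layout: every entry holds 10 messages, indexes start at 0.
        Entry markDelete = new Entry(9, 99);        // entries 0..9 acknowledged
        Entry lastConfirmed = new Entry(109, 1099); // entries 0..109 written
        System.out.println(messagesBetween(markDelete, lastConfirmed));
    }
}
```

With these assumed values, the 100 entries after the mark delete position hold 1000 messages, and the subtraction yields that count directly.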
For Exclusive and Failover subscriptions, this is easy to implement
by calculating the number of messages between the mark delete position and
the LAC (last add confirmed) position. For Shared and Key_Shared
subscriptions, individual acknowledgments bring some complexity. We can cache
the individual acknowledgment count in broker memory, so the message backlog
for Shared and Key_Shared subscriptions is calculated as
`backlogOfTheMarkdeletePosition` - `IndividualAckCount`
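
The calculation above could look roughly like the following sketch. All names here are hypothetical, the message indexes are assumed to come from the PIP-70 entry metadata, and thread safety and persistence of the cached count are deliberately left out.

```java
final class SharedBacklogSketch {
    private long markDeleteIndex;    // index of the last message covered by the mark delete position
    private long lastConfirmedIndex; // index of the last message at the LAC position
    private long individualAckCount; // messages acked individually beyond the mark delete position

    SharedBacklogSketch(long markDeleteIndex, long lastConfirmedIndex) {
        this.markDeleteIndex = markDeleteIndex;
        this.lastConfirmedIndex = lastConfirmedIndex;
    }

    // Called when messages beyond the mark delete position are acked individually.
    void onIndividualAck(long messages) {
        individualAckCount += messages;
    }

    // Called when the mark delete position moves forward. Individual acks that
    // are now covered by it must leave the cache, or they would be counted twice.
    void onMarkDeleteAdvanced(long newMarkDeleteIndex, long coveredIndividualAcks) {
        markDeleteIndex = newMarkDeleteIndex;
        individualAckCount -= coveredIndividualAcks;
    }

    // backlogOfTheMarkdeletePosition - IndividualAckCount
    long backlog() {
        long backlogOfMarkDeletePosition = lastConfirmedIndex - markDeleteIndex;
        return backlogOfMarkDeletePosition - individualAckCount;
    }

    public static void main(String[] args) {
        SharedBacklogSketch sub = new SharedBacklogSketch(99, 1099);
        System.out.println(sub.backlog()); // nothing acked individually yet
        sub.onIndividualAck(30);           // e.g. messages 110..139 acked out of order
        System.out.println(sub.backlog());
        sub.onMarkDeleteAdvanced(139, 30); // messages 100..109 acked, gap closed
        System.out.println(sub.backlog());
    }
}
```

One design point the sketch makes visible: the cached count has to be reduced whenever the mark delete position passes over individually acknowledged messages, otherwise those messages are subtracted twice.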
--
This message was sent by Atlassian Jira
(v8.3.4#803005)