[Discuss] Pulsar retention policy

太上玄元道君 Thu, 11 Apr 2024 03:20:48 -0700

Hi, Pulsar community,

I'm opening this thread to discuss the retention policy for managed ledgers.


Currently, the retention policy is a time- and size-based policy for
retaining messages in the ledger, but the official documentation and the
actual code implementation disagree about which messages it covers.

The official documentation states that the retention policy retains only
the messages that were *acknowledged*. For example, if the retention size
is set to 10GB and 20GB of messages have been acknowledged, Pulsar will
retain 10GB of them and delete the rest.

However, the actual code implementation is different. The retention size
is counted against all messages *written* to the ledger, including both
*backlog messages* and *acknowledged messages*, and backlog messages are
never deleted by retention. For instance, if there are 10GB of messages in
the backlog and 10GB of messages were acknowledged:
1. If the retention size is set to 10GB, Pulsar will only retain the 10GB
of messages in the backlog, and the 10GB of messages that were acknowledged
will be deleted.
2. If the retention size is set to 20GB, Pulsar will retain the 10GB of
messages in the backlog and the 10GB of messages that were acknowledged.
3. If the retention size is set to 5GB, Pulsar will retain the 10GB of
messages in the backlog, but the 10GB of messages that were acknowledged
will be deleted.
4. If the retention size is set to 15GB, Pulsar will retain the 10GB of
messages in the backlog and the 5GB of messages that were acknowledged. The
rest of the acknowledged messages will be deleted.
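
To make the difference concrete, here is a minimal Python sketch of the two
interpretations, reproducing the four cases above. It is a simplified model,
not Pulsar's actual code: the function names are hypothetical, sizes are
plain GB numbers, and it ignores time-based retention and the fact that the
broker trims whole ledgers rather than individual messages.

```python
def retained_acked_per_docs(acked_gb, retention_gb):
    """Documentation reading: the retention size applies only to
    acknowledged messages."""
    return min(acked_gb, retention_gb)


def retained_acked_per_code(backlog_gb, acked_gb, retention_gb):
    """Code reading (as described in this thread): the retention size is
    counted against backlog + acknowledged messages, and the backlog is
    never deleted by retention."""
    # Whatever part of the retention size the backlog does not consume
    # is what remains for acknowledged messages.
    budget_left = max(0, retention_gb - backlog_gb)
    return min(acked_gb, budget_left)


if __name__ == "__main__":
    # 10GB backlog, 10GB acknowledged, as in the four cases above.
    for retention_gb in (10, 20, 5, 15):
        kept = retained_acked_per_code(10, 10, retention_gb)
        print(f"retention={retention_gb}GB -> acknowledged kept={kept}GB")
```

Under the documentation reading, the same 10GB retention size with 10GB
acknowledged would keep all 10GB of acknowledged messages, regardless of
the backlog.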

Since Pulsar was open-sourced, the code implementation has never changed,
but the meaning of the official documentation has gradually shifted. So I'm
wondering which one is better: the official documentation or the code
implementation? Does the shift in the documentation's meaning align more
closely with expectations? Does it indicate that users want to retain the
messages that were acknowledged?

For a long time, users have believed that the retention policy is for
retaining messages that were acknowledged. If we change the documentation
to match the code implementation, will it still meet users' expectations?

What should we do? Change the documentation to match the code
implementation, or change the code implementation to match the
documentation?

Regards,
Tao Jiuming
