Only enabled `AppendBrokerTimestampMetadataInterceptor` is not enough. We should also improve the TTL: When client publish time > Ledger create time + Ledger rollover time, and `brokerPublishTime` is not set, we can make Ledger TTL time = Ledger create time + Ledger rollover time.
This change can make sure entry expired when client clock is not right. On 2024/01/17 04:31:16 PengHui Li wrote: > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean > that they allow this bug. > This is a configuration, not an API. It is difficult to use documentation > to regulate user behavior. > > Actually, they need to know why they want to disable ` > AppendBrokerTimestampMetadataInterceptor`. > Otherwise, why do they want to disable `AppendBrokerTimestampMetadata > Interceptor`? > > Pulsar's default configuration tries to provide a best practice for most of > the cases. To avoid the potential > problem, Pulsar enables `AppendBrokerTimestampMetadataInterceptor` by > default. But it doesn't mean all the cases > should enable `AppendBrokerTimestampMetadataInterceptor`. If users don't > use TTL, the producers are > well-maintained. Is there any reason to have at least 16 bytes append to > each entry? The default configuration > behavior change also needs to be highlighted in the release note. Users > need to know if they don't disable the > `AppendBrokerTimestampMetadataInterceptor` manually on a newer version, the > retained data size will increase. > > BTW, we need to explain each interceptor in the `broker.conf` and why ` > AppendBrokerTimestampMetadataInterceptor` > is enabled by default. If users don't want to read it, change anything in > `broker.conf` they don't really know what it is. > How can they know what they expect? > > Regards, > Penghui > > > On Mon, Jan 15, 2024 at 11:09 PM Lin Lin <lin...@apache.org> wrote: > > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean > > that they allow this bug. > > This is a configuration, not an API. It is difficult to use documentation > > to regulate user behavior. > > > > Maybe we can add a new field (BrokerTimestamp) to save the timestamp on > > the Broker side. > > The time priority for trimming Ledger is as follows: > > > > BrokerPublishTime > BrokerTimestamp > > > > If `BrokerPublishTime` exists, `BrokerReceiveTime` is not set. > > If not, we set `BrokerReceiveTime` and is no longer affected by client > > time. > > > > On 2024/01/15 02:15:17 PengHui Li wrote: > > > IMO, we should enable `AppendBrokerTimestampMetadataInterceptor` by > > default. > > > Users can still have a way to disable it if they care about the > > additional > > > metadata stored > > > in each entry. > > > > > > For the `hasBrokerPublishTime` API. The topic might also have historical > > > data without > > > broker publish time. So, it should be fine to keep this API because we > > > don't know how > > > long users will retain their data. > > > > > > Regards, > > > Penghui > > > > > > On Sat, Jan 6, 2024 at 10:35 PM linlin <lin...@apache.org> wrote: > > > > > > > Now, if the message's metadata does not set a broker side timestamp, > > the > > > > ledger expiration check is based on the client's publish time. > > > > > > > > When the client machine's clock is incorrect (eg: set to 1 year later) > > , > > > > the ledger can not be cleaned up. Issue > > > > https://github.com/apache/pulsar/issues/21347 > > > > > > > > `AppendBrokerTimestampMetadataInterceptor` can set timestamp for > > messages > > > > on the broker side, but we can not ensure that the > > > > `AppendBrokerTimestampMetadataInterceptor` is always enable > > > > > > > > Therefore, I open this PR(https://github.com/apache/pulsar/pull/21835) > > to > > > > always set the broker timestamp for messages on the broker side. > > > > > > > > With this change , firstly we should deprecate > > > > AppendBrokerTimestampMetadataInterceptor. > > > > It no longer needs to exist > > > > > > > > Secondly, we should deprecate `hasBrokerPublishTime` in interface > > Message. > > > > It always returns true. > > > > This API is created in PR (https://github.com/apache/pulsar/pull/11553 > > ) > > > > This PR is for the client to obtain BrokerPublishTime, so the > > > > `hasBrokerPublishTime` API is not necessary. > > > > > > > > > >