> We should also improve the TTL:
> When client publish time > Ledger create time + Ledger rollover time,
> and `brokerPublishTime` is not set,
> we can make Ledger TTL time =  Ledger create time + Ledger rollover time.

> This change can make sure entry expires when client clock is not right.

This solution looks good to me.

Regards,
Penghui


On Wed, Jan 17, 2024 at 10:31 PM Yunze Xu <x...@apache.org> wrote:

> From my perspective, it's a serious security issue. The client side
> could easily make Pulsar's message expiry mechanism not work. Even if
> it's not a malicious change made by users, it would be hard to figure
> out what makes the message not expired.
>
> > Users can still have a way to disable it if they care about the
> additional metadata stored in each entry.
>
> I don't think we should provide such an option for users. Compared
> with the serious security issue, the extra overhead (13 bytes) per
> entry (not per message) should never be a concern.
>
> My suggestion is that the broker entry metadata with the broker side
> timestamp should always be generated and we should not allow users to
> disable it. However, `hasBrokerPublishTime` should not be deprecated.
> The broker entry metadata won't be transferred to the client unless
> `exposingBrokerEntryMetadataToClientEnabled` is true.
>
> Thanks,
> Yunze
>
> On Wed, Jan 17, 2024 at 6:45 PM Lin Lin <lin...@apache.org> wrote:
> >
> > Only enabled `AppendBrokerTimestampMetadataInterceptor` is not enough.
> >
> > We should also improve the TTL:
> > When client publish time > Ledger create time + Ledger rollover time,
> > and `brokerPublishTime` is not set,
> > we can make Ledger TTL time =  Ledger create time + Ledger rollover time.
> >
> > This change can make sure entry expired when client clock is not right.
> >
> > On 2024/01/17 04:31:16 PengHui Li wrote:
> > > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not
> mean
> > > that they allow this bug.
> > > This is a configuration, not an API. It is difficult to use
> documentation
> > > to regulate user behavior.
> > >
> > > Actually, they need to know why they want to disable `
> > > AppendBrokerTimestampMetadataInterceptor`.
> > > Otherwise, why do they want to disable `AppendBrokerTimestampMetadata
> > > Interceptor`?
> > >
> > > Pulsar's default configuration tries to provide a best practice for
> most of
> > > the cases. To avoid the potential
> > > problem, Pulsar enables  `AppendBrokerTimestampMetadataInterceptor` by
> > > default. But it doesn't mean all the cases
> > > should enable `AppendBrokerTimestampMetadataInterceptor`. If users
> don't
> > > use TTL, the producers are
> > > well-maintained. Is there any reason to have at least 16 bytes append
> to
> > > each entry? The default configuration
> > > behavior change also needs to be highlighted in the release note. Users
> > > need to know if they don't disable the
> > > `AppendBrokerTimestampMetadataInterceptor` manually on a newer
> version, the
> > > retained data size will increase.
> > >
> > > BTW, we need to explain each interceptor in the `broker.conf` and why `
> > > AppendBrokerTimestampMetadataInterceptor`
> > > is enabled by default. If users don't want to read it, change anything
> in
> > > `broker.conf` they don't really know what it is.
> > > How can they know what they expect?
> > >
> > > Regards,
> > > Penghui
> > >
> > >
> > > On Mon, Jan 15, 2024 at 11:09 PM Lin Lin <lin...@apache.org> wrote:
> > >
> > > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not
> mean
> > > > that they allow this bug.
> > > > This is a configuration, not an API. It is difficult to use
> documentation
> > > > to regulate user behavior.
> > > >
> > > > Maybe we can add a new field (BrokerTimestamp) to save the timestamp
> on
> > > > the Broker side.
> > > > The time priority for trimming Ledger is as follows:
> > > >
> > > > BrokerPublishTime > BrokerTimestamp
> > > >
> > > > If `BrokerPublishTime` exists, `BrokerReceiveTime` is not set.
> > > > If not, we set `BrokerReceiveTime`  and is no longer affected by
> client
> > > > time.
> > > >
> > > > On 2024/01/15 02:15:17 PengHui Li wrote:
> > > > > IMO, we should enable `AppendBrokerTimestampMetadataInterceptor` by
> > > > default.
> > > > > Users can still have a way to disable it if they care about the
> > > > additional
> > > > > metadata stored
> > > > > in each entry.
> > > > >
> > > > > For the `hasBrokerPublishTime` API. The topic might also have
> historical
> > > > > data without
> > > > > broker publish time. So, it should be fine to keep this API
> because we
> > > > > don't know how
> > > > > long users will retain their data.
> > > > >
> > > > > Regards,
> > > > > Penghui
> > > > >
> > > > > On Sat, Jan 6, 2024 at 10:35 PM linlin <lin...@apache.org> wrote:
> > > > >
> > > > > > Now, if the message's metadata does not set a broker side
> timestamp,
> > > > the
> > > > > > ledger expiration check is based on the client's publish time.
> > > > > >
> > > > > > When the client machine's clock is incorrect (eg: set to 1 year
> later)
> > > > ,
> > > > > > the ledger can not be cleaned up. Issue
> > > > > > https://github.com/apache/pulsar/issues/21347
> > > > > >
> > > > > > `AppendBrokerTimestampMetadataInterceptor` can set timestamp for
> > > > messages
> > > > > > on the broker side, but we can not ensure that the
> > > > > > `AppendBrokerTimestampMetadataInterceptor` is always enable
> > > > > >
> > > > > > Therefore, I open this PR(
> https://github.com/apache/pulsar/pull/21835)
> > > > to
> > > > > > always set the broker timestamp for messages on the broker side.
> > > > > >
> > > > > > With this change , firstly we should deprecate
> > > > > > AppendBrokerTimestampMetadataInterceptor.
> > > > > > It no longer needs to exist
> > > > > >
> > > > > > Secondly, we should deprecate `hasBrokerPublishTime` in interface
> > > > Message.
> > > > > > It always returns true.
> > > > > > This API is created in PR (
> https://github.com/apache/pulsar/pull/11553
> > > > )
> > > > > > This PR is for the client to obtain BrokerPublishTime, so the
> > > > > > `hasBrokerPublishTime` API is not necessary.
> > > > > >
> > > > >
> > > >
> > >
>

Reply via email to