>From my perspective, it's a serious security issue. The client side
could easily make Pulsar's message expiry mechanism not work. Even if
it's not a malicious change made by users, it would be hard to figure
out what makes the message not expired.

> Users can still have a way to disable it if they care about the additional 
> metadata stored in each entry.

I don't think we should provide such an option for users. Compared
with the serious security issue, the extra overhead (13 bytes) per
entry (not per message) should never be a concern.

My suggestion is that the broker entry metadata with the broker side
timestamp should always be generated and we should not allow users to
disable it. However, `hasBrokerPublishTime` should not be deprecated.
The broker entry metadata won't be transferred to the client unless
`exposingBrokerEntryMetadataToClientEnabled` is true.

Thanks,
Yunze

On Wed, Jan 17, 2024 at 6:45 PM Lin Lin <lin...@apache.org> wrote:
>
> Only enabled `AppendBrokerTimestampMetadataInterceptor` is not enough.
>
> We should also improve the TTL:
> When client publish time > Ledger create time + Ledger rollover time,
> and `brokerPublishTime` is not set,
> we can make Ledger TTL time =  Ledger create time + Ledger rollover time.
>
> This change can make sure entry expired when client clock is not right.
>
> On 2024/01/17 04:31:16 PengHui Li wrote:
> > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean
> > that they allow this bug.
> > This is a configuration, not an API. It is difficult to use documentation
> > to regulate user behavior.
> >
> > Actually, they need to know why they want to disable `
> > AppendBrokerTimestampMetadataInterceptor`.
> > Otherwise, why do they want to disable `AppendBrokerTimestampMetadata
> > Interceptor`?
> >
> > Pulsar's default configuration tries to provide a best practice for most of
> > the cases. To avoid the potential
> > problem, Pulsar enables  `AppendBrokerTimestampMetadataInterceptor` by
> > default. But it doesn't mean all the cases
> > should enable `AppendBrokerTimestampMetadataInterceptor`. If users don't
> > use TTL, the producers are
> > well-maintained. Is there any reason to have at least 16 bytes append to
> > each entry? The default configuration
> > behavior change also needs to be highlighted in the release note. Users
> > need to know if they don't disable the
> > `AppendBrokerTimestampMetadataInterceptor` manually on a newer version, the
> > retained data size will increase.
> >
> > BTW, we need to explain each interceptor in the `broker.conf` and why `
> > AppendBrokerTimestampMetadataInterceptor`
> > is enabled by default. If users don't want to read it, change anything in
> > `broker.conf` they don't really know what it is.
> > How can they know what they expect?
> >
> > Regards,
> > Penghui
> >
> >
> > On Mon, Jan 15, 2024 at 11:09 PM Lin Lin <lin...@apache.org> wrote:
> >
> > > User disabling `AppendBrokerTimestampMetadataInterceptor` does not mean
> > > that they allow this bug.
> > > This is a configuration, not an API. It is difficult to use documentation
> > > to regulate user behavior.
> > >
> > > Maybe we can add a new field (BrokerTimestamp) to save the timestamp on
> > > the Broker side.
> > > The time priority for trimming Ledger is as follows:
> > >
> > > BrokerPublishTime > BrokerTimestamp
> > >
> > > If `BrokerPublishTime` exists, `BrokerReceiveTime` is not set.
> > > If not, we set `BrokerReceiveTime`  and is no longer affected by client
> > > time.
> > >
> > > On 2024/01/15 02:15:17 PengHui Li wrote:
> > > > IMO, we should enable `AppendBrokerTimestampMetadataInterceptor` by
> > > default.
> > > > Users can still have a way to disable it if they care about the
> > > additional
> > > > metadata stored
> > > > in each entry.
> > > >
> > > > For the `hasBrokerPublishTime` API. The topic might also have historical
> > > > data without
> > > > broker publish time. So, it should be fine to keep this API because we
> > > > don't know how
> > > > long users will retain their data.
> > > >
> > > > Regards,
> > > > Penghui
> > > >
> > > > On Sat, Jan 6, 2024 at 10:35 PM linlin <lin...@apache.org> wrote:
> > > >
> > > > > Now, if the message's metadata does not set a broker side timestamp,
> > > the
> > > > > ledger expiration check is based on the client's publish time.
> > > > >
> > > > > When the client machine's clock is incorrect (eg: set to 1 year later)
> > > ,
> > > > > the ledger can not be cleaned up. Issue
> > > > > https://github.com/apache/pulsar/issues/21347
> > > > >
> > > > > `AppendBrokerTimestampMetadataInterceptor` can set timestamp for
> > > messages
> > > > > on the broker side, but we can not ensure that the
> > > > > `AppendBrokerTimestampMetadataInterceptor` is always enable
> > > > >
> > > > > Therefore, I open this PR(https://github.com/apache/pulsar/pull/21835)
> > > to
> > > > > always set the broker timestamp for messages on the broker side.
> > > > >
> > > > > With this change , firstly we should deprecate
> > > > > AppendBrokerTimestampMetadataInterceptor.
> > > > > It no longer needs to exist
> > > > >
> > > > > Secondly, we should deprecate `hasBrokerPublishTime` in interface
> > > Message.
> > > > > It always returns true.
> > > > > This API is created in PR (https://github.com/apache/pulsar/pull/11553
> > > )
> > > > > This PR is for the client to obtain BrokerPublishTime, so the
> > > > > `hasBrokerPublishTime` API is not necessary.
> > > > >
> > > >
> > >
> >

Reply via email to