bump

太上玄元道君 <dao...@apache.org> wrote on Sun, Mar 10, 2024 at 06:41:
> Hi Pulsar community,
>
> A new PIP has been opened; this thread is to discuss PIP-345: Optimize
> finding message by timestamp.
>
> Motivation:
> Finding a message by timestamp is widely used in Pulsar:
> * It is used by the `pulsar-admin` tool to get the message id by
>   timestamp, expire messages by timestamp, and reset the cursor.
> * It is used by the `pulsar-client` to reset the subscription to a
>   specific timestamp.
> * It is also used by the `expiry-monitor` to find the messages that
>   have expired.
>
> The current implementation is correct and uses binary search to speed
> up the lookup, but it is still not efficient *enough*.
> The current implementation scans all the ledgers to find the message
> by timestamp.
> This is a performance bottleneck, especially for large topics with many
> messages.
> For example, for a topic with 1 million entries, the binary search
> takes about 20 iterations to find the message.
> In some extreme cases this may lead to a timeout, and the client will
> not be able to seek by timestamp.
>
> PIP: https://github.com/apache/pulsar/pull/22234
>
> Your feedback is very important to us; please take a moment to review
> the proposal and provide your thoughts.
>
> Thanks,
> Tao Jiuming
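
For anyone checking the "20 iterations" figure quoted above, here is a minimal sketch of a binary search by publish timestamp that counts its probes. This is not Pulsar's actual code: `find_by_timestamp`, `read_publish_time`, and the synthetic timestamps are hypothetical names made up for illustration, and each probe is only assumed to stand in for one entry read.

```python
import math


def find_by_timestamp(read_publish_time, num_entries, target_ts):
    """Find the last entry whose publish time is <= target_ts.

    read_publish_time(i) is a hypothetical callback that returns the
    publish timestamp of entry i; timestamps are assumed non-decreasing.
    Returns (index or -1 if no entry qualifies, number of probes made).
    """
    lo, hi = 0, num_entries - 1
    found = -1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1  # one probe, assumed to cost one entry read
        if read_publish_time(mid) <= target_ts:
            found = mid      # candidate; keep looking to the right
            lo = mid + 1
        else:
            hi = mid - 1     # too new; look to the left
    return found, reads


# Synthetic topic: 1 million entries published 10 ms apart (made-up data).
N = 1_000_000
publish_time = lambda i: 1_700_000_000_000 + i * 10

pos, reads = find_by_timestamp(publish_time, N, publish_time(123_456) + 5)
print(pos, reads, math.ceil(math.log2(N)))
```

The probe count is bounded by ceil(log2(N)), which is 20 for 1 million entries. The count is small, but if every probe is a separate storage read, the total latency is that many sequential round trips.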