After a long delay (no pun intended), I finally got through the previous discussions around the delayed message delivery proposals. I'm referring to PIP-26 https://github.com/apache/pulsar/wiki/PIP-26%3A-Delayed-Message-Delivery and the Pull Request at #3155 https://github.com/apache/pulsar/pull/3155
To summarize these proposals (correct me if I'm getting any point wrong): * PIP-26 - Producer sets arbitrary timeout on each message - Broker keeps a hash-wheel timer (backed by ledger) to keep track of messages for which the dispatch has to be deferred * PR #3155 - Consumer specify a fixed time delay to consume messages - Broker will defer delivery by that time As I stated previously, we should try to avoid adding complexity in the broker dispatching code, unless there's a clear benefit compared to do the same operation in client library. After discussing with Ivan, I wanted to share this alternative approach. Key points: * Application set arbitrary timeout on each message * Broker is unchanged * Consumer (in client library) will make these messages visible to application after delay has expired Implementation notes: * Producer change is trivial. We just need to add new field in message metadata (similar as described in PIP-26) * On consumer side, the following will happen: - Messages get added to receiverQueue - When application calls receive, we might get from the queue a message with delay. - This message is not passed to application. Rather insert the message ID into a priority queue (or equivalent structure), ordered by target time. - At this point messages are not added to the ack-timeout tracker - Periodically, we check the head of the priority queue to see if there's anything ready - If so, we request the broker to "redeliver" these messages, using the same mechanism as ack-timeout: CommandRedeliverUnacknowledgedMessages with a list of message ids) This approach will ensure: * We can support arbitrary delays * No changes and no overhead in broker - No need to configure policies for delay activation * Works well with existing flow control mechanism: messages are dequeued so that we can process messages with smaller delays * Amount of memory required in client side is limited. - We just keep message ids (we could consider caching few messages as well, as an optimization) - Broker has a limit of unacked messages pushed to a consumer (default 50K). I don't expect this being a particular problem. If there a lot of messages with big differences in the delay value, the worst case would be that the applied delay to be higher for some of the messages. Any thoughts on this? -- Matteo Merli <mme...@apache.org>