On Thu, Jan 17, 2019 at 2:58 PM Matteo Merli <mme...@apache.org> wrote:

> After a long delay (no pun intended), I finally got through the
> previous discussions around the delayed message delivery proposals.
> I'm referring to PIP-26
> https://github.com/apache/pulsar/wiki/PIP-26%3A-Delayed-Message-Delivery
> and the Pull Request at #3155
> https://github.com/apache/pulsar/pull/3155
>
> To summarize these proposals (correct me if I'm getting any point wrong):
>
>  * PIP-26
>     - Producer sets arbitrary timeout on each message
>     - Broker keeps a hash-wheel timer (backed by ledger) to keep track
> of messages for which the dispatch has to be deferred
>
>  * PR #3155
>    - Consumer specify a fixed time delay to consume messages
>    - Broker will defer delivery by that time
>
> As I stated previously, we should try to avoid adding complexity in
> the broker dispatching code, unless there's a clear benefit compared
> to do the same operation in client library.
>
> After discussing with Ivan, I wanted to share this alternative approach.
>
> Key points:
>   * Application set arbitrary timeout on each message
>   * Broker is unchanged
>   * Consumer (in client library) will make these messages visible to
> application after delay has expired
>
> Implementation notes:
>   * Producer change is trivial. We just need to add new field in
> message metadata (similar as described in PIP-26)
>   * On consumer side, the following will happen:
>      - Messages get added to receiverQueue
>      - When application calls receive, we might get from the queue a
> message with delay.
>      - This message is not passed to application. Rather insert the
> message ID into a priority queue (or equivalent structure), ordered by
> target time.
>      - At this point messages are not added to the ack-timeout tracker
>      - Periodically, we check the head of the priority queue to see if
> there's anything ready
>         - If so, we request the broker to "redeliver" these messages,
> using the same mechanism as ack-timeout:
> CommandRedeliverUnacknowledgedMessages
>           with a list of message ids)
>

If I understand this correctly, this is an enhanced variation of
"ack-timeout"-ish approach that supports arbitrary delays.

My concern of this category of approaches is "bandwidth" usage. It is
basically trading bandwidth for complexity.

A *deferred* message will be delivered "twice" from brokers (if there is no
cache or cache miss). If a topic's traffic is M bytes/second,
there are N subscriptions. The traffic will effectively be 2 * M * N, which
can potentially be a red flag to the users who
rely on this feature.

But I am also fine with approach if most of people don't in favor of
changing the dispatch logic at broker side.



> This approach will ensure:
>  * We can support arbitrary delays
>  * No changes and no overhead in broker - No need to configure
> policies for delay activation
>  * Works well with existing flow control mechanism: messages are
> dequeued so that we can process messages with smaller delays
>  * Amount of memory required in client side is limited.
>      - We just keep message ids (we could consider caching few
> messages as well, as an optimization)
>      - Broker has a limit of unacked messages pushed to a consumer
> (default 50K). I don't expect this being a particular problem.
>        If there a lot of messages with big differences in the delay
> value, the worst case would be that the applied delay to be higher
>        for some of the messages.
>
> Any thoughts on this?
>
> --
> Matteo Merli
> <mme...@apache.org>
>

Reply via email to