Thanks everyone for the feedback.

I actually went through and gave it a shot at implementing this on
https://github.com/apache/pulsar/pull/4062

I think this implementation should address all the concern exposed in
this thread.
Please everyone involved take a deep review of the change.


Thanks,
Matteo

--
Matteo Merli
<matteo.me...@gmail.com>

On Thu, Mar 14, 2019 at 8:10 PM Sijie Guo <guosi...@gmail.com> wrote:
>
> On Fri, Mar 8, 2019 at 11:11 PM Ezequiel Lovelle <ezequiellove...@gmail.com>
> wrote:
>
> > > Seems like we are implementing per message timers.
> >
> > As per pr #3155 <https://github.com/apache/pulsar/pull/3155>, nope. Each
> > message won't have a Timer class per se,
> > just a long field representing its expiration deadline and will be
> > just one, and only one, scheduled task per consumer at any given time.
> >
> > > Seems simpler to just have delay on a topic level.
> >
> > I think complexity would be very similar on both sides (producer/consumer)
> > An important aspect here would be the decision to provide this feature
> > (delay messages on consumer) separately from the producer, hence, the
> > consumer
> > can make the decision to 'delay' all messages regardless of the producer.
> >
> > > if we are able to find a way to plug a new "fixed delay" dispatcher
> > without touching other dispatcher logic, is that a good approach for the
> > community to proceed on this direction?
> >
> > Great question! I like this path.
> >
> > One solution that I think of is something similar of what Mateo did here:
> > https://github.com/apache/pulsar/pull/3615
> >
> > So, we can have a separated class handling consumers with delay extending
> > normal consumer base. The problem with this approach would be in the
> > feature
> > if we want to have consumers with multiple behaviour.
> >
> > e.g. delayed consumer plus some future feature not present right now.
> >
>
>
>
>
> >
> > Anyway, if everyone agrees with Sijie question, we might discuss this on a
> > separated thread.
> >
>
> It seems that there are no objections. So we can probably move forward with
> the idea of having
> a separate dispatch for fixed delayed subscription. This would isolate the
> impacts of modifying existing dispatchers.
>
>
> >
> > --
> > *Ezequiel Lovelle*
> >
> >
> > On Sat, 2 Mar 2019 at 08:45, Ali Ahmed <ahmal...@gmail.com> wrote:
> >
> > > Seems like we are implementing per message timers.
> > >
> > > Not aware of any log pub sub that does that expect rocketmq , not sure
> > how
> > > performant that is.
> > >
> > >
> > https://github.com/apache/rocketmq/blob/2b692c912d18c0f9889fd73358581bcccf37bbbe/store/src/main/java/org/apache/rocketmq/store/schedule/ScheduleMessageService.java
> > >
> > > Seems simpler to just have delay on a topic level.  The cursor for client
> > > subscriptions can make messages available after a delay.
> > > I don't know if we can achieve significant throughput with so many active
> > > timers.
> > >
> > > On Sat, Mar 2, 2019 at 2:49 AM Sijie Guo <guosi...@gmail.com> wrote:
> > >
> > > > I am trying to draw a conclusion on this email thread.
> > > >
> > > > > Maybe some way to plug to the broker some logic without
> > > > interfering with its core?
> > > > >  In our business fixed delay at consumer level regardless of any
> > > producer
> > > > > configuration is a big win due to easy implementation and usage.
> > > >
> > > > Based on Ezequiel's last comment, if we are able to find a way to plug
> > a
> > > > new "fixed delay" dispatcher without touching other dispatcher logic,
> > is
> > > > that a good approach for the community to proceed on this direction?
> > > >
> > > > - Sijie
> > > >
> > > >
> > > > On Wed, Feb 20, 2019 at 8:26 AM 李鹏辉gmail <codelipeng...@gmail.com>
> > > wrote:
> > > >
> > > > > Sorry for hear that DLQ causes GC.
> > > > >
> > > > > Agree with discussed before, Dispatcher is a performance sensitive
> > > piece
> > > > > of code.
> > > > > If we make changes on the dispatcher, we must pay attention to memory
> > > > > overhead and blocking.
> > > > >
> > > > > I prefer fixed delayed message solution(aka delayed time level). User
> > > > > can define multi topics with deferent delay.Topic is still a FIFO
> > > model.
> > > > >
> > > > > Improve user experience by packaging client API, topics can be
> > created
> > > > > automatically, User can customize the delay level.
> > > > >
> > > > > In our scene, This can already meet most of the needs. Currently
> > > depends
> > > > > on DLQ feature. We know from the user where the experience is not
> > very
> > > > > good.
> > > > > User need to maintain the message expired.
> > > > >
> > > > > So, If we can avoid complexity of use and do not impose a performance
> > > > > burden
> > > > > on message dispatching. I prefer implement it on broker side(broker
> > do
> > > > not
> > > > > need to sorting messages by time, just need to check the tail message
> > > > > can be dispatch, i don’t think this will cause dispatching
> > performance
> > > > > problem).
> > > > >
> > > > > For more complicated delayed messages(e.g. arbitrary delayed
> > delivery).
> > > > > I don’t think pulsar need to support such complicated scene(after we
> > > > > discussed before).
> > > > > In our scene, we have more complicated message requirement(e.g. delay
> > > > > message can be
> > > > > paused, stoped, and re-run. e.g. cron messages).
> > > > >
> > > > > However these case is not very widely used.
> > > > >
> > > > > - Penghui
> > > > >
> > > > >
> > > > > > 在 2019年2月20日,06:37,Sebastián Schepens
> > > > > <sebastian.schep...@mercadolibre.com.INVALID> 写道:
> > > > > >
> > > > > > Hi,
> > > > > > I am really not into any details of the proposed implementation,
> > but
> > > > was
> > > > > > just wondering, has anyone had a look at how Uber implemented this
> > in
> > > > > > Cherami? Cherami seems very similar to Pulsar, its storage system
> > > also
> > > > > > seems very similar to bookkeeper. They seem to implement delayed
> > > queues
> > > > > by
> > > > > > storing the time as part of the key in rocksdb and using sorted
> > > > > iterators,
> > > > > > could this be done in Pulsar as well?
> > > > > >
> > > > > > Cheers,
> > > > > > Sebastian
> > > > > >
> > > > > > On Tue, Feb 19, 2019 at 6:02 PM Dave Fisher <dave2w...@comcast.net
> > >
> > > > > wrote:
> > > > > >
> > > > > >> Hi -
> > > > > >>
> > > > > >> Well, it does, but can this be implemented without building a
> > > > > delayQueue?
> > > > > >> It seems to me that a delayQueue both breaks resiliency if the
> > > broker
> > > > > goes
> > > > > >> down and would certainly add overhead. Perhaps my idea to discard
> > > > > responses
> > > > > >> that are too new and then retrieve once they are out of the
> > delayed
> > > > > >> timeframe would be simpler?
> > > > > >>
> > > > > >> Again I am somewhat naive to the details. I’m not sure that the
> > path
> > > > > >> through the code is kept to an absolute minimum when you have a
> > > > Consumer
> > > > > >> with a nonzero delay?
> > > > > >>
> > > > > >> Regards,
> > > > > >> Dave
> > > > > >>
> > > > > >>> On Feb 19, 2019, at 12:39 PM, Ezequiel Lovelle <
> > > > > >> ezequiellove...@gmail.com> wrote:
> > > > > >>>
> > > > > >>> Hi Dave!
> > > > > >>>
> > > > > >>>> I wonder if clients can add an optional argument to the broker
> > > call
> > > > > when
> > > > > >>> pulling events. The argument would be the amount of delay. Any
> > > > messages
> > > > > >>> younger than the delay are not returned by the broker.
> > > > > >>>
> > > > > >>> This is exactly what https://github.com/apache/pulsar/pull/3155
> > > does
> > > > > :).
> > > > > >>> We still need to decide if we want to add this feature at client
> > > side
> > > > > or
> > > > > >>> broker side, the pull request does it on the broker.
> > > > > >>>
> > > > > >>> --
> > > > > >>> *Ezequiel Lovelle*
> > > > > >>>
> > > > > >>>
> > > > > >>> On Tue, 19 Feb 2019 at 17:06, Dave Fisher <dave2w...@comcast.net
> > >
> > > > > wrote:
> > > > > >>>
> > > > > >>>> Hi -
> > > > > >>>>
> > > > > >>>> My thoughts here may be completely useless but I wonder if
> > clients
> > > > can
> > > > > >> add
> > > > > >>>> an optional argument to the broker call when pulling events. The
> > > > > >> argument
> > > > > >>>> would be the amount of delay. Any messages younger than the
> > delay
> > > > are
> > > > > >> not
> > > > > >>>> returned by the broker.
> > > > > >>>>
> > > > > >>>> Regards,
> > > > > >>>> Dave
> > > > > >>>>
> > > > > >>>>> On Feb 19, 2019, at 11:47 AM, Ezequiel Lovelle <
> > > > > >>>> ezequiellove...@gmail.com> wrote:
> > > > > >>>>>
> > > > > >>>>>> The recent changes made to support DLQ caused major problems
> > > with
> > > > > >>>> garbage
> > > > > >>>>> collection
> > > > > >>>>>
> > > > > >>>>> If garbage collection is a big concern maybe we could add some
> > > > config
> > > > > >>>>> parameter on the broker to disable the usage of this feature
> > and
> > > > > return
> > > > > >>>>> BrokerMetadataException in this situation, giving the power to
> > > the
> > > > > >>>>> administrator whether to offer this feature or not.
> > > > > >>>>>
> > > > > >>>>>> is it acceptable to do it at broker side?
> > > > > >>>>>
> > > > > >>>>> I think this is the big question that needs to be answered.
> > > > > >>>>>
> > > > > >>>>>> can we just have a separated dispatcher for fixed delayed
> > > > > >> subscription?
> > > > > >>>>>
> > > > > >>>>> I will try to do a completely new approach, simpler, and more
> > > > > isolated
> > > > > >>>>> from broker logic. Maybe some way to plug to the broker some
> > > logic
> > > > > >>>> without
> > > > > >>>>> interfering with its core?
> > > > > >>>>>
> > > > > >>>>> In our business fixed delay at consumer level regardless of any
> > > > > >> producer
> > > > > >>>>> configuration is a big win due to easy implementation and
> > usage.
> > > > > >>>>>
> > > > > >>>>> --
> > > > > >>>>> *Ezequiel Lovelle*
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Wed, 13 Feb 2019 at 23:25, Sijie Guo <guosi...@gmail.com>
> > > > wrote:
> > > > > >>>>>
> > > > > >>>>>> Agreed that dispatcher is a performance sensitive piece of
> > code.
> > > > > Feel
> > > > > >>>> bad
> > > > > >>>>>> to hear that DLQ causes GC. Are there any issues tracking
> > those
> > > > > items
> > > > > >>>> you
> > > > > >>>>>> guys identified with DLQ changes?
> > > > > >>>>>>
> > > > > >>>>>>> How is this different from a subscription running behind?
> > > > > >>>>>>
> > > > > >>>>>> As far as I understand form the discussion at #3155, I don't
> > > think
> > > > > >>>> there is
> > > > > >>>>>> a fundamental difference from a backlogged subscriber.
> > > > > >>>>>> The discussion point will mainly be - if a delayed
> > subscription
> > > > can
> > > > > be
> > > > > >>>>>> implemented with a simpler approach at broker side without
> > > > changing
> > > > > >>>> other
> > > > > >>>>>> dispatcher logic,
> > > > > >>>>>> is it acceptable to do it at broker side? So we don't have to
> > > > > >>>> reimplement
> > > > > >>>>>> the same mechanism at different language clients. I think
> > that's
> > > > the
> > > > > >>>> same
> > > > > >>>>>> tradeoff we were discussing for generic delayed messages.
> > > > > >>>>>>
> > > > > >>>>>> My thought would be - can we just have a separated dispatcher
> > > for
> > > > > >> fixed
> > > > > >>>>>> delayed subscription? The logic can be ISOLATED from other
> > > normal
> > > > > >>>>>> dispatchers. if users don't enable delayed subscription, they
> > > will
> > > > > not
> > > > > >>>>>> exercise that dispatcher. This can be a good direction to
> > > explore
> > > > > for
> > > > > >>>>>> future changes that are related to dispatchers.
> > > > > >>>>>>
> > > > > >>>>>> - Sijie
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Feb 14, 2019 at 8:43 AM Joe F <joefranc...@gmail.com>
> > > > > wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Delayed subscription is simpler, and probably worth doing in
> > > the
> > > > > >> broker
> > > > > >>>>>> IF
> > > > > >>>>>>> done right.
> > > > > >>>>>>>
> > > > > >>>>>>> How is this different from a subscription running behind?
> > Why
> > > > does
> > > > > >>>>>>> supporting that require this complex a change in the
> > > dispatcher,
> > > > > when
> > > > > >>>> we
> > > > > >>>>>>> already support backlogged subscribers?
> > > > > >>>>>>>
> > > > > >>>>>>> I am extremely wary of changes in the dispatcher. The recent
> > > > > changes
> > > > > >>>> made
> > > > > >>>>>>> to support DLQ caused major problems with garbage collection,
> > > > > broker
> > > > > >>>>>>> failure  and service interruptions for us. Even though we ARE
> > > NOT
> > > > > >> using
> > > > > >>>>>> the
> > > > > >>>>>>> DLQ feature. Not a pleasant experience.
> > > > > >>>>>>>
> > > > > >>>>>>> This is a very performance sensitive piece of code, and it
> > > should
> > > > > be
> > > > > >>>>>>> treated as such.
> > > > > >>>>>>>
> > > > > >>>>>>> Joe
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Wed, Feb 13, 2019 at 3:58 PM Sijie Guo <
> > guosi...@gmail.com>
> > > > > >> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi all,
> > > > > >>>>>>>>
> > > > > >>>>>>>> I am going to wrap up the discussion regarding delayed
> > > delivery
> > > > > use
> > > > > >>>>>>> cases.
> > > > > >>>>>>>>
> > > > > >>>>>>>> For arbitrary delayed delivery, there are a few +1s to doing
> > > > > PIP-26
> > > > > >> in
> > > > > >>>>>>>> functions. I am assuming that we will go down this path,
> > > unless
> > > > > >> there
> > > > > >>>>>> are
> > > > > >>>>>>>> other proposals.
> > > > > >>>>>>>>
> > > > > >>>>>>>> However there is a use case Lovelle pointed out about "Fixed
> > > > > Delayed
> > > > > >>>>>>>> Message". More specifically it is
> > > > > >>>>>>>> https://github.com/apache/pulsar/pull/3155
> > > > > >>>>>>>> (The caption in #3155 is a bit misleading). IMO it is a
> > > "delayed
> > > > > >>>>>>>> subscription", basically all messages in the subscription is
> > > > > delayed
> > > > > >>>> to
> > > > > >>>>>>>> dispatch in a given time interval. The consensus of this
> > > feature
> > > > > is
> > > > > >>>> not
> > > > > >>>>>>> yet
> > > > > >>>>>>>> achieved. Basically, there will be two approaches for this:
> > > > > >>>>>>>>
> > > > > >>>>>>>> a) DONT treat "fixed delayed message" as a different case.
> > > Just
> > > > > use
> > > > > >>>> the
> > > > > >>>>>>>> same approach as in PIP-26.
> > > > > >>>>>>>> b) treat "fixed delayed message" as a different case, e.g.
> > we
> > > > can
> > > > > >>>>>> better
> > > > > >>>>>>>> call it "delayed subscription" or whatever can distinguish
> > it
> > > > from
> > > > > >>>>>>> general
> > > > > >>>>>>>> arbitrary delayed delivery. Use the approach
> > > proposed/discussed
> > > > in
> > > > > >>>>>> #3155.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I would like the community to discuss this and also come to
> > an
> > > > > >>>>>> agreement.
> > > > > >>>>>>>> So Lovelle can move forward with the approach agreed by the
> > > > > >> community.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Thanks,
> > > > > >>>>>>>> Sijie
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Tue, Jan 29, 2019 at 6:30 AM Ezequiel Lovelle <
> > > > > >>>>>>>> ezequiellove...@gmail.com>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> "I agree, but that is *not what #3155 tries to achieve."
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> This typo made this phrase nonsense, sorry!
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Mon, 28 Jan 2019, 16:44 Ezequiel Lovelle <
> > > > > >>>>>> ezequiellove...@gmail.com
> > > > > >>>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>>> What exactly is the delayed delivery use case?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> This is helpful on systems relaying on pulsar for
> > persistent
> > > > > >>>>>>> guarantees
> > > > > >>>>>>>>>> and using it for synchronization or some sort of checks,
> > but
> > > > on
> > > > > >>>>>> such
> > > > > >>>>>>>>>> systems is common to have some overhead committing data on
> > > > > >>>>>> persistent
> > > > > >>>>>>>>>> storage maybe due to buffered mechanism or distributing
> > the
> > > > data
> > > > > >>>>>>> across
> > > > > >>>>>>>>>> the network before being available.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Surely would be more use cases I don't came across right
> > > now.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Random insertion and deletion is not what FIFO queues
> > like
> > > > > Pulsar
> > > > > >>>>>>> are
> > > > > >>>>>>>>>> designed for.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I agree, but that is now what #3155 tries to achieve.
> > #3155
> > > is
> > > > > >>>>>> just a
> > > > > >>>>>>>>>> fixed delay for all message in a consumer, that's the
> > reason
> > > > > that
> > > > > >>>>>> the
> > > > > >>>>>>>>>> implementation of #3155 is quite trivial.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> +1 from me for doing PIP-26 in functions.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> --
> > > > > >>>>>>>>>> *Ezequiel Lovelle*
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Sat, 26 Jan 2019 at 09:57, Yuva raj <uvar...@gmail.com
> > >
> > > > > wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Considering the way pulsar is built +1 for doing PIP-26
> > in
> > > > > >>>>>>> functions.
> > > > > >>>>>>>> I
> > > > > >>>>>>>>> am
> > > > > >>>>>>>>>>> more of thinking in a way like publish it pulsar we will
> > > make
> > > > > it
> > > > > >>>>>>>>> available
> > > > > >>>>>>>>>>> in a different queuing system if you need priority and
> > > delay
> > > > > >>>>>>> messages
> > > > > >>>>>>>>>>> support. Pulsar functions would go enough for this kind
> > of
> > > > use
> > > > > >>>>>>> cases.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On Fri, 25 Jan 2019 at 22:29, Ivan Kelly <
> > iv...@apache.org
> > > >
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Correct. PIP-26 can be implemented in Functions. I
> > > believe
> > > > > the
> > > > > >>>>>>>> last
> > > > > >>>>>>>>>>>>> discussion in PIP-26 thread kind of agree on functions
> > > > > >>>>>> approach.
> > > > > >>>>>>>>>>>>> If the community is okay with PIP-26 in functions, I
> > > think
> > > > > >>>>>> that
> > > > > >>>>>>> is
> > > > > >>>>>>>>>>>> probably
> > > > > >>>>>>>>>>>>> a good approach to start.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> +1 for doing it in functions.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> -Ivan
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> --
> > > > > >>>>>>>>>>> *Thanks*
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> *Yuvaraj L*
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > -Ali
> > >
> >

Reply via email to