I presume you are referring to case #5?

Joshua correctly pointed out the assumption I made (and probably should
have documented): the pause/resume actions would issue the relevant
stop/start calls to the monitoring service so that it suspends or resumes
heartbeats. You may argue that this places unnecessary implementation
restrictions on the monitoring service, and I tend to agree.

> I'm not sure what a monitoring/heartbeating service would do with its
> record of a paused job while it's paused. How would it know when to resume
> monitoring and sending heartbeats without some notification that the update
> itself has resumed? By the same token, if monitoring were to continue while
> an update was paused and an alert were to fire, what action would the
> monitoring service take with regard to the paused update?


Perhaps just sending OK (or a NOOP equivalent) in the case of a user-paused
job update would make more sense, as there is nothing the monitoring service
could do in that case. This should work fine whether or not the monitoring
service implementation is aware of pause/resume.
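
To make the idea concrete, here is a rough sketch in Python. It is not the
scheduler's actual code: the names UpdateState, HeartbeatResponse and
respond_to_heartbeat are made up for illustration, and answering STOP only
for a terminal update is an assumption on my part.

```python
from enum import Enum


class UpdateState(Enum):
    """Hypothetical, simplified update states for this sketch."""
    ACTIVE = "active"
    PAUSED_BY_USER = "paused_by_user"
    TERMINAL = "terminal"


class HeartbeatResponse(Enum):
    """Hypothetical responses to a heartbeatJobUpdate call."""
    OK = "OK"
    STOP = "STOP"


def respond_to_heartbeat(state: UpdateState) -> HeartbeatResponse:
    """Pick the response to a heartbeat for an update in the given state."""
    if state is UpdateState.TERMINAL:
        # Assumption: nothing is left to monitor once the update is done.
        return HeartbeatResponse.STOP
    # Active and user-paused updates both get OK. For a paused update the
    # heartbeat is effectively a no-op, but the monitoring service keeps
    # running, so it needs no pause/resume notification.
    return HeartbeatResponse.OK
```

With this, a monitoring service that knows nothing about pause/resume simply
keeps sending heartbeats and keeps getting OK until the update reaches a
terminal state.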

On Fri, Oct 10, 2014 at 1:43 PM, Joshua Cohen <jco...@twopensource.com>
wrote:

> In theory when the user resumes they'd also resume monitoring (and thus
> resume heartbeats)? Maybe the resumeJobUpdate RPC needs to support
> pauseIfNoHeartbeatsAfterMs as well?
>
> I'm not sure what a monitoring/heartbeating service would do with its
> record of a paused job while it's paused. How would it know when to resume
> monitoring and sending heartbeats without some notification that the update
> itself has resumed? By the same token, if monitoring were to continue while
> an update was paused and an alert were to fire, what action would the
> monitoring service take with regard to the paused update?
>
> On Fri, Oct 10, 2014 at 1:28 PM, David McLaughlin <da...@dmclaughlin.com>
> wrote:
>
> >    - A heartbeatJobUpdate RPC is called with the matching update ID.
> >    Scheduler resets countdown and responds with STOP
> >
> > Paused is a tricky state because the user can resume at any time. I'd
> > propose we have a different response here. You really don't want to "stop"
> > monitoring the update while it is in a non-terminal state. You might want
> > to be aware that your heartbeat is a no-op, though.
> >
> > On Fri, Oct 10, 2014 at 12:47 PM, Maxim Khutornenko <ma...@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > We are proposing a new feature for the scheduler updater, which you
> > > may find helpful.
> > >
> > > I have posed a brief feature summary here:
> > >
> > > https://github.com/maxim111333/incubator-aurora/blob/hb_doc/docs/update-heartbeat.md
> > >
> > > Please, reply with your feedback/concerns/comments.
> > >
> > > Thanks,
> > > Maxim
> > >
> >
>
