In theory, when the user resumes the update they'd also resume monitoring (and thus resume heartbeats)? Maybe the resumeJobUpdate RPC needs to support pauseIfNoHeartbeatsAfterMs as well?
I'm not sure what a monitoring/heartbeating service would do with its record of a paused job while it's paused. How would it know when to resume monitoring and sending heartbeats without some notification that the update itself has resumed? By the same token, if monitoring were to continue while an update was paused and an alert were to fire, what action would the monitoring service take with regard to the paused update?

On Fri, Oct 10, 2014 at 1:28 PM, David McLaughlin <da...@dmclaughlin.com> wrote:

> - A heartbeatJobUpdate RPC is called with the matching update ID.
> Scheduler resets countdown and responds with STOP
>
> Paused is a tricky state because the user can resume at any time. I'd
> propose we have a different response here. You really don't want to "stop"
> monitoring the update while it is in a non-terminal state. You might want
> to be aware that your heartbeat is a no-op, though.
>
> On Fri, Oct 10, 2014 at 12:47 PM, Maxim Khutornenko <ma...@apache.org>
> wrote:
>
> > Hi all,
> >
> > We are proposing a new feature for the scheduler updater, which you
> > may find helpful.
> >
> > I have posted a brief feature summary here:
> >
> > https://github.com/maxim111333/incubator-aurora/blob/hb_doc/docs/update-heartbeat.md
> >
> > Please, reply with your feedback/concerns/comments.
> >
> > Thanks,
> > Maxim
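
To make the two ideas in this thread concrete, here is a rough sketch of how the scheduler side might behave if a heartbeat against a paused update got its own response instead of STOP, and if resuming re-armed the countdown with a fresh pauseIfNoHeartbeatsAfterMs value. This is not the Aurora API: the `ToyScheduler` class, the `OK` and `PAUSED` response names, and the explicit `now_ms` arguments are all assumptions made up for illustration. Only STOP and the RPC and parameter names come from the discussion above.

```python
from enum import Enum


class State(Enum):
    ROLLING = 1
    PAUSED = 2
    FINISHED = 3  # stands in for any terminal state


class Response(Enum):
    OK = 1      # hypothetical: countdown was reset
    PAUSED = 2  # hypothetical: heartbeat acknowledged, but it is a no-op
    STOP = 3    # update is terminal or unknown, so stop monitoring


class ToyScheduler:
    """In-memory stand-in for the scheduler side of the heartbeat protocol."""

    def __init__(self):
        self.updates = {}  # update_id -> state, timeout and deadline

    def start(self, update_id, timeout_ms, now_ms):
        self.updates[update_id] = {
            "state": State.ROLLING,
            "timeout_ms": timeout_ms,
            "deadline_ms": now_ms + timeout_ms,
        }

    def resume(self, update_id, timeout_ms, now_ms):
        # Mirrors the idea of resumeJobUpdate accepting
        # pauseIfNoHeartbeatsAfterMs: resuming re-arms the countdown.
        self.start(update_id, timeout_ms, now_ms)

    def finish(self, update_id):
        self.updates[update_id]["state"] = State.FINISHED

    def heartbeat(self, update_id, now_ms):
        update = self.updates.get(update_id)
        if update is None or update["state"] is State.FINISHED:
            return Response.STOP
        if update["state"] is State.PAUSED:
            # Non-terminal: the monitor should keep watching, but learns
            # that this heartbeat had no effect.
            return Response.PAUSED
        update["deadline_ms"] = now_ms + update["timeout_ms"]
        return Response.OK

    def tick(self, now_ms):
        # Pause any rolling update whose heartbeat countdown has expired.
        for update in self.updates.values():
            if update["state"] is State.ROLLING and now_ms >= update["deadline_ms"]:
                update["state"] = State.PAUSED
```

Under these assumptions, a monitoring service that keeps heartbeating through a pause would see its responses change from `PAUSED` back to `OK` once the user resumes, which is one possible way for it to learn about the resume without a separate notification.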