Re: Heartbeat mechanism auditing

2015-02-02 Thread Maxim Khutornenko
Chatted with davmclau, wfarner, kevints and jcohen. The consensus is to move forward with the state-based approach to ease troubleshooting from day one. I will update the RB unless there are objections to this approach. Brief design update summary: - there will be 2 new job update states: ROLL_FO

Re: Heartbeat mechanism auditing

2015-01-29 Thread David McLaughlin
On Thu, Jan 29, 2015 at 2:45 PM, Maxim Khutornenko wrote: > To add a bit of history to the topic, the current design has been > debated heavily here [1] and an active/lazy consensus was reached > around implementing the first iteration as lightweight as possible > without persisting any durable s

Re: Heartbeat mechanism auditing

2015-01-29 Thread Kevin Sweeney
+1, the implementation tradeoffs were discussed extensively in that thread. Regarding the potential user experience, my thought is that http://mail-archives.apache.org/mod_mbox/incubator-aurora-dev/201410.mbox/%3CCAAATh-bA0f4yPAoH8+xrwd=xkzhgqvm8nylle6ihha-hdes...@mail.gmail.com%3E presents an acc

Re: Heartbeat mechanism auditing

2015-01-29 Thread Bill Farner
I'm actually beginning to think that an explicit state for "waiting for a heartbeat" might be easier to implement than volatile state. In a world where job updates are fully automated, I could see a bunch of users asking why a job update made no progress for a period of time, so it's really nice i

Re: Heartbeat mechanism auditing

2015-01-29 Thread Bill Farner
Here's the permalink to the thread in question: http://mail-archives.apache.org/mod_mbox/incubator-aurora-dev/201410.mbox/%3CCAOTkfX7x2oipk4ZFysoS0uWZRizOnKJA3y15pvEW5K4YnUHw-A%40mail.gmail.com%3E -=Bill On Thu, Jan 29, 2015 at 2:45 PM, Maxim Khutornenko wrote: > To add a bit of history to the

Re: Heartbeat mechanism auditing

2015-01-29 Thread Maxim Khutornenko
To add a bit of history to the topic, the current design has been debated heavily here [1] and an active/lazy consensus was reached around implementing the first iteration as lightweight as possible without persisting any durable state. My take on this - we should proceed as originally proposed gi