Also relevant to this is AURORA-279, which suggests that we may not want to special-case the startup phase.
Additional context - we should lean towards using Mesos' framework messages as the communication medium. These messages are one-way, rather than request-response based. This seems to rule out or at least complicate (D).

(B) actually sounds interesting to me. The executor could start notifying the scheduler of health check results, triggered by an edge (unhealthy -> healthy, and vice versa).

-=Bill

On Tue, Dec 2, 2014 at 1:53 PM, Nakamura <nny...@gmail.com> wrote:

> Howdy,
>
> I'm interested in tackling AURORA-894, but I'm not terribly familiar with
> Aurora, so I'd like some feedback on my design before I go forth.
>
> Bill pointed out that the hard bit would be designing the algorithm so it
> doesn't DDoS the scheduler, and I think I have an idea of the possible
> design space. I wanted to know what you thought.
>
> A. Sample the number of health checks, and send them back to the
> scheduler. This is pretty simple, but 99% of the time will be total noise,
> since the data isn't generally useful.
>
> B. The executor sends health checks until it receives an out-of-band
> request from the scheduler not to. This seems fragile (I'm imagining
> mismatched executors/schedulers behaving poorly) but would also probably be
> reasonably simple.
>
> C. A slightly more sophisticated approach might be to tell the executor
> how many health checks to look for, so that it could send a status update
> back, since status updates have reliable delivery.
>
> D. When the scheduler has finished standing up the executor, it long-polls,
> which also takes care of reliable delivery because it's presumably over TCP
> and we have total control (not having to go through Mesos).
>
> I'm hesitant to do A, because it's so wasteful. B sounds fragile, so I
> don't want to do that one. D requires long-polling, which your client may
> or may not do well. I'm leaning toward C. Do you think that sounds like a
> reasonable approach?
>
> Thanks,
> Moses
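
For illustration only, the edge-triggered variant of (B) described in the reply above could be sketched roughly as follows. This is not Aurora code: the class name `EdgeTriggeredHealthReporter`, the `send` callback (a stand-in for whatever delivers a one-way framework message to the scheduler), and the message payloads are all hypothetical. The sketch also assumes that the very first health check result counts as an edge.

```python
from typing import Callable, Optional


class EdgeTriggeredHealthReporter:
    """Forwards a health check result only when it differs from the previous one."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        # `send` is a hypothetical stand-in for the one-way channel from
        # executor to scheduler (e.g. a framework message).
        self._send = send
        # None until the first check runs, so the first result is reported.
        self._last: Optional[bool] = None

    def record(self, healthy: bool) -> bool:
        """Record one check result; return True if a message was sent."""
        if healthy == self._last:
            return False  # no transition, so stay quiet
        self._last = healthy
        self._send(b"healthy" if healthy else b"unhealthy")
        return True


# Usage: six health checks produce only three messages, one per transition.
sent = []
reporter = EdgeTriggeredHealthReporter(sent.append)
for result in [False, False, True, True, True, False]:
    reporter.record(result)
```

Because a message goes out only on a transition, a task that stays healthy generates no steady-state traffic, which speaks to the concern about flooding the scheduler. The sketch does nothing about lost messages; since the channel is one-way, any acknowledgement or retry scheme would have to be layered on separately.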