We are working on AURORA-690 to support external service coordinated
job updates. The feature design was proposed in [1] and discussed in
[2].

The one remaining question I would like to discuss here is how to
expose the coordinated update configuration to the user. The
approaches as I see them:

1. Expose "blockIfNoPulsesAfterMs" directly in UpdateConfig (as
pulse_interval_secs), requiring the user to supply its value to
indicate a coordinated update:
    ...
    update_config = UpdateConfig(pulse_interval_secs=60)
    ...
While this is the most straightforward to implement, it may not meet
users' expectations. The external service may be unable to match the
requested job health refresh rate, and a short interval could waste
scheduler performance on unnecessary pulseJobUpdate RPC calls. We
could enforce a sane lower bound on pulse_interval_secs to address the
latter, but that would still not address the mismatched health refresh
rate.
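
To make the lower-bound idea concrete, here is a rough sketch of how
the client might convert pulse_interval_secs into
"blockIfNoPulsesAfterMs". The function name and the 60-second minimum
are hypothetical placeholders, not existing Aurora client code:

```python
from typing import Optional

# Hypothetical lower bound; the actual value is still to be decided.
MIN_PULSE_INTERVAL_SECS = 60


def to_block_if_no_pulses_after_ms(pulse_interval_secs: Optional[int]) -> Optional[int]:
    """Convert the user-facing setting into the scheduler-side timeout.

    Returns None when the setting is absent (a regular, non-coordinated
    update) and rejects intervals below the hardcoded minimum.
    """
    if pulse_interval_secs is None:
        return None
    if pulse_interval_secs < MIN_PULSE_INTERVAL_SECS:
        raise ValueError(
            "pulse_interval_secs must be at least %d, got %d"
            % (MIN_PULSE_INTERVAL_SECS, pulse_interval_secs)
        )
    return pulse_interval_secs * 1000
```

The check only guards the scheduler against overly frequent pulses; it
cannot tell whether the external service can actually keep up with the
requested interval.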


2. Expose a flag in UpdateConfig and hardcode a large enough (e.g. 1
minute) interval internally. The Aurora client would then set
"blockIfNoPulsesAfterMs" to the default interval when the
require_update_pulse flag is set:
    ...
    update_config = UpdateConfig(require_update_pulse=True)
    ...
This is more user friendly but less flexible if requirements change,
and it still does not protect against changes in the external
service's health refresh rate.
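
As a rough sketch, the client-side mapping for this option could be as
small as the following. The constant and function names are
hypothetical; only the 1-minute default comes from the proposal above:

```python
from typing import Optional

# Hardcoded default, matching the "1 minute" example above.
DEFAULT_PULSE_INTERVAL_SECS = 60


def block_if_no_pulses_after_ms(require_update_pulse: bool) -> Optional[int]:
    """Map the boolean flag onto the scheduler-side timeout.

    None means the update is not coordinated and never blocks on pulses.
    """
    if not require_update_pulse:
        return None
    return DEFAULT_PULSE_INTERVAL_SECS * 1000
```

Changing the interval later would then mean shipping a new client
rather than editing job configs.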


3. Do not expose any coordinated update settings in the public schema
and require the external service to act as a job update request proxy,
mutating the job update config on the fly before passing it to the
scheduler.
This is ideal in that the external service controls the health refresh
rate, but it may require too much hacking: we don't have a private job
config schema, and relaying the user's identity via an external
service is no fun from a security perspective.
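
For illustration only, the proxy step might look roughly like this.
The dict layout (a "settings" key), the function name, and the
interval constant are all assumptions made for the sketch; the real
request would be a Thrift struct, and the identity relaying problem is
not shown at all:

```python
import copy
from typing import Any, Callable, Dict

# Hypothetical interval owned by the external service, not by the user.
SERVICE_PULSE_INTERVAL_MS = 60_000


def proxy_update_request(
    request: Dict[str, Any],
    forward: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Inject the pulse timeout, then relay the request to the scheduler.

    `forward` stands in for whatever call actually reaches the scheduler.
    """
    mutated = copy.deepcopy(request)  # leave the caller's request untouched
    settings = mutated.setdefault("settings", {})
    settings["blockIfNoPulsesAfterMs"] = SERVICE_PULSE_INTERVAL_MS
    return forward(mutated)
```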


Any other options? I am personally leaning towards #1 with a hardcoded
minimum value check as the simplest solution. Users will need to know
what refresh rate their health monitoring system is capable of in
order to configure pulse_interval_secs accordingly.
Thoughts?

Thanks,
Maxim


[1] - 
https://github.com/maxim111333/incubator-aurora/blob/hb_doc/docs/update-heartbeat.md

[2] - 
http://mail-archives.apache.org/mod_mbox/aurora-dev/201410.mbox/%3ccaotkfx7x2oipk4zfysos0uwzrizonkja3y15pvew5k4ynuh...@mail.gmail.com%3E
