+1 to #1. Disabling health checks is like signing a waiver: once they are off, all health check guarantees are off.

On Fri, Oct 10, 2014 at 2:23 PM, David Pan <david.p...@gmail.com> wrote:

> Hi Aurora,
>
> I am currently working on a feature that allows for health checks to be
> disabled temporarily for a running instance of a job. The code review can
> be found at https://reviews.apache.org/r/26383/. The idea is that the
> presence of a special "snooze file" in the task's sandbox will trigger the
> disabling of the health checks.
>
> Currently, the code reviewers have split off into two camps:
> 1. One set of reviewers believe that simplicity is key. Disable the health
> checks if the snooze file is present, enable it otherwise.
>
> 2. The other set of reviewers believe that there should be a snooze
> duration. The timer starts when the snooze file is touched. After the
> snooze duration is exhausted, the snooze file should be deleted by the
> health checker, and health checks resume. This is useful if the process
> that initially disabled the health checks dies unexpectedly, and is no
> longer there to re-enable the health checks.
>
> I would like to invite anyone interested to voice your opinions and chime
> in.
>
> Thanks,
>
> David Pan
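
For anyone comparing the two options in the quoted proposal, here is a rough Python sketch of how each check might look. It is not the code under review: the file name `.healthchecksnooze`, the function names, and the use of the file's modification time as the "touched" timestamp are all assumptions made for illustration.

```python
import os
import time

# Hypothetical snooze file name; the actual name is defined in the review.
SNOOZE_FILE = ".healthchecksnooze"


def snoozed_simple(sandbox):
    """Option 1: health checks are off exactly while the snooze file exists."""
    return os.path.exists(os.path.join(sandbox, SNOOZE_FILE))


def snoozed_with_duration(sandbox, duration_secs, now=None):
    """Option 2: the snooze expires duration_secs after the file was touched.

    Assumes the file's mtime marks when the snooze started. When the
    snooze has expired, the file is deleted so health checks resume.
    """
    path = os.path.join(sandbox, SNOOZE_FILE)
    try:
        touched = os.path.getmtime(path)
    except OSError:
        return False  # no snooze file, so health checks stay enabled
    if now is None:
        now = time.time()
    if now - touched < duration_secs:
        return True
    try:
        os.remove(path)  # expired: clean up so the checks come back on
    except OSError:
        pass  # someone else already removed it
    return False
```

Option 1 leaves cleanup entirely to whoever created the file. Option 2 adds a timer and a delete path to the health checker in exchange for recovering when that process dies.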