I'm cool with #2, specifically if we do not attempt to parse the file and use its contents to determine the auto-expire time.
-=Bill

On Fri, Oct 10, 2014 at 2:48 PM, Joshua Cohen <jco...@twopensource.com> wrote:

> I'm in camp #2, I don't feel that it adds a significant amount of
> complexity to the health check logic, and it provides a substantial
> safeguard against users accidentally shooting themselves in the foot by
> accidentally leaving a health check snoozed.
>
> On Fri, Oct 10, 2014 at 2:32 PM, Maxim Khutornenko <ma...@apache.org>
> wrote:
>
> > +1 to the #1. Disabling health checks is like signing a waiver where
> > all health check guarantees are off.
> >
> > On Fri, Oct 10, 2014 at 2:23 PM, David Pan <david.p...@gmail.com> wrote:
> > > Hi Aurora,
> > >
> > > I am currently working on a feature that allows for health checks to be
> > > disabled temporarily for a running instance of a job. The code review can
> > > be found at https://reviews.apache.org/r/26383/. The idea is that the
> > > presence of a special "snooze file" in the task's sandbox will trigger the
> > > disabling of the health checks.
> > >
> > > Currently, the code reviewers have split off into two camps:
> > > 1. One set of reviewers believe that simplicity is key. Disable the health
> > > checks if the snooze file is present, enable it otherwise.
> > >
> > > 2. The other set of reviewers believe that there should be a snooze
> > > duration. The timer starts when the snooze file is touched. After the
> > > snooze duration is exhausted, the snooze file should be deleted by the
> > > health checker, and health checks resume. This is useful if the process
> > > that initially disabled the health checks dies unexpectedly, and is no
> > > longer there to re-enable the health checks.
> > >
> > > I would like to invite anyone interested to voice your opinions and chime
> > > in.
> > >
> > > Thanks,
> > >
> > > David Pan
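
For reference, option 2 as described in the quoted message could be sketched roughly as follows. This is only an illustration, not the code under review: the function name, its parameters, and the 10-minute default are all made up here. It assumes the file's modification time marks when the snooze file was touched, so the file's contents are never parsed.

```python
import os
import time

# Hypothetical default; the thread does not specify a snooze duration.
SNOOZE_DURATION_SECS = 600.0


def health_checks_snoozed(snooze_path, duration_secs=SNOOZE_DURATION_SECS, now=None):
    """Return True while the snooze file exists and has not yet expired.

    The timer starts at the file's mtime (i.e. when it was last touched).
    Once the duration is exhausted, the file is deleted so that health
    checks resume even if whoever created it is no longer around.
    """
    now = time.time() if now is None else now
    try:
        touched_at = os.path.getmtime(snooze_path)
    except OSError:
        return False  # no snooze file: health checks stay enabled

    if now - touched_at < duration_secs:
        return True  # still inside the snooze window

    try:
        os.remove(snooze_path)  # expired: clean up and resume checks
    except OSError:
        pass  # already removed by someone else
    return False
```

Dropping the expiry branch (returning True whenever the file exists) gives option 1.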