Another option is to add a process to your task that looks something like Process(name = 'timeout', cmdline = 'sleep 3600; false'). If the task is still running after 3600 seconds, that process exits with a failure, which causes the whole task to fail. The only issue I can think of is that the task will also run for the full 3600 seconds even if the main process succeeds earlier, because the task waits for the timeout process to finish. You may be able to get around this by setting ephemeral=True on the timeout process (though I'm not sure whether an ephemeral process failure will cause the whole task to fail; this is something I can double-check when I'm in front of a computer with Thermos installed).
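For concreteness, here is a rough sketch of how that could look in an .aurora config. Only the 'timeout' process comes from the suggestion above; the 'main' process, its command, the task name, and the resource sizes are placeholders, and the ephemeral=True behavior is untested:

```python
# Fragment of an .aurora config (evaluated by the Aurora client, not standalone Python).
main = Process(
  name = 'main',
  cmdline = './run_job.sh')      # placeholder for your real command

timeout = Process(
  name = 'timeout',
  cmdline = 'sleep 3600; false', # exits non-zero after 3600 seconds
  ephemeral = True)              # untested: unclear whether its failure fails the task

task = Task(
  name = 'job_with_timeout',     # placeholder name
  processes = [main, timeout],   # no ordering constraints, so both start together
  resources = Resources(cpu = 1.0, ram = 128*MB, disk = 128*MB))
```

If the ephemeral failure turns out not to fail the task, dropping ephemeral=True gives the original behavior, at the cost of the task always running for the full 3600 seconds.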
On Tue, Feb 24, 2015 at 11:45 PM, Joseph Smith <yasumo...@gmail.com> wrote:

> Very good question.. to my knowledge there is not a ‘time’ constraint.
>
> However, you could implement this in a few ways. One of my first thoughts
> is to setup a custom StatusChecker
> <https://github.com/apache/incubator-aurora/blob/e6e7e53d92b52d78960824022bef8a0546002180/src/main/python/apache/aurora/executor/common/status_checker.py#L68>
> which checks the length of a task's runtime. StatusCheckers can return an
> ExitState
> <https://github.com/apache/incubator-aurora/blob/e6e7e53d92b52d78960824022bef8a0546002180/src/main/python/apache/aurora/executor/common/status_checker.py#L27>
> which can end a task. FAILED will allow a Service() to be restarted, but
> KILLED should (if I’m following right) actually prevent that from being
> rescheduled unless a user manually reschedules it, which may or may not be
> what you’re looking for.
>
> An example of this is the HealthChecker
> <https://github.com/apache/incubator-aurora/blob/467bc56049cc775eaf61520a464b363d44023024/src/main/python/apache/aurora/executor/common/health_checker.py>,
> which causes a task to go into ‘FAILED’ if it does not pass a specified
> health check.
>
> Please let me know if that makes sense!
> Joe
>
>
> On Feb 24, 2015, at 19:11, Yuan <yuans4y...@gmail.com> wrote:
> >
> > Hello,
> >
> > In apache aurora, there are resource isolations and sizings on CPU,
> > memory and disk space, which can be specified in the job configuration
> > file. Is there any similar way to put a constraint on job running time,
> > like killing a job if it has been running for more than a certain amount
> > of time?
> >
> > Thanks,
> > Yuan
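
To illustrate the StatusChecker idea from Joe's message, here is a minimal, self-contained sketch of the logic only. The ExitState class below is a local stand-in, not Aurora's real class, and RuntimeLimitChecker, max_runtime_secs, and the clock parameter are hypothetical names; a real checker would subclass the StatusChecker in the linked status_checker.py and follow whatever interface that version defines.

```python
import time


class ExitState(object):
  """Local stand-in for illustration; NOT Aurora's real ExitState."""
  FAILED = 'FAILED'
  KILLED = 'KILLED'


class RuntimeLimitChecker(object):
  """Hypothetical checker: reports an exit state once a runtime limit is exceeded."""

  def __init__(self, max_runtime_secs, exit_state=ExitState.FAILED, clock=time.time):
    self._max_runtime_secs = max_runtime_secs
    self._exit_state = exit_state
    self._clock = clock          # injectable so the logic can be tested without waiting
    self._start = clock()        # assumes the checker is created when the task starts

  @property
  def status(self):
    """Return None while under the limit, otherwise the configured exit state."""
    elapsed = self._clock() - self._start
    if elapsed > self._max_runtime_secs:
      return self._exit_state
    return None


# Usage with a fake clock, so the example does not actually wait an hour.
fake_now = [0.0]
checker = RuntimeLimitChecker(3600, clock=lambda: fake_now[0])
print(checker.status)   # None: still under the limit
fake_now[0] = 3601.0
print(checker.status)   # FAILED
```

Passing exit_state=ExitState.KILLED instead would correspond to the variant Joe mentions, where the task should not be rescheduled automatically.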