> On Oct. 17, 2016, 6:53 a.m., haosdent huang wrote:
> > src/health-check/health_checker.cpp, lines 206-217
> > <https://reviews.apache.org/r/52865/diff/1/?file=1537866#file1537866line206>
> >
>     Now that we never stop health checking, `consecutiveFailures` may drop 
> > back to 0 after a later success. `killTask` would then change from `true` 
> > to `false` here. Is this the expected behaviour?

Very good point, Haosdent.

The problem here is that **one entity decides** when a task should be killed, 
but **another entity enforces** this. The first one cannot really control what 
the second does. What is the least surprising behaviour in that unfortunate 
architecture? My opinion is to reset if the second entity, i.e. the executor, 
does not comply.
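
To make the flip concrete, here is a minimal sketch of the counter logic. It is a simplified stand-in, not the actual code in `health_checker.cpp`: `FailureTracker`, `HealthStatus`, `record` and `threshold` are hypothetical names, and I assume `killTask` is recomputed from the counter on every check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the status a health checker reports.
// This is not the real Mesos message type.
struct HealthStatus {
  bool healthy;
  bool killTask;
  uint32_t consecutiveFailures;
};

// Hypothetical tracker: counts consecutive failures and never stops checking.
class FailureTracker {
public:
  explicit FailureTracker(uint32_t threshold) : threshold_(threshold) {
    assert(threshold > 0);
  }

  // Records the result of one check. A success resets the counter, so
  // `killTask` can go from `true` back to `false` if the executor has not
  // killed the task in the meantime.
  HealthStatus record(bool success) {
    if (success) {
      consecutiveFailures_ = 0;
    } else {
      ++consecutiveFailures_;
    }

    return HealthStatus{
        success, consecutiveFailures_ >= threshold_, consecutiveFailures_};
  }

private:
  const uint32_t threshold_;
  uint32_t consecutiveFailures_ = 0;
};
```

With a threshold of 2, two failed checks yield `killTask == true`, and a subsequent successful check yields `killTask == false` again, which is the reset behaviour I am arguing for.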

A better architecture would be to separate "health checker" from "unhealthy 
policy enforcer". As we've already agreed, we need a "global" health check 
policy, see [MESOS-6171](https://issues.apache.org/jira/browse/MESOS-6171). 
With two "unhealthy policies", local and global, the health checker library 
should simply report the health status, while the executor will apply one of 
the policies (that may still be implemented in a health checker library for 
code reuse). If you think this makes sense, do you mind filing a ticket about 
this?
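
A rough sketch of what such a split could look like, under my own assumptions: none of these types exist in Mesos, all names (`HealthReport`, `UnhealthyPolicy`, `MaxFailuresPolicy`, `PolicyEnforcer`) are made up for illustration, and latching the kill decision in the enforcer is just one possible choice.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical: what the health checker library reports. No kill decision.
struct HealthReport {
  bool healthy;
  uint32_t consecutiveFailures;
};

// Hypothetical policy interface; local and global policies would both
// implement it.
class UnhealthyPolicy {
public:
  virtual ~UnhealthyPolicy() = default;
  virtual bool shouldKill(const HealthReport& report) const = 0;
};

// Hypothetical "local" policy: kill after N consecutive failures.
class MaxFailuresPolicy : public UnhealthyPolicy {
public:
  explicit MaxFailuresPolicy(uint32_t maxFailures)
    : maxFailures_(maxFailures) {
    assert(maxFailures > 0);
  }

  bool shouldKill(const HealthReport& report) const override {
    return report.consecutiveFailures >= maxFailures_;
  }

private:
  const uint32_t maxFailures_;
};

// Hypothetical executor-side enforcer. Since it both decides and enforces,
// it can latch the decision: a later success does not revoke a kill.
class PolicyEnforcer {
public:
  explicit PolicyEnforcer(std::unique_ptr<UnhealthyPolicy> policy)
    : policy_(std::move(policy)) {
    assert(policy_ != nullptr);
  }

  // Returns true once the task should be killed.
  bool onReport(const HealthReport& report) {
    killing_ = killing_ || policy_->shouldKill(report);
    return killing_;
  }

private:
  std::unique_ptr<UnhealthyPolicy> policy_;
  bool killing_ = false;
};
```

Here the checker only produces `HealthReport`s, and the entity that decides is the same one that enforces, so the `true` to `false` question above does not arise.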


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52865/#review152827
-----------------------------------------------------------


On Oct. 14, 2016, 12:37 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52865/
> -----------------------------------------------------------
> 
> (Updated Oct. 14, 2016, 12:37 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Benjamin Mahler, Gastón Kleiman, 
> and haosdent huang.
> 
> 
> Bugs: MESOS-5963
>     https://issues.apache.org/jira/browse/MESOS-5963
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Prior to this patch, HealthChecker would stop performing health
> checks after it marks the task for kill. Since tasks' lifecycle
> is managed by scheduler-executor, HealthChecker should never stop
> health checking on its own.
> 
> 
> Diffs
> -----
> 
>   src/docker/executor.cpp ab3f0473fdc9105d1c425f0dbe7b81c566d541e8 
>   src/health-check/health_checker.hpp 392b4d5bd1e5831994b9366c1eb5a2911e19860f 
>   src/health-check/health_checker.cpp 96ae1a733ff3d211b84d0893b4603873af1c89f0 
>   src/launcher/default_executor.cpp af4a97f7de5f2157aa65fdab742455b0683c40a4 
>   src/launcher/executor.cpp 3e95d6029bea0ce6e0dfb39c24b795fe98d90d13 
>   src/tests/health_check_tests.cpp 1d1676d7259bf52cfb1e499954fa815fe7e37522 
> 
> Diff: https://reviews.apache.org/r/52865/diff/
> 
> 
> Testing
> -------
> 
> See https://reviews.apache.org/r/52873/.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>
