Re: Review Request 72216: Added master validation for task resource limits and shared cgroups.

Qian Zhang Sun, 15 Mar 2020 02:05:21 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72216/#review219933
-----------------------------------------------------------





src/master/validation.cpp
Lines 1554-1561 (patched)
<https://reviews.apache.org/r/72216/#comment308164>

    What about the task which does not set `share_cgroups` (or even not set 
`ContainerInfo` at all)? In that case, I think its `share_cgroups` is also true 
(since it is true by default), right?
    
    And I think this validation will affect the command task case because 
command task must have `share_cgroups` as true and it is allowed to specify 
resource limits for it, right? So I think even a task's `share_cgroup` is true, 
we should still allow it to have resource limits specified if it is the first 
task of the executor, but we should disallow if it is the second task no matter 
whether the first task has resource limits specified or not because otherwise 
the second task's resource limits may not be enforced.
    
    But it seems hard to find out whether it is the first task of the executor. 
So another option could be, we just do not do this validation at all, i.e., a 
task can always specify a resource limits regardless the value of its 
`share_cgroups`, so task's resource limits will actually have two differen 
meanings:
    1. If task's `share_cgroups` is false, then the resource limits are its own 
limits.
    2. If task's `share_cgroups` is true, its resource limits will contribute 
into the executor's resource limits, all tasks of an executor will share the 
same resource limits which is the sum of all task's resource limits.
    
    For #2, actually that is what we are doing in our current implemention, 
e.g., if a customer executor launches multiple tasks, all these tasks will just 
share the same resource limits, i.e. the customer executor's resource limits 
(yes, we indeed always set memory hard limit to memory request and set CPU hard 
limit to CPU request if `--cgroups_enable_cfs` is true in our current 
implementation). So #2 sounds not that bad since it is backward compatibile?



src/master/validation.cpp
Lines 1574-1577 (patched)
<https://reviews.apache.org/r/72216/#comment308165>

    Why do we want to return error if `taskCpus` is none? The purpose of this 
`validateResourceLimits()` method should be validating resource limits rather 
than resource requests.



src/master/validation.cpp
Lines 1582-1587 (patched)
<https://reviews.apache.org/r/72216/#comment308166>

    Ditto.



src/master/validation.cpp
Lines 1832-1838 (patched)
<https://reviews.apache.org/r/72216/#comment308167>

    The new method `validateResourceLimits` that you introduced has already 
done this validation and that method will be called for both the tasks launched 
via `LAUNCH` operation and the tasks launched via `LAUNCH_GROUP` operation, so 
I think we do not need to do this validation again here.



src/master/validation.cpp
Lines 2008 (patched)
<https://reviews.apache.org/r/72216/#comment308168>

    I think just checking the tasks launched by this single `LAUNCH_GROUP` 
operation is not enough, we should also check the existing tasks launched by 
this executor before.


- Qian Zhang


On March 12, 2020, 3:10 a.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72216/
> -----------------------------------------------------------
> 
> (Updated March 12, 2020, 3:10 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Qian Zhang.
> 
> 
> Bugs: MESOS-10045
>     https://issues.apache.org/jira/browse/MESOS-10045
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added master validation for task resource limits and shared cgroups.
> 
> 
> Diffs
> -----
> 
>   src/master/validation.cpp 084f281eadd65cb8ae0a19b7b7797dc71ccebdd2 
> 
> 
> Diff: https://reviews.apache.org/r/72216/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

Re: Review Request 72216: Added master validation for task resource limits and shared cgroups.

Reply via email to