----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72216/#review219933 -----------------------------------------------------------
src/master/validation.cpp Lines 1554-1561 (patched) <https://reviews.apache.org/r/72216/#comment308164> What about the task which does not set `share_cgroups` (or even not set `ContainerInfo` at all)? In that case, I think its `share_cgroups` is also true (since it is true by default), right? And I think this validation will affect the command task case because command task must have `share_cgroups` as true and it is allowed to specify resource limits for it, right? So I think even a task's `share_cgroup` is true, we should still allow it to have resource limits specified if it is the first task of the executor, but we should disallow if it is the second task no matter whether the first task has resource limits specified or not because otherwise the second task's resource limits may not be enforced. But it seems hard to find out whether it is the first task of the executor. So another option could be, we just do not do this validation at all, i.e., a task can always specify a resource limits regardless the value of its `share_cgroups`, so task's resource limits will actually have two differen meanings: 1. If task's `share_cgroups` is false, then the resource limits are its own limits. 2. If task's `share_cgroups` is true, its resource limits will contribute into the executor's resource limits, all tasks of an executor will share the same resource limits which is the sum of all task's resource limits. For #2, actually that is what we are doing in our current implemention, e.g., if a customer executor launches multiple tasks, all these tasks will just share the same resource limits, i.e. the customer executor's resource limits (yes, we indeed always set memory hard limit to memory request and set CPU hard limit to CPU request if `--cgroups_enable_cfs` is true in our current implementation). So #2 sounds not that bad since it is backward compatibile? src/master/validation.cpp Lines 1574-1577 (patched) <https://reviews.apache.org/r/72216/#comment308165> Why do we want to return error if `taskCpus` is none? The purpose of this `validateResourceLimits()` method should be validating resource limits rather than resource requests. src/master/validation.cpp Lines 1582-1587 (patched) <https://reviews.apache.org/r/72216/#comment308166> Ditto. src/master/validation.cpp Lines 1832-1838 (patched) <https://reviews.apache.org/r/72216/#comment308167> The new method `validateResourceLimits` that you introduced has already done this validation and that method will be called for both the tasks launched via `LAUNCH` operation and the tasks launched via `LAUNCH_GROUP` operation, so I think we do not need to do this validation again here. src/master/validation.cpp Lines 2008 (patched) <https://reviews.apache.org/r/72216/#comment308168> I think just checking the tasks launched by this single `LAUNCH_GROUP` operation is not enough, we should also check the existing tasks launched by this executor before. - Qian Zhang On March 12, 2020, 3:10 a.m., Greg Mann wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72216/ > ----------------------------------------------------------- > > (Updated March 12, 2020, 3:10 a.m.) > > > Review request for mesos, Andrei Budnik and Qian Zhang. > > > Bugs: MESOS-10045 > https://issues.apache.org/jira/browse/MESOS-10045 > > > Repository: mesos > > > Description > ------- > > Added master validation for task resource limits and shared cgroups. > > > Diffs > ----- > > src/master/validation.cpp 084f281eadd65cb8ae0a19b7b7797dc71ccebdd2 > > > Diff: https://reviews.apache.org/r/72216/diff/4/ > > > Testing > ------- > > > Thanks, > > Greg Mann > >
