-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75074/#review226771
-----------------------------------------------------------
Fix it, then Ship it!


src/slave/containerizer/mesos/containerizer.cpp
Line 526 (original), 526 (patched)
<https://reviews.apache.org/r/75074/#comment315062>

    nit: preserve the space between `]` and `(`


src/slave/containerizer/mesos/isolators/gpu/isolator.hpp
Lines 40 (patched)
<https://reviews.apache.org/r/75074/#comment315061>

    nit: alphabetize


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 109-118 (patched)
<https://reviews.apache.org/r/75074/#comment315063>

    nit: the indentation is off here?


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 160-162 (original), 199-203 (patched)
<https://reviews.apache.org/r/75074/#comment315064>

    This looks wrong: it can call hierarchy.error() when hierarchy is
    isSome() or isNone(), since it calls it as long as
    `!(*cgroups2Mounted)` holds. Did you want `&&` here?


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 308-310 (patched)
<https://reviews.apache.org/r/75074/#comment315065>

    This can be solved, by the way, but it seems fine to get the entire
    state anyway.


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 372 (patched)
<https://reviews.apache.org/r/75074/#comment315067>

    On the surface this looks wrong: we're passing larger and larger
    vectors in on each call, since we're assigning back to the future
    variable. The `__recover` call is going to copy the vector and
    append an entry, so there's no reason to pass the vector at all.
    Instead, the caller can do:

    ```
    futures.push_back(__recover(containerId, containerGpus));
    ```


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 451-459 (patched)
<https://reviews.apache.org/r/75074/#comment315068>

    We can simplify this by changing controlDeviceEntries to a hashmap
    or LinkedHashMap and calling .values() to get a vector out.


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 528-534 (original), 668-674 (patched)
<https://reviews.apache.org/r/75074/#comment315070>

    It looks like a helper would shorten this and the same conversion
    in `_update`.
    I'll leave as is, but it would be good to follow up.


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Line 534 (original), 674 (patched)
<https://reviews.apache.org/r/75074/#comment315069>

    Do we still want mknod here? I'll leave it, but it seems worth
    removing in v1 too.


src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 566-572 (original), 716-722 (patched)
<https://reviews.apache.org/r/75074/#comment315071>

    (See note above; this could be more concise with a helper.)


- Benjamin Mahler


On Aug. 3, 2024, 1:20 a.m., Jason Zhou wrote:

> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75074/
> -----------------------------------------------------------
> 
> (Updated Aug. 3, 2024, 1:20 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently, the GPU isolator assumes we are only using cgroups v1, and
> makes use of the cgroups::devices::allow and deny functions to control
> GPU access.
> 
> In cgroups v2, we need to attach eBPF programs for the specific
> cgroups, which is done for us in the DeviceManager. Hence, the GPU
> isolator needs to use the DeviceManager depending on whether cgroups
> v1 or v2 is currently mounted.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/containerizer.cpp 9bafb137ba67bfed5cfb24c1a3af9203fde9e821 
>   src/slave/containerizer/mesos/isolators/gpu/isolator.hpp e4f221d5f63ed1db044acabbbd427a30a0f69ced 
>   src/slave/containerizer/mesos/isolators/gpu/isolator.cpp 99119f938e2eb5ca6a8b64d073c87ca5032a00b8 
> 
> 
> Diff: https://reviews.apache.org/r/75074/diff/8/
> 
> 
> Testing
> -------
> 
> Existing GPU isolator tests pass, with the exception of the
> DefaultExecutorVerifyDeviceAccess test, which is because we currently
> don't support nested containers.
> 
> 
> Thanks,
> 
> Jason Zhou
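[Editor's note] The `&&`-vs-`||` issue flagged in comment 315064 can be illustrated with a minimal standalone sketch. The `Lookup` type and `takesErrorBranch` name below are illustrative stand-ins for the Try/Option-style hierarchy lookup in the patch, not Mesos's actual code:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Stand-in for the Try/Option-style result of looking up the devices
// cgroup hierarchy (hypothetical; simplified for illustration).
struct Lookup {
  std::optional<std::string> error_;
  bool isError() const { return error_.has_value(); }
  const std::string& error() const { return *error_; }  // only valid if isError()
};

// With `||`, the error branch would run whenever cgroups v2 is not
// mounted, even if `hierarchy` holds no error, so hierarchy.error()
// could be called on a non-error value. `&&` restricts the branch to
// the intended case: v2 not mounted AND the lookup actually failed.
bool takesErrorBranch(bool cgroups2Mounted, const Lookup& hierarchy) {
  return !cgroups2Mounted && hierarchy.isError();
}
```

Under this sketch, the branch fires only when both conditions hold, matching the reviewer's suggested intent.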
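[Editor's note] The simplification suggested in comment 315068 (keep entries in a map and pull out `.values()` rather than hand-building a vector) can be sketched as follows. `std::map` stands in for Mesos's hashmap/LinkedHashMap, and the `Entry` type is illustrative, not the actual device-entry type:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical device entry; field names are illustrative only.
struct Entry {
  std::string access;
};

// Generic values() helper: collect the mapped values of an ordered map
// into a vector, in key order. Mesos's hashmap types provide an
// equivalent accessor, which is what the review comment suggests using.
template <typename K, typename V>
std::vector<V> values(const std::map<K, V>& m) {
  std::vector<V> out;
  out.reserve(m.size());
  for (const auto& kv : m) {
    out.push_back(kv.second);
  }
  return out;
}
```

With the entries keyed by device path, the caller gets the vector in one call instead of looping to copy entries out.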
