> On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 291 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line291> > > > > nit: always place .then on the next line (look around for example > > formatting)
done > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 292 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line292> > > > > nit: 4 space indent on paren wrapping done > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 335 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line335> > > > > nit: remove extra newline done > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 337-339 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line337> > > > > comment on what this means? and why it's skipped? If the cgroup does not have a recorded state in the DeviceManager, then no gpu would be granted access anyway. So we can skip to the cgroup for the next container state. The above comment has been added in code for posterity. > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 340-341 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line340> > > > > nit: take a const ref rather than copy? done > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 380 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line382> > > > > in general if taking a mutable, we would take a pointer, however, why > > not just return these futures here? or inline this before the collect? Didn't want to inline it since the logic for cgroups v1 and v2 are using the same code for pushing into the futures vector. So we can opt to return the futures instead and take in vector<Future<Nothing>>& > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 380 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line382> > > > > in general if taking a mutable, we would take a pointer, however, why > > not just return these futures here? or inline this before the collect? duplicated > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 425-430 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line430> > > > > can you just use controlDeviceEntries.entries() inline rather than the > > loop here? > > > > or initialize a vector with .begin() / .end() controlDeviceEntries is std::map. We can initialize a vector using a lambda that makes use of std::transform but it will get quite messy and make the logic a lot less readable. I'm not sure what the entries() function is referring to though, don't think we have something like that in std::map. An example of the code if we were to use it to inline the call to NonWildcardEntry::create(): ``` vector<DeviceManager::NonWildcardEntry> control_device_entries_vector = CHECK_NOTERROR( DeviceManager::NonWildcardEntry::create([&controlDeviceEntries]() { std::vector<cgroups::devices::Entry> controlDeviceEntriesValues; controlDeviceEntriesValues.reserve(controlDeviceEntries.size()); std::transform( controlDeviceEntries.begin(), controlDeviceEntries.end(), std::back_inserter(controlDeviceEntriesValues), [](const std::pair<const int, cgroups::devices::Entry>& pair) { return pair.second; }); return controlDeviceEntriesValues; }())); ``` > On Aug. 2, 2024, 11:30 p.m., Benjamin Mahler wrote: > > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > > Lines 681-685 (patched) > > <https://reviews.apache.org/r/75074/diff/6/?file=2292400#file2292400line686> > > > > seems like this could be shortened with an asDeviceEntries? added helper fn to use inline for reconfigure() - Jason ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/75074/#review226769 ----------------------------------------------------------- On Aug. 2, 2024, 7:13 p.m., Jason Zhou wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/75074/ > ----------------------------------------------------------- > > (Updated Aug. 2, 2024, 7:13 p.m.) > > > Review request for mesos and Benjamin Mahler. > > > Repository: mesos > > > Description > ------- > > Currently, the GPU isolator assumes we are only using cgroups v1, and > makes use of the cgroups::devices::allow and deny functions to control > GPU access. > > In Cgroups2, we need to attach ebpf programs for the specific cgroups, > which is done for us in the DeviceManager. Hence, we need to use the > DeviceManager in the GPU isolator depending on whether cgroups v1 or v2 > is currently mounted > > > Diffs > ----- > > src/slave/containerizer/mesos/containerizer.cpp > 9bafb137ba67bfed5cfb24c1a3af9203fde9e821 > src/slave/containerizer/mesos/isolators/gpu/isolator.hpp > e4f221d5f63ed1db044acabbbd427a30a0f69ced > src/slave/containerizer/mesos/isolators/gpu/isolator.cpp > 99119f938e2eb5ca6a8b64d073c87ca5032a00b8 > > > Diff: https://reviews.apache.org/r/75074/diff/6/ > > > Testing > ------- > > Existing GPU isolator tests pass, with the exception of the > DefaultExecutorVerifyDeviceAccess test which is because we currently don't > support nested containers. > > > Thanks, > > Jason Zhou > >
