----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50841/#review146726 -----------------------------------------------------------
src/slave/containerizer/docker.hpp (lines 505 - 507) <https://reviews.apache.org/r/50841/#comment213320> I would just call this variable `gpus` Also the comment should read: ``` // The number of GPUs allocated to the container. ``` src/slave/containerizer/docker.cpp (line 653) <https://reviews.apache.org/r/50841/#comment213322> I would probably call this variable `count`. When I saw the name `requestedNvidiaGpu` I thought it was a specific GPU id being passed in, not a count. I would also call the function `allocateNvidiaGpus()` since you can allocate more than one with this function. src/slave/containerizer/docker.cpp (lines 662 - 664) <https://reviews.apache.org/r/50841/#comment213326> With type `size_t` you can never have a negative value., so just checking `== 0` should be enough here. src/slave/containerizer/docker.cpp (lines 666 - 668) <https://reviews.apache.org/r/50841/#comment213330> I would do this as the first check in this function. If we don't have an allocator set, then we really shouldn't even be calling this function regardless of anything else that is going on. Also, the string should read: ``` "The `allocateNvidiaGpu` function was called without an `NvidiaGpuAllocator` set". ``` src/slave/containerizer/docker.cpp (line 681) <https://reviews.apache.org/r/50841/#comment213343> I would rename this function `deallocateNvidiaGpus()` (i.e. with an `s` on the end). src/slave/containerizer/docker.cpp (line 684) <https://reviews.apache.org/r/50841/#comment213340> I would move this down just before its first use. src/slave/containerizer/docker.cpp (lines 690 - 692) <https://reviews.apache.org/r/50841/#comment213335> Again, I would move this up to the top of the function, with a similar Failure string as before. src/slave/containerizer/docker.cpp (lines 694 - 696) <https://reviews.apache.org/r/50841/#comment213350> Why do you need this level of indirection? Why not just pass `containers_[containerId]->gpuAllocated` directly to `nvidiaGpuAllocator->deallocate()`? src/slave/containerizer/docker.cpp (lines 698 - 710) <https://reviews.apache.org/r/50841/#comment213348> Why don't you just return from the `deallocate()` call with a `.then()`? I.e. ``` return nvidiaGpuAllocator->deallocate(deallocated) .then(defer(self(), [=](const Nothing& nothing) { containers_[containerId]->gpuAllocated.clear(); return Nothing(); })); ``` If any failures happen in the deallocation, they should get propagated through. src/slave/containerizer/docker.cpp (lines 1376 - 1379) <https://reviews.apache.org/r/50841/#comment213366> I would probably remove the intermediate variable above called `requested` and instead save a temporary `Future` to the `allocateNvidiaGpu()` call. I would also move all of the GPU logic together instead of separating it out. Something like: ``` Future<Nothing> allocateGpus = Nothing(); const Option<TaskInfo>& taskInfo = container->task; if (taskInfo.isNone()) { return Failure("No task information found"); } if (taskInfo->resources.gpus().isSome()) { // Make sure that the `gpus` resource is not fractional. // We rely on scalar resources to only have 3 digits of precision. if (static_cast<long long>(gpus.get() * 1000.0) % 1000 != 0) { return Failure("The 'gpus' resource must be an unsigned integer"); } allocateGpus = allocateNvidiaGpu(gpus.get(), containerId); } ... ... ... return allocateGpus .then(...) .then(...); ``` src/slave/containerizer/docker.cpp (line 1555) <https://reviews.apache.org/r/50841/#comment213368> I wouldn't just blindly call this function here. It should be wrapped in some logic that makes sure it's OK to call it (i.e. checks to make sure that we have the nvidia->allocator component passed in). Again, you could have some logic above which saves a temporary `Future` that is set to `Nothing()` by default and is the result of calling `deallocateNvidiaGpu()` otherwise. src/slave/containerizer/docker.cpp (line 2035) <https://reviews.apache.org/r/50841/#comment213372> Same here, use the temporary mentioned above to do the deallocation or not. - Kevin Klues On Aug. 22, 2016, 10:11 a.m., Yubo Li wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50841/ > ----------------------------------------------------------- > > (Updated Aug. 22, 2016, 10:11 a.m.) > > > Review request for mesos, Benjamin Mahler, Guangya Liu, Kevin Klues, and > Rajat Phull. > > > Bugs: MESOS-5795 > https://issues.apache.org/jira/browse/MESOS-5795 > > > Repository: mesos > > > Description > ------- > > Added control logic to allocate/deallocate GPUs to GPU-related task > when the task is started/terminated. > > > Diffs > ----- > > src/slave/containerizer/docker.hpp f2a06065cf99fed934c2c1ffc47461ec8a97f50d > src/slave/containerizer/docker.cpp 5c1ee8e467d1c54c60b67dc5275ef71e1bb90723 > > Diff: https://reviews.apache.org/r/50841/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Yubo Li > >
