> On Aug. 7, 2024, 12:44 a.m., Benjamin Mahler wrote:
> > src/slave/containerizer/device_manager/device_manager.cpp
> > Lines 241-244 (patched)
> > <https://reviews.apache.org/r/75145/diff/2/?file=2292495#file2292495line241>
> >
> >     isn't this already covered by the Result being none?

No. if the file does not exist, protobuf::read will return an error when it 
tried to get a file descriptor. We don't really want to return Error if the 
file does not exist, so we handle this case separately.


> On Aug. 7, 2024, 12:44 a.m., Benjamin Mahler wrote:
> > src/slave/containerizer/device_manager/device_manager.cpp
> > Lines 321-324 (patched)
> > <https://reviews.apache.org/r/75145/diff/2/?file=2292495#file2292495line321>
> >
> >     can you not leverage the commit function that was already added?
> >     
> >     if you don't want to re-checkpoint, then maybe we should have the 
> > caller decide when to:
> >     
> >     checkpoint()
> >     reconfigureEbfPrograms()
> >     
> >     and here we only do the latter?

Sure, we can add arguments on the commit_device_access_changes function for the 
caller to determine if they want to checkpoint, reconfigure ebpf programs, or 
both?


> On Aug. 7, 2024, 12:44 a.m., Benjamin Mahler wrote:
> > src/slave/containerizer/device_manager/device_manager.cpp
> > Lines 344-349 (patched)
> > <https://reviews.apache.org/r/75145/diff/2/?file=2292495#file2292495line344>
> >
> >     hm.. this case is really a complete failure?

I suppose we can just log warnings here and return Nothing() afterwards... 
similar to how we return Nothing if the state file was not found?


> On Aug. 7, 2024, 12:44 a.m., Benjamin Mahler wrote:
> > src/slave/containerizer/device_manager/device_manager.cpp
> > Line 232 (original), 360 (patched)
> > <https://reviews.apache.org/r/75145/diff/2/?file=2292495#file2292495line360>
> >
> >     put this in the earlier patch?

good callout, somehow this slipped through. moved to prior patch


- Jason


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75145/#review226779
-----------------------------------------------------------


On Aug. 6, 2024, 8:11 p.m., Jason Zhou wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75145/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2024, 8:11 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We currently do not have any method of recovering the device access
> states when the cgroups 2 isolator is atempting to recover containers.
> 
> We add a recovery state here that makes use of the protobuf checkpoint
> files to ensure that the previous device accesses of cgroups can be
> restored. It will be used by the cgroups 2 isolator.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/device_manager/device_manager.hpp 
> 853350f70e12b992ef311a35c509a5dce8f2301a 
>   src/slave/containerizer/device_manager/device_manager.cpp 
> e613323dc47a7980984426d37b6fc5cfc52dffe0 
>   src/tests/device_manager_tests.cpp c4e9b8c58282b8d57b5ce88fefddd34c8ea30c77 
> 
> 
> Diff: https://reviews.apache.org/r/75145/diff/2/
> 
> 
> Testing
> -------
> 
> Test added to check the functionality of recover() with container state. 
> Tests pass, and the state was recovered.
> 
> 
> Thanks,
> 
> Jason Zhou
> 
>

Reply via email to