> On Sept. 18, 2016, 11:51 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1794-1800
> > <https://reviews.apache.org/r/51407/diff/6/?file=1501288#file1501288line1794>
> >
> >     Hum, should we do the deletion here if we don't even sure the processes 
> > are killed properly?
> >     
> >     I think my suggestion on RuntimePath for container is:
> >     1) we create the RuntimePath as the first thing in 'launch', even 
> > before we call any provisioner/isolator functions.
> >     2) we checkpoint the pid right after fork
> >     3) we delete the RuntimePath after the destroy is successful.
> >     
> >     The invariants we have if using the above way are:
> >     1) If RuntimePath exists, we know that provisioner/isolator prepare 
> > might be called, so cleanup is necessary during recovery.
> >     2) If RuntimePath does not exist, we know that all cleanups have been 
> > done properly and we no longer need to worry about cleanup.
> >     3) If pid file exists, we know that the process has been forked.
> >     4) If the pid file does not exist, we may or maynot have process being 
> > forked.
> >     
> >     For the upgrade situation, some checkpointed containers or orphans may 
> > not have RuntimePath. In such case, we should not create RuntimePath (i.e., 
> > do not checkpoint pid or exit status) in order to maintain the above 
> > invariant. So for old containers launched by previous version of agent, 
> > there will be no RuntimePath for it at any time (this is another invariant).
> >     
> >     It's likely that a container has RuntimePath, but launcher does not 
> > know about it. This is the case where launcher->destroy has been called 
> > (and being successful), but agent crashes before removing RuntimePath.
> >     
> >     It's also likely that a container does not have RuntimePath, but 
> > launcher knows about it. This is the legacy container case.

Hum, I don't see all the comments here are addressed? Especially the following:

> 1) we create the RuntimePath as the first thing in 'launch', even before we 
> call any provisioner/isolator functions

> For the upgrade situation, some checkpointed containers or orphans may not 
> have RuntimePath. In such case, we should not create RuntimePath (i.e., do 
> not checkpoint pid or exit status) in order to maintain the above invariant.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/#review149406
-----------------------------------------------------------


On Sept. 23, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51407/
> -----------------------------------------------------------
> 
> (Updated Sept. 23, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6204
>     https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This includes checkpointing both the container pid and the status of
> the container upon exit. This also includes an update to tests to
> account for new 'init' process semantics in a container. That is, the
> name of the init process of the container is now "mesos-containerizer"
> not "sh".
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 16f9e3e92e90fe7f8a0ebd24e567800e1f285bc9 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
>   src/tests/containerizer/isolator_tests.cpp 
> b4d25e57df7f0e157769c9ae4f7847657c505e78 
> 
> Diff: https://reviews.apache.org/r/51407/diff/
> 
> 
> Testing
> -------
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>

Reply via email to