> On Sept. 9, 2016, 4:21 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/provisioner/provisioner.cpp, lines 194-197
> > <https://reviews.apache.org/r/51402/diff/3/?file=1493283#file1493283line194>
> >
> > I realized that it's not sufficient to just pass in top level orphan
> > containers to provisioners/isolators. We also want to know about known
> > child containers for both checkpointed containers and orphan containers so
> > that provisioners/isolators can clean up unknown child containers.
> >
> > Consider the following case:
> > 1) containerizer launched a child container A/B under top level
> >    container A
> > 2) isolator prepare finishes for container A/B
> > 3) agent crashes before launcher fork is called
> > 4) agent recovers
> > 5) container A is checkpointed, thus considered alive
> > 6) however, provisioners/isolators need to clean up for container A/B as
> >    it's unknown to the launcher
> >
> > Therefore, I suggest we introduce a protobuf 'ContainerRecoverInfo' in
> > `include/mesos/slave/containerizer.proto`:
> >
> > ```
> > message ContainerRecoverInfo {
> >   repeated ContainerState checkpointed_containers;
> >   repeated ContainerID orphan_container_ids; // Deprecated. Top level orphans.
> >   repeated ContainerID known_container_ids; // All known containers, including child containers.
> > }
> > ```
> >
> > And both the Provisioner and Isolator recover interfaces will take this
> > protobuf.
>
> Gilbert Song wrote:
>     Thanks for the detailed explanation. Sorry, I did not see this comment
>     yesterday. In my local implementation, `ContainerRecoverInfo` is only for
>     isolator::recover(), since we can just change the provisioner::recover
>     interface to take only the `knownContainers` set without breaking other
>     parts. This reduces the complexity in provisioner::recover.

Sounds fine to me for now. Eventually, you'll use this protobuf, I think. It is
just easier to construct one protobuf and send it to both the provisioner and
the isolators.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51402/#review148300
-----------------------------------------------------------


On Sept. 7, 2016, 6:49 p.m., Gilbert Song wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51402/
> -----------------------------------------------------------
>
> (Updated Sept. 7, 2016, 6:49 p.m.)
>
>
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Jie Yu,
> Joseph Wu, and Kevin Klues.
>
>
> Bugs: MESOS-6067
>     https://issues.apache.org/jira/browse/MESOS-6067
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added nested container check in provisioner destroy.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 8e35ff49ec99a242e764095dcfbb541c5e41ec71
>
> Diff: https://reviews.apache.org/r/51402/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Gilbert Song
>
>
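
For reference, the crash scenario from the top of this thread (child container A/B finishes isolator prepare, but the agent crashes before the launcher forks it) can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual provisioner code: `ContainerId` is a plain string standing in for the real `ContainerID` protobuf, nested containers are written as paths such as "A/B", and `findUnknownContainers` is a hypothetical helper name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for ContainerID: a nested container is written
// as a path, e.g. "A/B" is child B of top level container A.
using ContainerId = std::string;

// Returns every container that has provisioner state left behind but is
// not in the set of known containers. These are the containers that a
// recover step would need to clean up.
std::vector<ContainerId> findUnknownContainers(
    const std::set<ContainerId>& provisioned,
    const std::set<ContainerId>& known)
{
  std::vector<ContainerId> unknown;

  for (const ContainerId& id : provisioned) {
    // A container with leftover state that nobody knows about must
    // have been prepared but never launched.
    if (known.count(id) == 0) {
      unknown.push_back(id);
    }
  }

  return unknown;
}
```

With provisioned state for both "A" and "A/B" and a known set containing only "A", the helper returns just "A/B". Passing only top level orphans could not express this, because A itself is alive and therefore never shows up as an orphan.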
