-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72831/#review221819
-----------------------------------------------------------

Ship it!


Ship It!

- Greg Mann


On Sept. 8, 2020, 11:48 p.m., Benjamin Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72831/
> -----------------------------------------------------------
>
> (Updated Sept. 8, 2020, 11:48 p.m.)
>
>
> Review request for mesos and Greg Mann.
>
>
> Bugs: MESOS-9609
>     https://issues.apache.org/jira/browse/MESOS-9609
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Per MESOS-9609, it's possible for the master to encounter a CHECK
> failure during agent removal in the following situation:
>
>   1. Given a framework with checkpoint == false, with only
>      executor(s) (no tasks) running on an agent:
>   2. When this agent disconnects from the master,
>      Master::removeFramework(Slave*, Framework*) removes the
>      tasks and executors. However, when there are no tasks, this
>      function will accidentally insert an entry into
>      Master::Slave::tasks! (Due to the [] operator usage)
>   3. Now if the framework is removed, we have an entry in
>      Slave::tasks, for which there is no corresponding framework.
>   4. When the agent is removed, we have a CHECK failure given
>      we can't find the framework.
>
> This fixes the issue by avoiding the accidental insertion.
>
>
> Diffs
> -----
>
>   src/master/master.cpp 02723296e569fac9d553b1494a5ca7daa6ef9aa4
>
>
> Diff: https://reviews.apache.org/r/72831/diff/1/
>
>
> Testing
> -------
>
> See subsequent patch.
>
>
> Thanks,
>
> Benjamin Mahler
>
>
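
For readers following the description above, here is a minimal sketch of
the "[] operator" pitfall it refers to. This is not the code from
src/master/master.cpp: std::unordered_map stands in for the master's
map type, and TaskMap, TasksByFramework, countTasksWithSubscript,
countTasksWithFind and demo are hypothetical names made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the master's bookkeeping; not the Mesos types.
using TaskMap = std::unordered_map<std::string, std::string>;
using TasksByFramework = std::unordered_map<std::string, TaskMap>;

// Pitfall: operator[] default-constructs and inserts an empty TaskMap when
// frameworkId is absent, so a read-only loop leaves a stale key behind.
std::size_t countTasksWithSubscript(
    TasksByFramework& tasks, const std::string& frameworkId)
{
  std::size_t count = 0;
  for (const auto& entry : tasks[frameworkId]) {
    (void)entry;
    ++count;
  }
  return count;
}

// Safer pattern: find() never modifies the map.
std::size_t countTasksWithFind(
    const TasksByFramework& tasks, const std::string& frameworkId)
{
  auto it = tasks.find(frameworkId);
  if (it == tasks.end()) {
    return 0;
  }
  return it->second.size();
}

// Usage sketch: compare the map contents after each lookup style.
void demo()
{
  TasksByFramework viaSubscript;
  countTasksWithSubscript(viaSubscript, "framework-1");
  assert(viaSubscript.count("framework-1") == 1); // Accidental insertion.

  TasksByFramework viaFind;
  countTasksWithFind(viaFind, "framework-1");
  assert(viaFind.empty()); // Map left untouched.
}
```

In the scenario from the description, the stale empty entry is what a
later lookup trips over once the framework itself is gone. The find()
form is one common way to avoid the insertion; the actual patch may do
it differently.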
