> On Nov. 9, 2017, 1:52 p.m., Jan Schlicht wrote:
> > src/slave/slave.hpp
> > Lines 538 (patched)
> > <https://reviews.apache.org/r/63555/diff/1/?file=1881011#file1881011line538>
> >
> > What's the motivation for this `extra` parameter? It isn't used
> > anywhere, so we should probably remove it.
>
> Chun-Hung Hsiao wrote:
> It's for publishing resources for launching new executors, and it is used in
> line 2907 of `slave.cpp`.

Ah, didn't see that, thanks!


> On Nov. 9, 2017, 1:52 p.m., Jan Schlicht wrote:
> > src/slave/slave.cpp
> > Lines 6816-6822 (patched)
> > <https://reviews.apache.org/r/63555/diff/1/?file=1881012#file1881012line6816>
> >
> > So this will try to publish all RP resources of all executors of all
> > frameworks? Or am I missing something here? I'd expect that only the RP
> > resources of the task/executor that's about to get started should be
> > published. Hence `resourceProviderManager->publish(info.id(),
> > executor->allocatedResources())` should be enough.
>
> Chun-Hung Hsiao wrote:
> For resources with unique identifiers (such as storage volumes in SLRP),
> it is enough to publish the resources about to be used by the executor.
> However, for resources without any identifier, such as the default
> resources of an agent (CPUs, memory, disk), since we don't unpublish,
> there's no way for resource providers to know what the allocations are.
> Imagine that the resource provider keeps receiving "publish 2 CPUs":
> should it keep publishing 2 new CPUs every time? This is the motivation
> for giving `PUBLISH` "ensure-all" semantics. Given these semantics, we
> need to publish all RP resources for all executors.
>
> An alternative is to make the resource provider manager aware of
> executors, so that it can keep track of the resources used by each
> executor and compute what the total resources should be. But then 1) this
> is for the agent only, so I think it is not appropriate if we want to have
> the same manager code in both agent and master; 2) the agent needs to
> notify the manager when an executor has finished.

Thanks for the explanation! I don't understand, though, why a resource provider should be asked to "publish 2 CPUs". In my understanding, "publish" is only meant for resource provider resources, thus agent resources shouldn't be part of these operations. Or are there use cases where a resource provider might need to know the agent resources of a task?
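
To make the "ensure-all" semantics discussed above concrete, here is a minimal C++ sketch. It is a toy model under stated assumptions, not Mesos code: `ResourceTotals`, `ToyResourceProvider`, and `publishAllocated` are hypothetical names, and resources are reduced to named scalar amounts. The idea it illustrates is that every publish call carries the complete desired total (running executors plus `extra`), so repeating the call does not allocate anything new.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for Mesos `Resources`: resource name -> scalar amount.
using ResourceTotals = std::map<std::string, double>;

// Adds every amount in `rhs` onto `lhs`.
inline void add(ResourceTotals& lhs, const ResourceTotals& rhs) {
  for (const auto& entry : rhs) {
    assert(entry.second >= 0.0);  // Amounts are assumed non-negative.
    lhs[entry.first] += entry.second;
  }
}

// Hypothetical provider with "ensure-all" semantics: each publish call
// states the complete set that must be available and replaces whatever
// was published before, instead of adding to it.
class ToyResourceProvider {
public:
  void publish(const ResourceTotals& total) { published_ = total; }
  const ResourceTotals& published() const { return published_; }

private:
  ResourceTotals published_;
};

// Sums the resources of all running executors plus the `extra` resources
// of the executor about to launch, then publishes the total.
inline ResourceTotals publishAllocated(
    ToyResourceProvider& provider,
    const std::vector<ResourceTotals>& running,
    const ResourceTotals& extra) {
  ResourceTotals total;
  for (const auto& resources : running) {
    add(total, resources);
  }
  add(total, extra);
  provider.publish(total);
  return total;
}
```

With these semantics, a provider that receives the same total twice ends up in the same state, which is why unidentified resources such as CPUs do not need an explicit unpublish step in this model.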

- Jan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63555/#review190572
-----------------------------------------------------------


On Nov. 4, 2017, 2:55 a.m., Chun-Hung Hsiao wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63555/
> -----------------------------------------------------------
>
> (Updated Nov. 4, 2017, 2:55 a.m.)
>
>
> Review request for mesos, Gilbert Song, Jie Yu, Joseph Wu, and Jan Schlicht.
>
>
> Bugs: MESOS-7550
> https://issues.apache.org/jira/browse/MESOS-7550
>
>
> Repository: mesos
>
>
> Description
> -------
>
> `Slave::publishAllocatedResources()` will compute the total allocated
> resources for all currently running executor containers, and takes an
> `extra` argument for resources that will be used by the executor that
> is about to launch, then sums them up and asks the resource provider
> manager to publish the resources.
>
>
> Diffs
> -----
>
> src/slave/slave.hpp df1b0205124555dcb6a0efa5c237f5e77fa2bdf7
> src/slave/slave.cpp 337083dbe60bba2d3773b785bdceeaf0b8fcd070
>
>
> Diff: https://reviews.apache.org/r/63555/diff/1/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Chun-Hung Hsiao
>
>
