> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weird that the performance of
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> > does not improve much for `addSlave`. We need to check why the
> > `addSlave` time is the same. Without the fix, `addSlave` calls `allocate`
> > for each agent, but with the fix only one `allocate` is called....
> >
> > ```
> > without fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> >
> > With fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
>
> Guangya Liu wrote:
>     Jacob, I did more testing with the code from Aug 23, when I posted some
>     results in this RR, and found that the test result is different. I did the
>     following to get the Aug 23 code.
>
>     ```
>     LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 2f78a440ef4201c5b11fb92c225694e84a60369c
>
>     LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
>     commit 2f78a440ef4201c5b11fb92c225694e84a60369c
>     Author: Gilbert Song <[email protected]>
>     Date:   Mon Aug 22 13:00:58 2016 -0700
>
>         Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
>
>         Review: https://reviews.apache.org/r/51271/
>     ```
>
>     The test result still seems the same as now (without your patch, using the
>     code from Aug 23):
>
>     ```
>     [==========] Running 1 test from 1 test case.
>     [----------] Global test environment set-up.
>     [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
>     [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
>     Using 1000 agents and 6000 frameworks
>     Added 6000 frameworks in 144272us
>     Added 1000 agents in 43.107001secs
>     ```
>
>     But anyway, I think that we need to find out why the performance of
>     `addSlave` was not improved by your patch.
>
> Jacob Janco wrote:
>     Yes agreed, per our Slack discussions, I'll look into this. Thanks for
>     posting the followup.
>
> Benjamin Mahler wrote:
>     `addSlave()` is asynchronous and we do not wait for all of the
>     `addSlave()` futures to complete, so any speedup in `addSlave()` will only
>     affect the next caller that waits for a result from the allocator.
>
> Benjamin Mahler wrote:
>     Ah I missed that we do a `Clock::settle()`, nevermind :)
>
> Guangya Liu wrote:
>     Some thoughts on why `addSlave` does not improve much...
>
>     Without Jacob's patch, the logic would be:
>
>     ```
>     addSlave -> allocate the single slave
>     addSlave -> allocate the single slave
>     addSlave -> allocate the single slave
>     ...
>     addSlave -> allocate the single slave
>     ```
>
>     With Jacob's patch, the logic would be:
>
>     ```
>     addSlave
>     addSlave
>     addSlave
>     ...
>     addSlave -> allocate for **all** of the slaves
>     ```
>
>     The time taken by `allocate a single slave N times` and by `allocate N
>     slaves in one allocate` request should not differ much: the only
>     difference is that one loops over the event queue while the other loops
>     inside the allocator. That's why there is not much performance change
>     for this case.
>
>     But the patch will have a large impact when adding frameworks, or on other
>     allocator events that call `allocate(slaves)`. One proposal is to add new
>     benchmark test cases with the following logic: without Jacob's patch, each
>     `addFramework` operation calls `allocate(slaves)`, but with Jacob's patch
>     `allocate(slaves)` is called only once.
>
>     ```
>     1) Add slaves first
>     2) Add frameworks
>     ```
>
>     We may get some performance improvement with the above case.
>
>     Currently, all of the benchmark tests use
>
>     ```
>     1) Add frameworks
>     2) Add agents
>     ```
>
>     That's why there is not much performance improvement...
>
> Jacob Janco wrote:
>     This makes sense Guangya, I'm in the process of creating a minimal
>     benchmark adding a set of slaves then adding frameworks. I'll post here if
>     the results are interesting.
>
> Guangya Liu wrote:
>     Jacob, just FYI, your fix does help a lot for the case I listed above!
>
>     A new simple benchmark test.
>
>     ```
>     TEST_P(HierarchicalAllocator_BENCHMARK_Test, AddSlaveAndFrameworks)
>     {
>       size_t slaveCount = std::tr1::get<0>(GetParam());
>       size_t frameworkCount = std::tr1::get<1>(GetParam());
>
>       // Pause the clock because we want to manually drive the allocations.
>       Clock::pause();
>
>       struct OfferedResources {
>         FrameworkID frameworkId;
>         SlaveID slaveId;
>         Resources resources;
>       };
>
>       vector<OfferedResources> offers;
>
>       auto offerCallback = [&offers](
>           const FrameworkID& frameworkId,
>           const hashmap<SlaveID, Resources>& resources_)
>       {
>         foreach (auto resources, resources_) {
>           offers.push_back(
>               OfferedResources{frameworkId, resources.first, resources.second});
>         }
>       };
>
>       cout << "Using " << slaveCount << " agents and "
>            << frameworkCount << " frameworks" << endl;
>
>       vector<SlaveInfo> slaves;
>       slaves.reserve(slaveCount);
>
>       vector<FrameworkInfo> frameworks;
>       frameworks.reserve(frameworkCount);
>
>       // Create the framework infos up front: their IDs are used as keys in
>       // `used` below, before the frameworks are added to the allocator.
>       for (size_t i = 0; i < frameworkCount; i++) {
>         frameworks.push_back(createFrameworkInfo("*"));
>       }
>
>       initialize(master::Flags(), offerCallback);
>
>       Stopwatch watch;
>
>       const Resources agentResources = Resources::parse(
>           "cpus:24;mem:4096;disk:4096;ports:[31000-32000]").get();
>
>       // Each agent has a portion of its resources allocated to a single
>       // framework. We round-robin through the frameworks when allocating.
>       Resources allocation =
>         Resources::parse("cpus:16;mem:2014;disk:1024").get();
>
>       Try<::mesos::Value::Ranges> ranges =
>         fragment(createRange(31000, 32000), 16);
>       ASSERT_SOME(ranges);
>       ASSERT_EQ(16, ranges->range_size());
>
>       allocation += createPorts(ranges.get());
>
>       watch.start();
>
>       for (size_t i = 0; i < slaveCount; i++) {
>         slaves.push_back(createSlaveInfo(agentResources));
>
>         // Add some used resources on each slave. Let's say there are 16
>         // tasks, each is allocated 1 cpu and a random port from the port
>         // range.
>         hashmap<FrameworkID, Resources> used;
>         used[frameworks[i % frameworkCount].id()] = allocation;
>         allocator->addSlave(
>             slaves[i].id(), slaves[i], None(), slaves[i].resources(), used);
>       }
>
>       // Wait for all the `addSlave` operations to be processed.
>       Clock::settle();
>
>       watch.stop();
>
>       cout << "Added " << slaveCount << " agents in "
>            << watch.elapsed() << endl;
>
>       watch.start();
>
>       for (size_t i = 0; i < frameworkCount; i++) {
>         allocator->addFramework(frameworks[i].id(), frameworks[i], {});
>       }
>
>       // Wait for all the `addFramework` operations to be processed.
>       Clock::settle();
>
>       watch.stop();
>
>       cout << "Added " << frameworkCount << " frameworks in "
>            << watch.elapsed() << endl;
>     }
>     ```
>
>     With your fix:
>
>     ```
>     ./bin/mesos-tests.sh --benchmark --gtest_filter="SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddSlaveAndFrameworks/7"
>     [==========] Running 1 test from 1 test case.
>     [----------] Global test environment set-up.
>     [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
>     [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddSlaveAndFrameworks/7
>     Using 1000 agents and 6000 frameworks
>     Added 1000 agents in 141352us
>     Added 6000 frameworks in 12.710913secs
>     [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddSlaveAndFrameworks/7 (12902 ms)
>     [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (12902 ms total)
>
>     [----------] Global test environment tear-down
>     [==========] 1 test from 1 test case ran. (12912 ms total)
>     [  PASSED  ] 1 test.
>     ```
>
>     Without your fix, I did not wait for the test run to finish: I waited
>     for almost 10 minutes and the test had still not finished....

Beat me to it - thanks for the verification!


- Jacob


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
-----------------------------------------------------------


On Sept. 23, 2016, 4:32 p.m., Jacob Janco wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> -----------------------------------------------------------
>
> (Updated Sept. 23, 2016, 4:32 p.m.)
>
>
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus
> Ma, and Jiang Yan Xu.
>
>
> Bugs: MESOS-3157
>     https://issues.apache.org/jira/browse/MESOS-3157
>
>
> Repository: mesos
>
>
> Description
> -------
>
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
>
>
> Diffs
> -----
>
>   src/master/allocator/mesos/hierarchical.hpp 2c31471ee0f5d6836393bf87ff9ecfd8df835013
>   src/master/allocator/mesos/hierarchical.cpp 2d56bd011f2c87c67a02d0ae467a4a537d36867e
>
> Diff: https://reviews.apache.org/r/51027/diff/
>
>
> Testing
> -------
>
> make check
>
> note: check without filters depends on https://reviews.apache.org/r/51028
>
> With new benchmark https://reviews.apache.org/r/49617:
> Sample output without 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 10000 agents in 3.21345353333333mins
> allocator settled after 1.61236038333333mins
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (290578 ms)
>
> Sample output with 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 10000 agents in 3.22860541666667mins
> allocator settled after 25.525654secs
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (220137 ms)
>
>
> Thanks,
>
> Jacob Janco
>
>
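
For readers following the thread, the two bullets in the Description (dispatch `allocate()` only when none is pending; accumulate candidates until the queued allocation is processed) can be illustrated with a toy model. This is only a sketch in plain C++ with a simulated event queue, not the code from the patch: `ToyAllocator`, `trigger`, `settle`, `Stats`, and `simulate` are hypothetical names invented for the illustration, and agents are reduced to integer IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the allocator; none of these names are Mesos APIs.
class ToyAllocator
{
public:
  explicit ToyAllocator(bool coalesce) : coalesce(coalesce) {}

  // Called by any event (e.g. an added framework) that wants `agents`
  // to be considered for allocation.
  void trigger(const std::vector<int>& agents)
  {
    if (!coalesce) {
      // Uncoalesced: every event enqueues its own allocation run.
      queue.push_back([this, agents]() { run(agents.size()); });
      return;
    }

    // Coalesced: accumulate the candidates ...
    candidates.insert(agents.begin(), agents.end());

    // ... and enqueue an allocation only if none is pending.
    if (!pending) {
      pending = true;
      queue.push_back([this]() {
        pending = false;
        std::set<int> batch;
        batch.swap(candidates); // Candidates are cleared only when processed.
        run(batch.size());
      });
    }
  }

  // Drains the simulated event queue (a stand-in for `Clock::settle()`).
  void settle()
  {
    while (!queue.empty()) {
      std::function<void()> event = std::move(queue.front());
      queue.pop_front();
      event();
    }
  }

  std::size_t runs = 0;          // Number of allocation runs executed.
  std::size_t agentsVisited = 0; // Agents looped over, summed over all runs.

private:
  void run(std::size_t agentCount)
  {
    ++runs;
    agentsVisited += agentCount;
  }

  const bool coalesce;
  bool pending = false;
  std::set<int> candidates;
  std::deque<std::function<void()>> queue;
};

struct Stats
{
  std::size_t runs;
  std::size_t agentsVisited;
};

// Fires `eventCount` events that each ask for all `agentCount` agents to be
// allocated, then drains the queue and reports how much work was done.
Stats simulate(bool coalesce, int agentCount, int eventCount)
{
  std::vector<int> agents;
  for (int i = 0; i < agentCount; i++) {
    agents.push_back(i);
  }

  ToyAllocator allocator(coalesce);
  for (int i = 0; i < eventCount; i++) {
    allocator.trigger(agents);
  }
  allocator.settle();

  return Stats{allocator.runs, allocator.agentsVisited};
}
```

With 10 agents and 60 events, `simulate(false, 10, 60)` performs 60 runs and visits 600 agents, while `simulate(true, 10, 60)` performs 1 run and visits 10. That is the "add slaves first, then add frameworks" shape Guangya describes. In a variant where each event carries one distinct agent, both modes would visit the same total number of agents, which is consistent with the `addSlave` timings barely changing.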
