> On September 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weird that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`. We need to check why the 
> > `addSlave` time is the same. Without the fix, `addSlave` calls `allocate` 
> > for each agent, but with the fix, only one `allocate` is called.
> > 
> > ```
> > without fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==========] Running 1 test from 1 test case.
> > [----------] Global test environment set-up.
> > [----------] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN      ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
>     Jacob, I ran more tests with the code from Aug 23, when I posted some 
> results in this RR, and found that the test result is different. I did the 
> following to get the Aug 23 code.
>     
>     ```
>     LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
>     
>     LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
>     commit 2f78a440ef4201c5b11fb92c225694e84a60369c
>     Author: Gilbert Song <[email protected]>
>     Date:   Mon Aug 22 13:00:58 2016 -0700
>     
>         Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
>     
>         Review: https://reviews.apache.org/r/51271/
>     ```
>     
>     The test result still seems the same as now (without your patch, using 
> the code from Aug 23):
>     
>     ```
>     [==========] Running 1 test from 1 test case.
>     [----------] Global test environment set-up.
>     [----------] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
>     [ RUN      ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
>     Using 1000 agents and 6000 frameworks
>     Added 6000 frameworks in 144272us
>     Added 1000 agents in 43.107001secs
>     ```
>     
>     But anyway, I think that we need to find out why the performance of 
> `addSlave` was not improved by your patch.
> 
> Jacob Janco wrote:
>     Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
>     `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.
> 
> Benjamin Mahler wrote:
>     Ah I missed that we do a `Clock::settle()`, nevermind :)

Some thoughts on why `addSlave` does not improve much...

Without Jacob's patch, the logic would be:

```
addSlave -> allocate the single slave
addSlave -> allocate the single slave
addSlave -> allocate the single slave
...
addSlave -> allocate the single slave
```

With Jacob's patch, the logic would be:

```
addSlave
addSlave
addSlave
...
addSlave -> allocate for **all** of the slaves
```

The time taken by `allocate a single slave N times` and by `allocate N slaves 
in one allocate` request should not differ much. The only difference is that 
one loops over the event queue while the other loops inside the allocator, 
which is why the benchmark shows so little change.
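
The argument above can be checked with a toy cost model. This is only a 
sketch, not Mesos code: `perAgentCost` and `batchedCost` are hypothetical 
helpers, and counting one unit of work per (framework, agent) pair is an 
assumption about where the time goes.

```cpp
#include <cassert>
#include <cstddef>

// Toy cost model (hypothetical, not Mesos code). One unit of work is the
// allocator considering one (framework, agent) pair.

// Without the patch: each addSlave triggers an allocate() for that agent.
std::size_t perAgentCost(std::size_t agents, std::size_t frameworks)
{
  assert(frameworks > 0);

  std::size_t units = 0;
  for (std::size_t a = 0; a < agents; ++a) {
    // N separate allocations, each walking every framework for one agent.
    units += frameworks;
  }
  return units;
}

// With the patch: a single allocate() covers all accumulated agents.
std::size_t batchedCost(std::size_t agents, std::size_t frameworks)
{
  assert(frameworks > 0);

  std::size_t units = 0;
  // One allocation that walks every agent and every framework.
  for (std::size_t a = 0; a < agents; ++a) {
    for (std::size_t f = 0; f < frameworks; ++f) {
      ++units;
    }
  }
  return units;
}
```

Under this model both paths do `agents * frameworks` units of work, so 
batching the `addSlave` allocations only moves the loop from the event queue 
into the allocator.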

But this will matter a lot when adding frameworks, or for other allocator 
events that call `allocate(slaves)`. One proposal is to add new benchmark 
test cases that follow the order below. Without Jacob's patch, each 
`addFramework` operation triggers a call to `allocate(slaves)`; with Jacob's 
patch, `allocate(slaves)` is called only once.

```
1) Add slaves first
2) Add frameworks
```

We may see some performance improvement in the above case.
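
To make the expected difference concrete, here is a small simulation of the 
allocator's event queue. It is a sketch under assumptions, not the patch 
itself: `Event` and `countAllocations` are hypothetical names, and it assumes 
every `addFramework` call is already enqueued before the allocator processes 
the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-ins for the events handled by the allocator process.
enum class Event { AddFramework, Allocate };

// Returns how many full allocate(slaves) passes run for `frameworks`
// addFramework events. With `coalesce` set, a triggered allocation is
// enqueued only when no allocation is already pending.
std::size_t countAllocations(std::size_t frameworks, bool coalesce)
{
  // Assumption: all addFramework events are queued up front.
  std::deque<Event> queue(frameworks, Event::AddFramework);

  bool pending = false;  // True while an Allocate event is in the queue.
  std::size_t runs = 0;

  while (!queue.empty()) {
    Event event = queue.front();
    queue.pop_front();

    if (event == Event::AddFramework) {
      if (!coalesce || !pending) {
        queue.push_back(Event::Allocate);
        pending = true;
      }
    } else {
      // One pass over all agents; the pending flag is cleared only
      // when the enqueued allocation is actually processed.
      ++runs;
      pending = false;
    }
  }
  return runs;
}
```

With 6000 frameworks, this model runs 6000 passes over all agents without 
coalescing and a single pass with it, which is the gap the proposed benchmark 
order should expose.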

Currently, all of the benchmark tests use this order:

```
1) Add frameworks
2) Add agents
```

That's why there is not much performance improvement...


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
-----------------------------------------------------------


On September 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> -----------------------------------------------------------
> 
> (Updated September 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
>     https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN      ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 10000 agents in 3.21345353333333mins
> allocator settled after  1.61236038333333mins
> [       OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN      ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 10000 agents in 3.22860541666667mins
> allocator settled after  25.525654secs
> [       OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>
