> On Oct. 19, 2017, 6:38 p.m., Benjamin Mahler wrote:
> > Thanks Yan! I will dig in soon.
> > 
> > Just some quick questions:
> > 
> > (1) I thought during the meeting you said it was taking a minute, but
> > looking at all the benchmark timings they're all under a second? Is it only
> > the benchmark setup that's expensive here?
> > (2) Is this with the lock free event & run queues? If not, how much do they
> > help?
> > (3) As an aside, it has come up before, but it would be useful to be able
> > to force the messages to go through the remote stack rather than the local
> > stack. No need to think about this yet, but just something to keep in mind
> > as not being accurate in this benchmark.
1) Yeah, looks like it. I used to include the setup time, so it was large.
2) Yeah, I used `--enable-optimize --enable-lock-free-run-queue --enable-lock-free-event-queue --enable-last-in-first-out-fixed-size-semaphore`. I could compare with the perf without them.
3) Right, I think we should keep that in mind, and we should have tests that cover the remote stack. For the case here I thought it would be a simple and good-enough start, since the local stack already covers the proto (de)serialization and the rest of the libprocess optimizations that we have recently improved.


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63174/#review188799
-----------------------------------------------------------


On Oct. 19, 2017, 4:28 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63174/
> -----------------------------------------------------------
> 
> (Updated Oct. 19, 2017, 4:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Dmitry Zhuk, and Ilya Pronin.
> 
> 
> Bugs: MESOS-8098
>     https://issues.apache.org/jira/browse/MESOS-8098
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The current benchmark is very simple (no framework involvement and no
> agent retries), but it's possible to add a number of others, so I am
> creating a new file for them.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 936bc49ddfca03b9278ab11b6d317f3ff635cb00
>   src/tests/CMakeLists.txt 386e0473c93d0a993248c7818067071d0c761c76
>   src/tests/master_benchmarks.cpp PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/63174/diff/1/
> 
> 
> Testing
> -------
> 
> Benchmark based off
> https://github.com/apache/mesos/commit/41193181d6b75eeecae2729bf98007d9318e351a
> (close to current HEAD).
> 
> ```
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 45.075488ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (48126 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 14.172361ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (45979 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 413.508328ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (49487 ms)
> [----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (143596 ms total)
> 
> ...
> 
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 32.787363ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (48266 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 19.735003ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (46169 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 321.267267ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (51550 ms)
> [----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (145987 ms total)
> ```
> 
> Benchmark based off
> https://github.com/apache/mesos/commit/d9c90bf1d9c8b3a7dcc47be0cb773efff57cfb9d
> (before https://issues.apache.org/jira/browse/MESOS-7713 was merged)
> ```
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 85.800335ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (59247 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 35.342066ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (93662 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 798.738642ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (116078 ms)
> [----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (268987 ms total)
> 
> ...
> 
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 500000 running tasks and 500000 completed tasks in 66.270249ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (59925 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 1000000 running tasks and 0 completed tasks in 50.146349ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (88631 ms)
> [ RUN ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 20000 agents with a total of 1000000 running tasks and 0 completed tasks in 807.621964ms
> [ OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (109941 ms)
> [----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (258497 ms total)
> ```
> 
> The recent patches cut down the time by nearly 50%. These were built with
> `--enable-optimize`.
> 
> I can also get some flame graphs.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>
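
As a rough cross-check of the "nearly 50%" figure, the reported reregistration timings can be compared directly. The sketch below is illustrative only: the helper name `reduction` and the dictionary layout are made up for this note, and the numbers are copied from the first run of each benchmark output quoted above.

```python
# Reregistration times in milliseconds, copied from the first run of each
# benchmark output quoted above. Keys are the test parameter indices (/0, /1, /2).
before_ms = {0: 85.800335, 1: 35.342066, 2: 798.738642}  # commit d9c90bf, before MESOS-7713
after_ms = {0: 45.075488, 1: 14.172361, 2: 413.508328}   # commit 4119318, close to HEAD


def reduction(before: float, after: float) -> float:
    """Return the fraction of time saved going from `before` to `after`."""
    return 1.0 - after / before


# Hypothetical summary structure: fraction of time saved per test case.
reductions = {case: reduction(before_ms[case], after_ms[case]) for case in before_ms}

for case, saved in sorted(reductions.items()):
    print(f"case /{case}: {before_ms[case]:.1f} ms -> {after_ms[case]:.1f} ms "
          f"({saved:.1%} less time)")
```

Single runs vary noticeably (the second run of each build reports different numbers), so this is only a sanity check and not a substitute for repeated measurements.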
