On Mon, May 26, 2014 at 1:14 PM, erik quanstrom <quans...@quanstro.net> wrote:
> so, i've done a little bit more work characterizing the performance
> of the scheduler correctness changes, and i now have some understanding
> of why e.g. ping times are a bit slower.
>
> the old code essentially let processor 0 spin in runproc while the other
> processors called halt.  the new code uses monmwait to wait for a change
> on all processors.
> this has some significant impacts on performance and power use.  for example,
> on my test box with 4c/8t:
>
>         spin/halt       monmwait        spin/monmwait
> ping    8µs             14µs            8µs             # ip/ping -n10 $sysname
> mk      6.26s           3.98s           3.80s           # make nix kernel
> fans    audible         silent          audible
> δpower  -               -24w            0               # resolution = .1A = 12w @ 120v
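>
> roughly, the two idle strategies compare like this.  this is only a
> sketch, ignoring locks and memory barriers: Schedq, nrdy, halt(), and
> monmwait()'s exact signature here are stand-ins for whatever the kernel
> really uses (m is the usual per-processor Mach pointer), and this is not
> the actual runproc().
>
>     typedef struct Schedq Schedq;
>     struct Schedq {
>         int nrdy;                    /* count of ready processes */
>     };
>
>     /* old idle path: mach 0 busy-waits on the ready count; the other
>      * machs halt and are woken by the next interrupt */
>     static void
>     idleold(Schedq *rq)
>     {
>         while(rq->nrdy == 0){
>             if(m->machno == 0)
>                 continue;            /* mach 0 just spins */
>             halt();                  /* hlt; woken by the next interrupt */
>         }
>     }
>
>     /* new idle path: every mach arms MONITOR on the ready count and
>      * mwaits for it to change; monmwait(&v, old) is assumed to return
>      * once v no longer equals old */
>     static void
>     idlenew(Schedq *rq)
>     {
>         int n;
>
>         while((n = rq->nrdy) == 0)
>             monmwait(&rq->nrdy, n);
>     }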
>
> this seems to indicate that the latency is all in runproc(): not waiting
> for things to be ready, and instead assuming they will be, gives a big
> performance boost.
>
> (the third column, spinning on mach 0 plus monmwait on the others, was
> done to tell whether monmwait itself has high latency.)
>
> i'd really be interested to see what this does on 24c/48t machines.  something
> tells me the performance impacts would be huge, and different.
>
> - erik
>
> ---
> ps. hzsched in the distribution is 10% off for HZ=100, since
> schedticks = m->ticks + HZ/10 and delaysched tests for >, not the
> expected >=.
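>
> concretely, with HZ = 100 the intended quantum is HZ/10 = 10 ticks.
> a toy program (not the hzsched source; it just mimics the comparison)
> shows the extra tick the strict > buys you:
>
>     #include <stdio.h>
>
>     enum { HZ = 100 };
>
>     int
>     main(void)
>     {
>         unsigned long t, start = 0, schedticks = start + HZ/10;
>
>         /* tick until the shipped test fires */
>         for(t = start; !(t > schedticks); t++)
>             ;
>         printf("using > : %lu-tick quantum\n", t - start);   /* 11 */
>
>         /* tick until the expected test fires */
>         for(t = start; !(t >= schedticks); t++)
>             ;
>         printf("using >=: %lu-tick quantum\n", t - start);   /* 10 */
>         return 0;
>     }
>
> one extra tick on a ten-tick quantum is where the 10% comes from.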
>

Nice. Excited to see how a cleaned up + simplified runproc() and the
per-Mach queues could also change things. Any reason why the ping test
w/ monmwait wasn't consistent with the performance improvement in
other areas?
