On Fri, Jan 28, 2022 at 10:06:04PM -0800, Jonathan Thornburg wrote:
> In <https://marc.info/?l=openbsd-misc&m=164212677602970&w=1> I wrote
> > I've just noticed something odd about the scheduling of processes with
> > varying 'nice' values (7.0-stable/amd64, GENERIC.MP): it appears that
> > processes with 'nice 20' are given more favorable scheduling than those
> > with 'nice 10', which is exactly the opposite of what I'd expect [[...]]
> 
> In <https://marc.info/?l=openbsd-misc&m=164214220808789&w=1>,
> Otto Moerbeek replied
> > Are your processes multithreaded?  Check with top -H.
> 
> I apologise for the long delay in followup (unrelated work crises).
> 
> No, they're not multithreaded -- they're all instances of a (the same)
> single-threaded "number-crunching" code written in C++ (compiled by
> clang 11.1.0 from ports).  Here's the first part of the output of
> 'top -H -s -i -s1' for another set of such processes I have running
> right now:
> 
> 398 threads: 4 running, 390 idle, 4 on processor                up 21:36
> CPU0:  0.0% user, 96.0% nice,  0.0% sys, 0.0% spin,  4.0% intr,  0.0% idle
> CPU1:  0.0% user,  100% nice,  0.0% sys, 0.0% spin,  0.0% intr,  0.0% idle
> CPU2:  1.0% user, 99.0% nice,  0.0% sys, 0.0% spin,  0.0% intr,  0.0% idle
> CPU3:  0.0% user,  100% nice,  0.0% sys, 0.0% spin,  0.0% intr,  0.0% idle
> Memory: Real: 2841M/8293M act/tot Free: 7195M Cache: 4179M Swap: 0K/34G
> 
>   PID     TID PRI NICE  SIZE   RES STATE     WAIT     TIME    CPU COMMAND
> 88761  356466  84   10   21M   24M onproc/3  -       16:36 99.02% smp-O3
> 87643  189282 104   20   39M   42M run/2     -       14:38 98.93% smp-O3
>  4015  151196 104   20   40M   43M onproc/0  -        4:47 51.27% smp-O3
> 92541  618295  84   10   22M   24M run/1     -        4:48 49.85% smp-O3
> 26221  169495  84   10   21M   24M onproc/1  -        9:55 49.17% smp-O3
>  7827  115940 104   20   39M   42M run/0     -       11:45 47.31% smp-O3
> 61507  342772   2    0   41M   87M sleep/0   poll     9:42  0.05% Xorg
> 61507  413182   2    0   41M   87M sleep/2   poll     0:29  0.05% Xorg
> 
> In this case I have 6 CPU-bound processes, 3 smaller ones started with
> 'nice -n 10 ...' and 3 larger ones started with 'nice -n 20', all running
> on a 4-core machine.  I would have expected the three nice-10 processes
> to get more CPU than the three nice-20 processes, but clearly that's not
> what's happening.
> 
> Looking at 'iostat 5' I see that I/O is pretty low (around 0.5 MB/s or
> less).
> 
> I wonder if NaN handling might be causing kernel traps which change
> the scheduling priority?
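The setup described above is easy to reproduce with a trivial busy loop
standing in for the single-threaded number-crunching jobs.  The real
smp-O3 code is not shown anywhere in the thread, so the program and the
compile/launch commands below are only an assumed sketch:

    // busyloop.cc - assumed stand-in for a single-threaded, CPU-bound
    // "number-crunching" job.  It burns CPU forever so the scheduler
    // sees a purely compute-bound process; kill it when done.
    int main()
    {
        volatile double x = 0.0;            // volatile: keep the work
        for (unsigned long i = 0;; ++i)     // infinite loop
            x += 1.0 / (1.0 + (i & 0xffff));
    }

Started along these lines, three instances run at nice 10 and three at
nice 20 on a 4-core machine, matching the situation in the quoted top
output:

    $ c++ -O2 -o busyloop busyloop.cc
    $ for i in 1 2 3; do nice -n 10 ./busyloop & done
    $ for i in 1 2 3; do nice -n 20 ./busyloop & done
    $ top -H -s 1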
I tried the same using a tight loop program, running multiple instances
with different NICE values, and I'm seeing similar things.

At the moment I *think* that the scheduling decision is per-CPU.  That
means that the processes that happen to land on a single CPU compete for
the time slices based on their NICE (or, actually, the computed dynamic
PRI based on NICE and other factors).  So if two CPU-bound processes with
the same NICE land on the same CPU, they both get 50% of that CPU,
independent of the NICE values of processes on other CPUs.

At least, with that hypothesis I could explain the numbers I saw.  In
general processes stick to a CPU, but once in a while they are moved to
another.

Maybe a kernel hacker who understands the scheduling can chime in and
educate us on whether my hypothesis makes sense.

	-Otto
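To make that hypothesis concrete, here is a small toy model.  It is not
the kernel's code: the weight formula is made up (the real scheduler
derives a dynamic priority from nice and recent CPU use), and the only
point is that the shares are computed per CPU.  The CPU placements are
taken from the run/onproc column of the 'top -H' snapshot quoted above:

    // percpu-model.cc - toy model of the per-CPU scheduling hypothesis.
    // Assumption: each process sticks to one CPU, and each CPU's time is
    // divided only among the processes that landed on it.
    #include <cstdio>
    #include <map>
    #include <vector>

    struct Proc { const char *name; int nice; int cpu; };

    int main()
    {
        // CPU assignments copied from the quoted top output.
        std::vector<Proc> procs = {
            { "pid 88761 nice 10", 10, 3 },
            { "pid 87643 nice 20", 20, 2 },
            { "pid  4015 nice 20", 20, 0 },
            { "pid 92541 nice 10", 10, 1 },
            { "pid 26221 nice 10", 10, 1 },
            { "pid  7827 nice 20", 20, 0 },
        };

        // Made-up weight: lower nice -> larger share within one CPU.
        auto weight = [](int nice) { return 40.0 - nice; };

        std::map<int, double> total;           // sum of weights per CPU
        for (const auto &p : procs)
            total[p.cpu] += weight(p.nice);

        for (const auto &p : procs)
            std::printf("%s on cpu%d -> %5.1f%% of that CPU\n",
                        p.name, p.cpu,
                        100.0 * weight(p.nice) / total[p.cpu]);
    }

With those placements, the two nice-20 jobs sharing CPU0 and the two
nice-10 jobs sharing CPU1 each come out at 50%, while the two jobs that
have a CPU to themselves come out at ~100% -- roughly the 99%/50%
pattern in the quoted top output, independent of the nice values.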