On Thu, Sep 12, 2019 at 10:05:43AM -0700, Tim Chen wrote:
> On 9/12/19 5:04 AM, Aaron Lu wrote:
>
> > Well, I have done the following tests:
> > 1 Julien's test script: https://paste.debian.net/plainh/834cf45c
> > 2 start two tagged will-it-scale/page_fault1, see how each performs;
> > 3 Aubrey's mysql test: https://github.com/aubreyli/coresched_bench.git
> >
> > They all show your patchset performs equally well...And considering what
> > the patch does, I think they are really doing the same thing in
> > different ways.
> >
>
> Aaron,
>
> The new feature in my new patches attempts to load balance between cores,
> and remove the imbalance of cgroup load on a core that causes forced idle.
> The previous patches, in contrast, aim for fairness of cgroups between
> sibling threads, so I think the goals are kind of orthogonal and
> complementary.
>
> The premise is this: say cgroup1 is occupying 50% of cpu on cpu thread 1
> and 25% of cpu on cpu thread 2. That means we have a 25% cpu imbalance
> and the cpu is force idled 25% of the time. So ideally we need to move
> 12.5% of cgroup1 load from cpu thread 1 to sibling thread 2, so that
> both threads run at 37.5% for cgroup1 load without causing
> any force idled time. Otherwise we will try to move 25% of cgroup1
> load from cpu thread 1 to another core that has cgroup1 load to match.
>
> This load balance is done in the regular load balance paths.
>
> Previously for v3, only sched_core_balance made an attempt to pull a cookie
> task, and only in the idle balance path. So if the cpu is kept busy, the
> cgroup load imbalance between sibling threads could last a long time. And
> the thread fairness patches for v3 don't help to balance load for such cases.
>
> The new patches actually take into consideration the amount of load imbalance
> of the same group between sibling threads when selecting a task to pull,
> and they also prevent task migrations that create more load imbalance.
> So hopefully this feature will help when we have more cores and need load
> balance across the cores. This tries to help even out cgroup workload
> between threads to minimize forced idle time, and also even out load
> across cores.
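
The arithmetic in the premise above can be written out as a small sketch.
This is only an illustration, not code from the patches: the function name
sibling_rebalance and the representation of load as a fraction of cpu time
are assumptions made here, and it takes the premise's simplification that
forced idle time equals the load difference between the two siblings.

```python
def sibling_rebalance(load_t1, load_t2):
    """Illustrate the premise for one cgroup on two sibling threads.

    load_t1, load_t2: fraction of cpu time the cgroup uses on each
    sibling thread (hypothetical representation, e.g. 0.50 for 50%).
    Returns (forced_idle, amount_to_move, balanced_load).
    """
    # Per the premise, the imbalance between siblings is the time
    # one thread is force idled.
    forced_idle = abs(load_t1 - load_t2)
    # Moving half of the imbalance from the busier thread to the
    # other one equalizes the cgroup's load on both threads.
    amount_to_move = forced_idle / 2
    balanced_load = (load_t1 + load_t2) / 2
    return forced_idle, amount_to_move, balanced_load


# The example from the premise: 50% on thread 1, 25% on thread 2.
print(sibling_rebalance(0.50, 0.25))  # (0.25, 0.125, 0.375)
```

With the numbers from the premise this gives a 25% imbalance, 12.5% of load
to move, and 37.5% on each thread afterwards.
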

Will take a look at your new patches, thanks for the explanation.

> In your test, how many cores are on your machine and how many threads did
> each page_fault1 spawn off?

The test VM has 16 cores and 32 threads.
I created 2 tagged cgroups to run page_fault1 and each page_fault1 has
16 processes, like this:
$ ./src/will-it-scale/page_fault1_processes -t 16 -s 60