On 05/02/2014 01:58 AM, Mike Galbraith wrote:
> On Fri, 2014-05-02 at 07:32 +0200, Mike Galbraith wrote:
>> On Fri, 2014-05-02 at 00:42 -0400, Rik van Riel wrote:
>>> Currently sync wakeups from the wake_affine code cannot work as
>>> designed, because the task doing the sync wakeup from the target
>>> cpu will block its wakee from selecting that cpu.
>>>
>>> This is despite the fact that whether or not the wakeup is sync
>>> determines whether or not we want to do an affine wakeup...
>>
>> If the sync hint really did mean we ARE going to schedule RSN, waking
>> local would be a good thing. It is all too often a big fat lie.
>
> One example of that is say pgbench. The mother of all work (server
> thread) for that load wakes with sync hint. Let the server wake the
> first of a small herd CPU affine, and that first wakee then preempt the
> server (mother of all work) that drives the entire load.
>
> Byebye throughput.
>
> When there's only one wakee, and there's really not enough overlap to at
> least break even, waking CPU affine is a great idea. Even when your
> wakees only run for a short time, if you wake/get_preempted repeat, the
> load will serialize.

I see a similar issue with specjbb2013, with 4 backend and 4
frontend JVMs on a 4 node NUMA system.

The NUMA balancing code nicely places the memory of each JVM
on one NUMA node, but then the wake_affine code will happily
run all of the threads anywhere on the system, totally ruining
memory locality.

The front end and back end only exchange a few hundred messages
a second, over loopback tcp, so the switching rate between
threads is quite low...

I wonder if it would make sense for wake_affine to be off by
default, and only switch on when the right conditions are
detected, instead of having it on by default like we have now?

I have some ideas on that, but I should probably catch some
sleep before trying to code them up :)

Meanwhile, the test patch that I posted may help us figure
out whether the "sync" option in the current wake_affine
code does anything useful.

--
All rights reversed