On Wed, Jul 6, 2011 at 10:55 PM, Hartmann, O. <ohart...@zedat.fu-berlin.de> wrote:
> On 07/07/11 06:29, Arnaud Lacombe wrote:
>>
>> Hi,
>>
>> On Wed, Jul 6, 2011 at 5:00 PM, Hartmann, O.
>> <ohart...@zedat.fu-berlin.de> wrote:
>>>
>>> On 07/06/11 21:36, Steve Kargl wrote:
>>>>
>>>> On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl
>>>>> <s...@troutmask.apl.washington.edu> wrote:
>>>>>>
>>>>>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>>>>>>
>>>>>>> I use SCHED_ULE on all machines, since it is supposed to be
>>>>>>> performing better on multicore boxes, but there are lots of
>>>>>>> suggestions switching back to the old SCHED_4BSD scheduler.
>>>>>>>
>>>>>> If you are using MPI in numerical codes, then you want
>>>>>> to use SCHED_4BSD. I've posted numerous times about ULE
>>>>>> and its very poor performance when using MPI.
>>>>>>
>>>>>> http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>>>>>>> With ULE, 2 Test_mpi jobs are always scheduled on the same core while
>>>>>>> one core remains idle. Also, note the difference in the reported load
>>>>>>> averages.

While possibly not the same issue you're seeing, I noticed a similar
problem on 8- and 12-core machines with ULE: a relatively small number
of threads were runnable but waiting to run on a busy core while other
cores were sitting idle.

tdq_idled won't steal threads from a queue unless there are at least
kern.sched.steal_thresh threads in that queue, where

    steal_thresh = min(fls(mp_ncpus) - 1, 3);

i.e. on an 8-core system you need 3 threads in the queue before
tdq_idled steals one.

Fortunately you can simply override steal_thresh at run time. A value
of 1 works great for me, YMMV.
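
To make the arithmetic above concrete, here is a rough userland sketch of the default computation. It only mirrors the formula quoted in this mail; fls_sketch() and default_steal_thresh() are names made up for illustration (not kernel functions), and fls_sketch() is a hand-rolled stand-in for fls() so the snippet does not depend on any particular libc.

```c
/*
 * Sketch only: reproduces the quoted default,
 *     steal_thresh = min(fls(mp_ncpus) - 1, 3),
 * outside the kernel. Function names here are hypothetical.
 */
#include <assert.h>

/* 1-based index of the most significant set bit; 0 if no bit is set. */
int
fls_sketch(unsigned int mask)
{
	int bit = 0;

	while (mask != 0) {
		bit++;
		mask >>= 1;
	}
	return (bit);
}

/* Default threshold for a machine with ncpus CPUs, per the formula above. */
int
default_steal_thresh(int ncpus)
{
	int thresh;

	assert(ncpus >= 1);
	thresh = fls_sketch((unsigned int)ncpus) - 1;
	return (thresh < 3 ? thresh : 3);
}
```

By that arithmetic, default_steal_thresh(8) and default_steal_thresh(12) both come out as 3, a 4-CPU box gets 2, and a 2-CPU box gets 1. So with the default, an 8- or 12-core machine leaves one or two waiting threads on a busy core alone, which matches what I was seeing. Setting kern.sched.steal_thresh to 1 with sysctl at run time lowers that bar.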