On Wed, Sep 19, 2012 at 7:03 AM, Andriy Gapon <a...@freebsd.org> wrote: > on 19/09/2012 02:01 Attilio Rao said the following: >> On Tue, Sep 18, 2012 at 8:07 PM, Andriy Gapon <a...@freebsd.org> wrote: >>> on 18/09/2012 19:50 Attilio Rao said the following: >>>> On 9/18/12, Andriy Gapon <a...@freebsd.org> wrote: >>>>> >>>>> Here is a snippet that demonstrates the issue on a supposedly fully loaded >>>>> 2-processor system: >>>>> >>>>> 136794 0 3670427870244462 KTRGRAPH group:"thread", id:"Xorg tid 102818", >>>>> state:"running", attributes: prio:122 >>>>> >>>>> 136793 0 3670427870241000 KTRGRAPH group:"thread", id:"cc1plus tid >>>>> 111916", >>>>> state:"yielding", attributes: prio:183, wmesg:"(null)", lockname:"(null)" >>>>> >>>>> 136792 1 3670427870240829 KTRGRAPH group:"thread", id:"idle: cpu1 tid >>>>> 100004", >>>>> state:"running", attributes: prio:255 >>>>> >>>>> 136791 1 3670427870239520 KTRGRAPH group:"load", id:"CPU 1 load", >>>>> counter:0, >>>>> attributes: none >>>>> >>>>> 136790 1 3670427870239248 KTRGRAPH group:"thread", id:"firefox tid >>>>> 113473", >>>>> state:"blocked", attributes: prio:122, wmesg:"(null)", lockname:"unp_mtx" >>>>> >>>>> 136789 1 3670427870237697 KTRGRAPH group:"load", id:"CPU 0 load", >>>>> counter:2, >>>>> attributes: none >>>>> >>>>> 136788 1 3670427870236394 KTRGRAPH group:"thread", id:"firefox tid >>>>> 113473", >>>>> point:"wokeup", attributes: linkedto:"Xorg tid 102818" >>>>> >>>>> 136787 1 3670427870236145 KTRGRAPH group:"thread", id:"Xorg tid 102818", >>>>> state:"runq add", attributes: prio:122, linkedto:"firefox tid 113473" >>>>> >>>>> 136786 1 3670427870235981 KTRGRAPH group:"load", id:"CPU 1 load", >>>>> counter:1, >>>>> attributes: none >>>>> >>>>> 136785 1 3670427870235707 KTRGRAPH group:"thread", id:"Xorg tid 102818", >>>>> state:"runq rem", attributes: prio:176 >>>>> >>>>> 136784 1 3670427870235423 KTRGRAPH group:"thread", id:"Xorg tid 102818", >>>>> point:"prio", attributes: prio:176, new prio:122, linkedto:"firefox tid >>>>> 113473" >>>>> >>>>> 136783 1 3670427870202392 KTRGRAPH group:"thread", id:"firefox tid >>>>> 113473", >>>>> state:"running", attributes: prio:104 >>>>> >>>>> See how how the Xorg thread was forced from CPU 1 to CPU 0 where it >>>>> preempted >>>>> cc1plus thread (I do have preemption enabled) only to leave CPU 1 with >>>>> zero >>>>> load. >>>> >>>> I think that the idea is bright, but I have reservations against the >>>> implementation because it seems to me there are too many layering >>>> violations. >>> >>> Just one - for a layer between tunrstile and scheduler :-) >>> But I agree. >>> >>>> What is suggest is somewhat summarized like that: >>>> - Add a new SRQ_WILLSLEEP or the name you prefer >>>> - Add a new "flags" argument to sched_lend_prio() (both ule and 4bsd) >>>> and sched_thread_priority (ule only) >>>> - sched_thread_priority() will pass down the new flag to sched_add() >>>> which passed down to sched_pickcpu(). >>>> >>>> This way sched_pickcpu() has the correct knowledge of what is going on >>>> and it can make the right decision. You likely don't need to lower the >>>> tdq_load at that time either this way, because sched_pickcpu() can >>>> just adjust it locally for its decision. >>>> >>>> What do you think? >>> >>> This sounds easy but it is not quite so given the implementation of >>> sched_pickcpu and sched_lowest. This is probably more work than I am able >>> to >>> take now. >> >> I think actually that the attached patch should do what you need. Of >> course there is more runqueue locking, due to the tdq_load fondling, >> but I'm not sure it is really a big deal. >> I didn't test it yet, as I understand you have a test case, so maybe >> you can. However if Jeff agrees I can send the patch to flo@ for >> performance evaluation. >> >> Thoughts? > > Originally I had a similar patch with a small difference that I did tdq_load > manipulations in sched_thread_priority around sched_add call. That patch > produced a deadlock because of the need for TDQ_LOCK.
It is hard for me to tell if this is subject to same issues because I do not entirely understand where the locking was happening in your patch. Can you try testing this with your already KTR instrumented test and possibly report? Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"