On Sun, Sep 14, 2014 at 10:13:19PM +0000, Alexander Motin wrote:
> Author: mav
> Date: Sun Sep 14 22:13:19 2014
> New Revision: 271604
> URL: http://svnweb.freebsd.org/changeset/base/271604
> 
> Log:
>   Add couple memory barries to serialize tdq_cpu_idle and tdq_load accesses.
Serialize what against what ?

It seems what you do is just ensuring that the write to tdq_cpu_idle in
sched_idletd() become visible faster than it was before your patch.
Memory barrier after the assignment of dq->tdq_cpu_idle = 1 flushes
write buffers, which makes the tdq_notify() to see the write immediately.
Am I right ?

If true, this must be commented to explain the stray barriers appearing
in the code.

>   
>   This change fixes transient performance drops in some of my benchmarks,
>   vanishing as soon as I am trying to collect any stats from the scheduler.
>   It looks like reordered access to those variables sometimes caused loss of
>   IPI_PREEMPT, that delayed thread execution until some later interrupt.
>   
>   MFC after:  3 days
> 
> Modified:
>   head/sys/kern/sched_ule.c
> 
> Modified: head/sys/kern/sched_ule.c
> ==============================================================================
> --- head/sys/kern/sched_ule.c Sun Sep 14 22:10:35 2014        (r271603)
> +++ head/sys/kern/sched_ule.c Sun Sep 14 22:13:19 2014        (r271604)
> @@ -1037,6 +1037,7 @@ tdq_notify(struct tdq *tdq, struct threa
>       ctd = pcpu_find(cpu)->pc_curthread;
>       if (!sched_shouldpreempt(pri, ctd->td_priority, 1))
>               return;
> +     mb();
>       if (TD_IS_IDLETHREAD(ctd)) {
>               /*
>                * If the MD code has an idle wakeup routine try that before
> @@ -2640,6 +2641,7 @@ sched_idletd(void *dummy)
>  
>               /* Run main MD idle handler. */
>               tdq->tdq_cpu_idle = 1;
> +             mb();
>               cpu_idle(switchcnt * 4 > sched_idlespinthresh);
>               tdq->tdq_cpu_idle = 0;
>  
I suspect that what you are trying to do could be achieved by using
the FreeBSD API, instead of compat shims, in the following way.

diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
index 0a63c01..1ffac22 100644
--- a/sys/kern/sched_ule.c
+++ b/sys/kern/sched_ule.c
@@ -1042,7 +1042,8 @@ tdq_notify(struct tdq *tdq, struct thread *td)
                 * If the MD code has an idle wakeup routine try that before
                 * falling back to IPI.
                 */
-               if (!tdq->tdq_cpu_idle || cpu_idle_wakeup(cpu))
+               if (!atomic_load_acq_int(&tdq->tdq_cpu_idle) ||
+                   cpu_idle_wakeup(cpu))
                        return;
        }
        tdq->tdq_ipipending = 1;
@@ -2639,7 +2640,7 @@ sched_idletd(void *dummy)
                        continue;
 
                /* Run main MD idle handler. */
-               tdq->tdq_cpu_idle = 1;
+               atomic_add_rel_int(&tdq->tdq_cpu_idle, 1);
                cpu_idle(switchcnt * 4 > sched_idlespinthresh);
                tdq->tdq_cpu_idle = 0;
 

Attachment: pgpX7BcwB_8v0.pgp
Description: PGP signature

Reply via email to