On 28/05/2019 11:32, Juergen Gross wrote:
> @@ -1870,8 +1871,19 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>      old_lock = pcpu_schedule_lock_irq(cpu);
>  
>      vpriv_old = idle->sched_priv;
> -    ppriv_old = per_cpu(schedule_data, cpu).sched_priv;
> -    sched_switch_sched(new_ops, cpu, ppriv, vpriv);
> +    ppriv_old = sd->sched_priv;
> +    new_lock = sched_switch_sched(new_ops, cpu, ppriv, vpriv);
> +
> +    per_cpu(scheduler, cpu) = new_ops;
> +    sd->sched_priv = ppriv;
> +
> +    /*
> +     * (Re?)route the lock to the per pCPU lock as /last/ thing. In fact,
> +     * if it is free (and it can be) we want that anyone that manages
> +     * taking it, finds all the initializations we've done above in place.
> +     */
> +    smp_mb();

I realise you're just moving existing code, but this barrier sticks out
like a sore thumb.

A full memory barrier is a massive overhead for what should be
smp_wmb().  The matching barrier is actually hidden in the implicit
semantics of managing to lock sd->schedule_lock (which is trial and error
anyway), but the only thing that matters here is that all other written
data is in place first.
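
Concretely, what I have in mind is something along these lines (a rough
sketch only, paraphrasing the per-pCPU lock helpers rather than quoting
them, so take the exact spellings with a pinch of salt):

    /*
     * Writer side (schedule_cpu_switch): publish the new scheduler
     * state, then order just those writes before re-routing the lock
     * pointer.  Nothing here needs read ordering, so a write barrier
     * is sufficient.
     */
    per_cpu(scheduler, cpu) = new_ops;
    sd->sched_priv = ppriv;

    smp_wmb();                    /* all writes above visible first */
    sd->schedule_lock = new_lock;

    /*
     * Reader side, i.e. the trial-and-error lock: take whatever lock
     * the pointer currently names, then re-check the pointer and retry
     * if it got re-routed underneath us.  The acquire semantics of
     * spin_lock() provide the matching barrier, so whoever manages to
     * take the new lock is guaranteed to see the initialisation above.
     */
    for ( ; ; )
    {
        spinlock_t *lock = sd->schedule_lock;

        spin_lock(lock);
        if ( lock == sd->schedule_lock )
            break;                /* still the right lock - keep it */
        spin_unlock(lock);        /* re-routed in the meantime - retry */
    }

i.e. smp_wmb() on the producer side, paired with the lock acquire on the
consumer side, is all the ordering this path actually needs.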

Beyond that, local causality will cause all reads to be in order (not
that they are important) due to logic dependencies.  Any that miss out on
this are an optimisation-waiting-to-happen, as the compiler could elide
them fully.

~Andrew

> +    sd->schedule_lock = new_lock;
>  
>      /* _Not_ pcpu_schedule_unlock(): schedule_lock may have changed! */
>      spin_unlock_irq(old_lock);

