On Thu Nov 10, 2022 at 10:42 AM AEST, Jordan Niethe wrote:
> On Thu, 2022-07-28 at 16:31 +1000, Nicholas Piggin wrote:
> [resend as utf-8, not utf-7]
> > Having all CPUs poll the lock word for the owner CPU that should be
> > yielded to defeats most of the purpose of using MCS queueing for
> > scalability. Yet it may be desirable for queued waiters to yield
> > to a preempted owner.
> >
> > s390 addresses this problem by having queued waiters sample the lock
> > word to find the owner much less frequently. In this approach, the
> > waiters never sample it directly, but the queue head propagates the
> > owner CPU back to the next waiter if it ever finds the owner has
> > been preempted. Queued waiters then subsequently propagate the owner
> > CPU back to the next waiter, and so on.
> >
> > Disable this option by default for now, i.e., no logical change.
> > ---
> > arch/powerpc/lib/qspinlock.c | 85 +++++++++++++++++++++++++++++++++++-
> > 1 file changed, 84 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
> > index 94f007f66942..28c85a2d5635 100644
> > --- a/arch/powerpc/lib/qspinlock.c
> > +++ b/arch/powerpc/lib/qspinlock.c
> > @@ -12,6 +12,7 @@
> > struct qnode {
> > struct qnode *next;
> > struct qspinlock *lock;
> > + int yield_cpu;
> > u8 locked; /* 1 if lock acquired */
> > };
> >
> > @@ -28,6 +29,7 @@ static int HEAD_SPINS __read_mostly = (1<<8);
> > static bool pv_yield_owner __read_mostly = true;
> > static bool pv_yield_allow_steal __read_mostly = false;
> > static bool pv_yield_prev __read_mostly = true;
> > +static bool pv_yield_propagate_owner __read_mostly = true;
>
> This also seems to be enabled by default.
>
> >
> > static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);
> >
> > @@ -257,13 +259,66 @@ static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u
> > __yield_to_locked_owner(lock, val, paravirt, clear_mustq);
> > }
> >
> > +static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int *set_yield_cpu, bool paravirt)
> > +{
> > + struct qnode *next;
> > + int owner;
> > +
> > + if (!paravirt)
> > + return;
> > + if (!pv_yield_propagate_owner)
> > + return;
> > +
> > + owner = get_owner_cpu(val);
> > + if (*set_yield_cpu == owner)
> > + return;
> > +
> > + next = READ_ONCE(node->next);
> > + if (!next)
> > + return;
> > +
> > + if (vcpu_is_preempted(owner)) {
>
> Is there a difference about using vcpu_is_preempted() here
> vs checking bit 0 in other places?
Yeah, we aren't yielding here, so we don't have to load the yield count
explicitly.
> > + next->yield_cpu = owner;
> > + *set_yield_cpu = owner;
> > + } else if (*set_yield_cpu != -1) {
>
> It might be worth giving the -1 CPU a #define.
It's fairly standard to use -1 as a null value for a CPU.
>
> > + next->yield_cpu = owner;
> > + *set_yield_cpu = owner;
> > + }
> > +}
>
> Does this need to pass set_yield_cpu by reference? Couldn't its new value be
> returned? To me that would make it clearer that the function is used to change
> set_yield_cpu. I think this would work:
>
> int set_yield_cpu = -1;
>
> static __always_inline int propagate_yield_cpu(struct qnode *node, u32 val, int set_yield_cpu, bool paravirt)
> {
> struct qnode *next;
> int owner;
>
> if (!paravirt)
> goto out;
> if (!pv_yield_propagate_owner)
> goto out;
>
> owner = get_owner_cpu(val);
> if (set_yield_cpu == owner)
> goto out;
>
> next = READ_ONCE(node->next);
> if (!next)
> goto out;
>
> if (vcpu_is_preempted(owner)) {
> next->yield_cpu = owner;
> return owner;
> } else if (set_yield_cpu != -1) {
> next->yield_cpu = owner;
> return owner;
> }
>
> out:
> return set_yield_cpu;
> }
>
> set_yield_cpu = propagate_yield_cpu(... set_yield_cpu ...);
I think I prefer it as is, because the caller doesn't use the value
anywhere. This way it reads more like temporary storage that the function
uses across multiple calls.
Thanks,
Nick
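
[Editor's illustration] The pass-by-reference scratch pattern discussed above can be sketched in plain, userspace C. The structures and vcpu_is_preempted() mock below are simplified stand-ins for illustration only, not the real powerpc qspinlock code; `owner` is passed directly rather than extracted from the lock word:

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct qnode; sketch only. */
struct qnode {
	struct qnode *next;
	int yield_cpu;
};

/* Mock preemption state: bit N set means vCPU N is preempted. */
static unsigned long preempted_mask;

static bool vcpu_is_preempted(int cpu)
{
	return (preempted_mask >> cpu) & 1;
}

/*
 * Same shape as the patch: *set_yield_cpu is scratch that lives in the
 * caller's stack frame and is threaded through repeated calls, letting
 * the function skip redundant stores to next->yield_cpu when the owner
 * it last propagated has not changed.
 */
static void propagate_yield_cpu(struct qnode *node, int owner,
				int *set_yield_cpu)
{
	struct qnode *next;

	if (*set_yield_cpu == owner)
		return;			/* already propagated this owner */

	next = node->next;
	if (!next)
		return;			/* no queued waiter behind us */

	if (vcpu_is_preempted(owner) || *set_yield_cpu != -1) {
		next->yield_cpu = owner;
		*set_yield_cpu = owner;
	}
}
```

A second call with the same owner returns early on the `*set_yield_cpu == owner` check, which is the redundant-store elimination the by-reference parameter exists to provide.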