On Mon, Feb 10, 2014 at 02:10:23PM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-02-07 at 17:58 +0100, Torsten Duwe wrote:
> >  typedef struct {
> > -	volatile unsigned int slock;
> > -} arch_spinlock_t;
> > +	union {
> > +		__ticketpair_t head_tail;
> > +		struct __raw_tickets {
> > +#ifdef __BIG_ENDIAN__		/* The "tail" part should be in the MSBs */
> > +			__ticket_t tail, head;
> > +#else
> > +			__ticket_t head, tail;
> > +#endif
> > +		} tickets;
> > +	};
> > +#if defined(CONFIG_PPC_SPLPAR)
> > +	u32 holder;
> > +#endif
> > +} arch_spinlock_t __aligned(4);
>
> That's still broken with lockref (which we just merged).
>
> We must have the arch_spinlock_t and the ref in the same 64-bit word
> otherwise it will break.

Well, as far as I can see you'll just not be able to USE_CMPXCHG_LOCKREF
-- with the appropriate performance hit -- the code just falls back into
lock&ref on pSeries. What again was the intention of directed yield in
the first place...?

> We can make it work in theory since the holder doesn't have to be
> accessed atomically, but the practicals are a complete mess ...
> lockref would essentially have to re-implement the holder handling
> of the spinlocks and use lower level ticket stuff.
>
> Unless you can find a sneaky trick ... :-(

What if I squeeze the bits a little?
4k vCPUs, and 256 physical, as a limit to stay within 32 bits?
At the cost that unlock may become an ll/sc operation again.
I could think about a trick against that.
But alas, hw_cpu_id is 16 bit, which makes a lookup table necessary :-/

Doing another round of yields for lockrefs now doesn't sound so bad any
more.

Opinions, anyone?

	Torsten
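
For illustration only, here is one possible reading of the "squeeze the
bits" idea as plain C: 12 bits each for tail and head (4k vCPUs) plus
8 bits for a holder index (256 physical CPUs) add up to 32, so lock and
reference count would again share one 64-bit word. The field order, all
names (packed_lock_t, lock_pack, lock_take_ticket, ...) and the
lockref_sketch struct are hypothetical and not taken from the patch. The
helpers are not atomic; they only compute the new word, which real code
would have to store from inside an ll/sc (lwarx/stwcx.) loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths (assumption): 12 + 12 + 8 = 32 bits. */
#define TICKET_BITS  12u                 /* 2^12 = 4096 vCPUs        */
#define HOLDER_BITS   8u                 /* 2^8  = 256 physical CPUs */
#define TICKET_MASK  ((1u << TICKET_BITS) - 1u)
#define HOLDER_MASK  ((1u << HOLDER_BITS) - 1u)

/* Assumed layout: holder in the low bits, tail in the MSBs. */
#define HOLDER_SHIFT 0u
#define HEAD_SHIFT   HOLDER_BITS
#define TAIL_SHIFT   (HOLDER_BITS + TICKET_BITS)

typedef uint32_t packed_lock_t;

static_assert(2u * TICKET_BITS + HOLDER_BITS == 32u,
              "fields must fill exactly 32 bits");

/* Stand-in for the lockref constraint: lock and count in one 64-bit word. */
struct lockref_sketch {
    packed_lock_t lock;
    uint32_t      count;
};
static_assert(sizeof(struct lockref_sketch) == 8,
              "lock + count must fit in 64 bits");

static inline uint32_t lock_tail(packed_lock_t l)
{
    return (l >> TAIL_SHIFT) & TICKET_MASK;
}

static inline uint32_t lock_head(packed_lock_t l)
{
    return (l >> HEAD_SHIFT) & TICKET_MASK;
}

static inline uint32_t lock_holder(packed_lock_t l)
{
    return (l >> HOLDER_SHIFT) & HOLDER_MASK;
}

static inline packed_lock_t lock_pack(uint32_t tail, uint32_t head,
                                      uint32_t holder)
{
    return ((tail & TICKET_MASK) << TAIL_SHIFT) |
           ((head & TICKET_MASK) << HEAD_SHIFT) |
           ((holder & HOLDER_MASK) << HOLDER_SHIFT);
}

/* New waiter: bump tail modulo 2^12, leave head and holder untouched. */
static inline packed_lock_t lock_take_ticket(packed_lock_t l)
{
    return lock_pack(lock_tail(l) + 1u, lock_head(l), lock_holder(l));
}

/*
 * Unlock: bump head modulo 2^12. Head is no longer a naturally aligned
 * halfword, so this needs a read-modify-write of the whole word rather
 * than a plain store.
 */
static inline packed_lock_t lock_release(packed_lock_t l)
{
    return lock_pack(lock_tail(l), lock_head(l) + 1u, lock_holder(l));
}

/* Record the holder as an 8-bit index (e.g. into a hw_cpu_id table). */
static inline packed_lock_t lock_set_holder(packed_lock_t l, uint32_t idx)
{
    return lock_pack(lock_tail(l), lock_head(l), idx);
}

/* The lock is free when no ticket is outstanding. */
static inline int lock_is_free(packed_lock_t l)
{
    return lock_head(l) == lock_tail(l);
}
```

Because hw_cpu_id is 16 bits wide, the 8-bit holder field in this sketch
can only be an index, so the lookup table mentioned above would still be
needed to get from the index to the CPU to yield to.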