On Fri, Oct 06, 2017 at 11:45:10AM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Oct 05, 2017 at 06:14:14PM -0400, Michael Meissner wrote:
> > This patch adds support for most of the PowerPC ISA 3.0 Atomic Memory 
> > Operation
> > instructions, listed in section 4.5 of the manual.  Currently these 
> > functions
> > are done via extended asm.  At some point in the future, I will probably 
> > move
> > the inner part of the patch to use new built-in functions to replace the
> > extended asm.
> 
> Just some asm comments for now, have to think about the rest a bit more.
> 
> > +#define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC)                             
> > \
> > +static __inline__ TYPE                                                     
> > \
> > +NAME (TYPE *_PTR, TYPE _VALUE)                                             
> > \
> > +{                                                                  \
> > +  unsigned __int128 _TMP;                                          \
> > +  TYPE _RET;                                                               
> > \
> > +  __asm__ volatile ("mr %L1,%3\n"                                  \
> > +               "\t" OPCODE " %1,%4,%5\t\t# %0\n"                   \
> > +               "\tmr %2,%1\n"                                      \
> > +               : "+Q" (_PTR[0]), "=&r" (_TMP), "=r" (_RET)         \
> > +               : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC));          \
> > +  return _RET;                                                             
> > \
> > +}
> 
> You don't need arg 4: "%P0" is the same thing.

I didn't think there was a print_opcode that did what I wanted, I'll modify it.

> Do you really need the mr insns?  Can't you express that in the
> arguments?  Perhaps using a union of __int128 with something that
> is two long ints?

My first attempt resulted in the compiler doing move directs to form the
__int128 in the vector unit and then move directs back.  So, I figured having
two mr's was a simple price to pay.

I do have thoughts to replace the two with a built-in (but keep amo.h and the
names), and we can probably eliminate some of the moves.

> > +#define _AMO_ST_SIMPLE(NAME, TYPE, OPCODE, FC)                             
> > \
> > +static __inline__ void                                                     
> > \
> > +NAME (TYPE *_PTR, TYPE _VALUE)                                             
> > \
> > +{                                                                  \
> > +  __asm__ volatile (OPCODE " %1,%2,%3\t\t# %0"                             
> > \
> > +               : "+Q" (_PTR[0])                                    \
> > +               : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC));          \
> > +  return;                                                          \
> > +}
> 
> Same for "b" (arg 2) here.
> 
> 
> Segher
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Reply via email to