Richard Henderson wrote:
> On 2012-07-30 07:09, Ulrich Weigand wrote:
> > This seems to disable use of ICM / STCM to perform byte or
> > aligned halfword access.  Why is this necessary?  Those operations
> > are supposed to provide the required operand consistency ...
> 
> Because MEM_P for cmp and new_rtx are always false.  The expander
> always requests register_operand for those.  I suppose I could back
> out merging those cases into the macro.

Right, that's one of the reasons why we had two separate macros
for sync_compare_and_swap ...

> I presume a good test case to examine for ICM is with such an operand
> coming from a global.  What about STCM?  I don't see the output from
> sync_compare_and_swap ever being allowed in memory...

Actually, it's only ICM that is of interest here; it should get used when
either the comparison value or the "new" value comes from a memory location,
e.g. a global.  Sorry, I was confused about STCM ...
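
A minimal sketch of such a test case, assuming the GCC __sync builtins
and hypothetical global names (none of these identifiers come from the
patch). Both the comparison value and the "new" value are loaded from
globals, so this is the situation where ICM could be used to fetch the
byte operand; whether it actually is depends on the expander accepting
memory operands there.

```c
#include <stdint.h>

/* Hypothetical globals: the lock byte itself, plus the comparison
   value and the "new" value, all living in memory.  */
uint8_t lock_byte;
uint8_t expected_byte = 0;
uint8_t desired_byte = 1;

/* Byte-sized compare-and-swap; returns the previous contents of
   lock_byte.  The swap happens only if lock_byte == expected_byte.  */
uint8_t try_lock(void)
{
    return __sync_val_compare_and_swap(&lock_byte, expected_byte,
                                       desired_byte);
}
```

Compiling this with -O2 -S and reading the assembly shows how the two
value operands are loaded.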

> > This seems to force DImode accesses through floating-point
> > registers, which is quite inefficient.  Why not allow LM/STM?
> > Those are supposed to provide doubleword consistency if the
> > operand is sufficiently aligned ...
> 
> ... because I only looked at the definition of LM which itself
> doesn't mention consistency, and the definition of LPQ which talks
> about LM not being suitable for quadword consistency, and came to
> the wrong conclusion.
> 
> So now, looking at movdi_31, I see two problems that prevent just
> using a "normal" move for the atomic_load/store_di: the o/d and d/b
> alternatives which are split.  Is there some specific goodness that
> those alternatives provide that is not had by reloading into the
> Q/S memory patterns?

Well, they are there as splitters because reload assumes all moves
are handled somewhere, either by the mov pattern or else via a
secondary reload.  I've implemented all moves that *can* be
implemented without an extra register via splitters on the
mov pattern, and only those that absolutely require the extra
register via secondary reload ...

Given that, it's probably best to use a separate instruction for
the DImode atomic moves after all, but allow GPRs using LM/STM.
(Only for Q/S constraint type addresses.  For those instructions,
we have to reload the address instead of performing two moves.)
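
For reference, a small sketch of source that exercises the DImode
atomic load and store patterns, assuming the GCC __atomic builtins and
a hypothetical, naturally aligned global. On a 31-bit target (-m31) the
interesting question is whether the 64-bit access goes through a GPR
pair with LM/STM or through a floating-point register.

```c
#include <stdint.h>

/* Hypothetical global, explicitly aligned to a doubleword so that
   the access is eligible for doubleword consistency.  */
uint64_t counter __attribute__((aligned(8)));

/* 64-bit atomic load; this is a DImode access on a 31-bit target.  */
uint64_t read_counter(void)
{
    return __atomic_load_n(&counter, __ATOMIC_SEQ_CST);
}

/* 64-bit atomic store of the given value.  */
void write_counter(uint64_t v)
{
    __atomic_store_n(&counter, v, __ATOMIC_SEQ_CST);
}
```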

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
