On Mon, Jun 27, 2016 at 09:37:16PM -0500, Segher Boessenkool wrote:
> On Mon, Jun 27, 2016 at 08:08:20PM -0400, Michael Meissner wrote:
> > This patch fixes PR 71667 that I discovered when trying to build the Spec 
> > 2006
> > xalancbmk benchmark for the Power9.  The Altivec indexed memory load/stores
> > need to go before the D-form (register + offset) load/stores, because they 
> > have
> > different syntaxes.
> 
> I'm not sure what that means?  Could you explain a bit more?

Before the patch there were two alternatives:

        Instruction             Dest.   Source
        lxsd %0,%1              %?*wb   %Y
        lxsdx %x0,%y1           %?*wv   %Z

LXSDX was added with ISA 2.06 (power7), and the destination is a VSX register.

LXSD was added with ISA 3.0 (power9), and the destination is an Altivec
register.

LXSD only takes a D-FORM instruction (reg + offset, where the bottom 2 bits of
the offset must be 0).

I had forgotten that %Y could match X-FORM (reg + reg) as well as D-FORM, since
it is used by the LD/STD instructions, using the %X1 print_operand modifier.
However, using %X1 for LXSD to change it to LXSDX for the X-FORM case, since
LXSD only takes an Altivec register (0..31), while LXSDX takes a VSX register
(0..63), and the Altivec registers start at VSX register 32.

So, this patch puts the X-FORM code first, so that it matches before the
D-FORM.

However, in thinking about it, it may still be problematical for PRE_MODIFY,
PRE_INC, and PRE_DEC since LXSD does not have an update form.  DFmode which
used this earlier would not trigger since PRE_MODIFY, etc. are disallowed when
the Altivec registers are enabled.  But that isn't the case with DImode.

I think the thing to do is create yet another memory constraint, that is just
an offsetable address, with the bottom 2 bits 0, and no PRE_MODIFY, etc.

> 
> >     PR target/71667
> >     * config/rs6000/rs6000.md (movdi_internal32): Swap alternatives
> >     for loading Altivec registers so that the indexed case comes
> >     first, and the general case (which includes indexed loads) comes
> >     later.
> 
> So the general case allows indexed loads as well, but that breaks somehow?
> 
> >     PR target/71667
> >     * g++.dg/pr71667.C: New test for PR 71667.
> 
> 20599 lines, can you minimize this a bit?  If not, maybe we should just
> do without testcase here.

I doubt I could minimize it, since it is only this one source so far that has
shown to be failure.  I can delete it if you prefer.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Reply via email to