On Mon, 2015-12-21 at 09:10 -0500, David Edelsohn wrote:
> On Mon, Dec 21, 2015 at 8:13 AM, Alan Lawrence <alan.lawre...@arm.com> wrote:
> > ...the test passes with --param sra-max-scalarization-size-Ospeed.
> >
> > Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc, s390.
> >
> > On alpha, tree-optimized is:
> >
> >   MEM[(int[8] *)&a] = { 0, 1 };
> >   MEM[(int[8] *)&a + 8B] = { 2, 3 };
> >   MEM[(int[8] *)&a + 16B] = { 4, 5 };
> >   MEM[(int[8] *)&a + 24B] = { 6, 7 };
> >   _23 = a[0];
> >   _29 = a[1];
> >   sum_30 = _23 + _29;
> >   _36 = a[2];
> >   sum_37 = sum_30 + _36;
> >
> > Which is beyond the scope of these changes to DOM to optimize.
> >
> > On powerpc64, the test passes with -mcpu=power8 (the loop is vectorized as a
> > reduction); however, without that, similar code is generated to Alpha (the
> > vectorizer decides the reduction is not worthwhile without SIMD support), and
> > the test fails; hence, I've XFAILed for powerpc, but I think I could condition
> > the XFAIL on powerpc64 && !check_p8vector_hw_available, if preferred?
> 
> Fun.
> 
> Does it work with -mcpu=power7?
> 
> Bill: What GCC DejaGNU incantation would you like to see?

This sounds like more fallout from unaligned accesses being faster on
POWER8 than on previous hardware.  What about conditioning the XFAIL on

{ powerpc*-*-* && { ! vect_hw_misalign } }

-- does this work properly?  Right now that's just an alternative way of
saying what you suggested, but I think it will better document the reason
for the XFAIL.
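
For concreteness, here is a rough sketch of where that selector might sit
in a test file.  The function body, the dg-options line, and the
"return 28;" scan pattern are assumptions reconstructed from the dump
quoted above (an int[8] holding 0..7, summed), not copied from the actual
test:

```c
/* Sketch only: the options and the scan pattern below are assumed,
   not taken from the real testcase.  */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-tree-optimized" } */

int
foo (void)
{
  /* Same shape as the quoted dump: eight ints, 0 through 7.  */
  const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int i, sum = 0;

  for (i = 0; i < (int) (sizeof (a) / sizeof (a[0])); i++)
    sum += a[i];

  /* 0 + 1 + ... + 7 = 28; the hope is that the loop folds to this
     constant after unrolling.  */
  return sum;
}

/* Expected to fail on powerpc targets lacking fast misaligned
   vector access.  */
/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { powerpc*-*-* && { ! vect_hw_misalign } } } } } */
```

The inner braces around "! vect_hw_misalign" keep the negation bound to
the effective-target keyword rather than to the whole expression.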

Thanks,
Bill

> 
> - David
> 

