On Mon, 2015-12-21 at 09:10 -0500, David Edelsohn wrote:
> On Mon, Dec 21, 2015 at 8:13 AM, Alan Lawrence <[email protected]> wrote:
> > ...the test passes with --param sra-max-scalarization-size-Ospeed.
> >
> > Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc, s390.
> >
> > On alpha, tree-optimized is:
> >
> > MEM[(int[8] *)&a] = { 0, 1 };
> > MEM[(int[8] *)&a + 8B] = { 2, 3 };
> > MEM[(int[8] *)&a + 16B] = { 4, 5 };
> > MEM[(int[8] *)&a + 24B] = { 6, 7 };
> > _23 = a[0];
> > _29 = a[1];
> > sum_30 = _23 + _29;
> > _36 = a[2];
> > sum_37 = sum_30 + _36;
> >
> > Which is beyond the scope of these changes to DOM to optimize.
> >
> > On powerpc64, the test passes with -mcpu=power8 (the loop is vectorized as a
> > reduction); however, without that, similar code is generated to Alpha (the
> > vectorizer decides the reduction is not worthwhile without SIMD support), and
> > the test fails; hence, I've XFAILed for powerpc, but I think I could condition
> > the XFAIL on powerpc64 && !check_p8vector_hw_available, if preferred?
>
> Fun.
>
> Does it work with -mcpu=power7?
>
> Bill: What GCC DejaGNU incantation would you like to see?
This sounds like more fallout from unaligned accesses being faster on
POWER8 than on previous hardware. What about conditioning the XFAIL on

  { powerpc*-*-* && { ! vect_hw_misalign } }

-- does this work properly? Right now that's just an alternative way of
saying what you suggested, but I think it will better document the reason
for the XFAIL.
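
In the test itself that might look something like the following. This is
an untested sketch: "PATTERN" is only a placeholder for whatever the test
already scans for, and I'm assuming the existing check is a
scan-tree-dump on the "optimized" dump.

  /* { dg-final { scan-tree-dump "PATTERN" "optimized" { xfail { powerpc*-*-* && { ! vect_hw_misalign } } } } } */

The selector is the only part that would change; the rest of the
directive stays as it is in the patch.
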
Thanks,
Bill
>
> - David
>