On Mon, 2015-12-21 at 09:10 -0500, David Edelsohn wrote:
> On Mon, Dec 21, 2015 at 8:13 AM, Alan Lawrence <alan.lawre...@arm.com> wrote:
> > ...the test passes with --param sra-max-scalarization-size-Ospeed.
> >
> > Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc, s390.
> >
> > On alpha, tree-optimized is:
> >
> >   MEM[(int[8] *)&a] = { 0, 1 };
> >   MEM[(int[8] *)&a + 8B] = { 2, 3 };
> >   MEM[(int[8] *)&a + 16B] = { 4, 5 };
> >   MEM[(int[8] *)&a + 24B] = { 6, 7 };
> >   _23 = a[0];
> >   _29 = a[1];
> >   sum_30 = _23 + _29;
> >   _36 = a[2];
> >   sum_37 = sum_30 + _36;
> >
> > Which is beyond the scope of these changes to DOM to optimize.
> >
> > On powerpc64, the test passes with -mcpu=power8 (the loop is vectorized as a
> > reduction); however, without that, similar code is generated to Alpha (the
> > vectorizer decides the reduction is not worthwhile without SIMD support), and
> > the test fails; hence, I've XFAILed for powerpc, but I think I could condition
> > the XFAIL on powerpc64 && !check_p8vector_hw_available, if preferred?
>
> Fun.
>
> Does it work with -mcpu=power7?
>
> Bill: What GCC DejaGNU incantation would you like to see?

This sounds like more fallout from unaligned accesses being faster on POWER8
than on previous hardware.  What about conditioning the XFAIL on

  { powerpc*-*-* && { ! vect_hw_misalign } }

-- does this work properly?  Right now that's just an alternative way of
saying what you suggested, but I think it will better document the reason
for the XFAIL.

Thanks,
Bill

>
> - David
>
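
For concreteness, here is a rough sketch of how that selector might sit in a testcase. The function body, the dg-options line and the "return 28;" pattern are assumptions for illustration (an eight-element array holding 0..7, summed, as the dump quoted above suggests); this is not the actual test under discussion, and I have not checked whether the selector evaluates as intended on all powerpc configurations.

```c
/* Hypothetical reduced testcase: the body, options and scan pattern
   below are assumptions, not the real test from this thread.  */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-tree-optimized" } */

int
foo (void)
{
  int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int sum = 0;

  for (int i = 0; i < 8; i++)
    sum += a[i];

  return sum;  /* 0 + 1 + ... + 7 = 28 */
}

/* Expect the sum to be folded to a constant in the optimized dump.
   The scan is marked as an expected failure on powerpc targets that
   do not report hardware support for misaligned vector accesses.  */
/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { powerpc*-*-* && { ! vect_hw_misalign } } } } } */
```

Using the effective-target keyword rather than a CPU-specific check keeps the reason for the XFAIL (no fast misaligned vector access) visible in the directive itself.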