On Thu, Feb 4, 2016 at 12:53 PM, Alan Lawrence
<alan.lawre...@foss.arm.com> wrote:
> On 04/02/16 09:53, Dominik Vogt wrote:
>>
>> On Wed, Feb 03, 2016 at 11:41:02AM +0000, Alan Lawrence wrote:
>>>
>>> On 26/01/16 12:23, Dominik Vogt wrote:
>>>>
>>>> On Mon, Dec 21, 2015 at 01:13:28PM +0000, Alan Lawrence wrote:
>>>>>
>>>>> ...the test passes with --param sra-max-scalarization-size-Ospeed.
>>>>>
>>>>> Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc,
>>>>> s390.
>>>>
>>>>
>>>> How did you test this on s390?  For me, the test still fails
>>>> unless I add -march=z13 (s390x).
>>>
>>>
>>> Sorry for the slow response, was away last week. On x86 host, I built a
>>> compiler
>>>
>>> configure --enable-languages=c,c++,lto --target=s390-none-linux-gnu
>>
>>                                                  ^^^^
>> Looks like the test fails only on s390x (64-Bit ) without
>> -march=z13 but works on s390 (31-Bit).
>
>
> Ah, yes, I see. Loop is not unrolled for dom2, then vectorizer kicks in but
> vectorizes only the initialization/loading, and dom can't see the redundancy
> in
>
> MEM[(int[8] *)&a] = { 0, 1 };
> ...
> _23 = a[0];
>
> as they aren't reading equivalent chunks of memory.

Yep, only FRE/PRE know enough tricks to CSE this.

Richard.

> Same problem as on Alpha, and powerpc64 (without -mcpu=power7/power8).
>
> Powerpc chose to XFAIL rather than add -mcpu=power7, but I'm OK with any
> testsuite workaround here, such as yours.
>
> Cheers, Alan
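
For readers without the testcase at hand, here is a minimal C sketch of the
kind of store/load pair discussed above. It is an assumed reduction, not the
actual test from this thread, and the name first_after_init is hypothetical.
The loop stores 0..7 into a[], so the later load of a[0] is always 0. If the
stores are vectorized into wider writes (such as the { 0, 1 } store quoted
above), the scalar load no longer matches a store of the same size, which
appears to be the case dom cannot handle and FRE/PRE can.

```c
/* Hypothetical reduction of the pattern; NOT the testcase from the thread. */
enum { N = 8 };

/* Fill a[] with 0..N-1, then read back the first element.
   The result is always 0; whether the compiler folds the load at compile
   time depends on which passes get to see the store/load pair. */
int first_after_init(void)
{
    int a[N];
    for (int i = 0; i < N; i++)
        a[i] = i;   /* may be turned into vector stores by the vectorizer */
    return a[0];    /* scalar load of part of what was just stored */
}
```

Compiling this with something like "gcc -O2 -ftree-vectorize
-fdump-tree-optimized" and checking whether the load of a[0] survives in the
dump should show the difference between targets, though the outcome depends
on the GCC version and the -march/-mcpu setting.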