On Thu, 11 May 2017, Uros Bizjak wrote:

> On Thu, May 11, 2017 at 4:02 PM, Richard Biener <rguent...@suse.de> wrote:
>
> >> > Uros added the testcase in 2008 -- I think if we want to have a testcase
> >> > for the original issue we need a different one.  Or simply remove
> >> > the testcase.
> >>
> >> No, there is something going on in the testcase:
> >>
> >> .L3:
> >>         movq    (%ecx,%eax,8), %mm1
> >>         paddq   (%ebx,%eax,8), %mm1
> >>         addl    $1, %eax
> >>         movq    %mm1, %mm0
> >>         cmpl    %eax, %edx
> >>         jne     .L3
> >>
> >>
> >> The compiler should allocate %mm0 to movq and paddq to avoid %mm1 ->
> >> %mm0 move. These are all movv1di patterns (they shouldn't interfere
> >> with movdi), and it is not clear to me why RA allocates %mm1 instead
> >> of %mm0.
> >
> > In any case the testcase is no longer testing what it tested as the
> > input to RA is now different.  The testcase doesn't make much sense:
>
> Following is the cleaned testcase:
>
> --cut here--
> /* { dg-do compile { target ia32 } } */
> /* { dg-options "-O2 -msse2 -mtune=core2" } */
> /* { dg-additional-options "-mno-vect8-ret-in-mem" { target *-*-vxworks* } } */
> /* { dg-additional-options "-mabi=sysv" { target x86_64-*-mingw* } } */
>
> #include <mmintrin.h>
>
> typedef __SIZE_TYPE__ size_t;
>
> __m64
> unsigned_add3 (const __m64 * a, const __m64 * b, size_t count)
> {
>   __m64 sum = { 0, 0 };
>
>   if (count > 0)
>     sum = _mm_add_si64 (a[count-1], b[count-1]);
>
>   return sum;
> }
>
> /* { dg-final { scan-assembler-times "movq\[ \\t\]+\[^\n\]*%mm" 1 } } */
> -- cut here--
>
> The testcase still tests that only one movq is generated (gcc-4.1
> generated three).  However, I have disabled the test on x86_64, since
> x86_64 returns mmx values in XMM registers, and MMX -> XMM moves
> always go through memory.

Thank you very much.

Richard.