http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60048
Bug ID: 60048
Summary: scan-assembler results depend on '--with-arch='
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: [email protected]
Building gcc with '--with-arch=bdver2' on x86_64 (--enable-multilib=no)
shows 150 additional errors with scan-assembler-*.
I've picked out a specific example to show the problem:
FAIL: gcc.target/i386/avx2-vpand-1.c scan-assembler vpand[
\\\\t[^\\n]*%ymm[0-9]
The test scans avx2-vpand-1.s, which the testsuite builds with -mavx2.
Most likely the math extensions enabled for arch bdver2 (FMA/AVX/XOP/...)
lead to output which doesn't match the expectations of scan-assembler.
The original command-line for this specific test:
/home/winfried/gcc-svn/winni/gcc/xgcc -B/home/winfried/gcc-svn/winni/gcc/
/home/winfried/gcc-svn/gcc/gcc/testsuite/gcc.target/i386/avx2-vpand-1.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -mavx2 -O2
-ffat-lto-objects -S -o avx2-vpand-1.s
Using '-march=x86-64 -mavx2' results in the following diff:
# diff -u avx2-vpand-1.s avx2-vpand-1-x86-64.s
--- avx2-vpand-1.s 2014-02-03 22:55:55.985731285 +0100
+++ avx2-vpand-1-x86-64.s 2014-02-03 22:51:51.126848721 +0100
@@ -3,17 +3,16 @@
.LCOLDB0:
.text
.LHOTB0:
- .p2align 4,,10
- .p2align 3
+ .p2align 4,,15
.globl avx2_test
.type avx2_test, @function
avx2_test:
.LFB2209:
.cfi_startproc
- vmovaps x(%rip), %ymm1
- vmovaps x(%rip), %ymm0
- vandps %ymm1, %ymm0, %ymm0
- vmovaps %ymm0, x(%rip)
+ vmovdqa x(%rip), %ymm1
+ vmovdqa x(%rip), %ymm0
+ vpand %ymm1, %ymm0, %ymm0
+ vmovdqa %ymm0, x(%rip)
vzeroupper
ret
.cfi_endproc
The output in the second file (avx2-vpand-1-x86-64.s), built with '-march=x86-64',
has the expected result for scan-assembler (vpand %ymm1, %ymm0, %ymm0).
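To make the mismatch concrete, here is a minimal Python sketch that applies a scan-assembler style pattern to the two instruction sequences from the diff above. The pattern is an assumption: the FAIL line quoted earlier is cut off, so this is only a guess at what the test checks (vpand followed by a ymm register operand). The helper name scan_assembler is hypothetical and only imitates what the DejaGnu directive does; it is not the testsuite's implementation.

```python
import re

# Assumed pattern: the FAIL line in this report is truncated, so this is
# a guess at the expression the test uses, not a verbatim copy.
PATTERN = re.compile(r"vpand[ \t]+[^\n]*%ymm[0-9]")

# Instructions taken from the '-' side of the diff (default arch bdver2).
ASM_BDVER2 = """\
vmovaps x(%rip), %ymm1
vmovaps x(%rip), %ymm0
vandps %ymm1, %ymm0, %ymm0
vmovaps %ymm0, x(%rip)
"""

# Instructions taken from the '+' side of the diff (-march=x86-64 -mavx2).
ASM_X86_64 = """\
vmovdqa x(%rip), %ymm1
vmovdqa x(%rip), %ymm0
vpand %ymm1, %ymm0, %ymm0
vmovdqa %ymm0, x(%rip)
"""


def scan_assembler(pattern, asm_text):
    """Hypothetical stand-in for scan-assembler: True if the pattern occurs."""
    return pattern.search(asm_text) is not None


if __name__ == "__main__":
    for name, asm in (("bdver2", ASM_BDVER2), ("x86-64", ASM_X86_64)):
        verdict = "PASS" if scan_assembler(PATTERN, asm) else "FAIL"
        print(f"{name}: {verdict}")
```

With the bdver2 default the operation is emitted as vandps, so a pattern that looks for the vpand mnemonic cannot match, even though the generated code is equivalent.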
The testsuite problem is present in gcc-4.8.x and current trunk (most likely in
gcc-4.7 too). I don't know if it's possible to fix this problem with the current
test environment, because '-march' has to be set to a base value based on the
architecture.
But if the test is for specific math extensions, then it sounds
like a good idea to suppress other math extensions which might lead to
unexpected assembler output.
best regards
winfried