http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53590
Bug #: 53590
Summary: new compiler generates both SISD and SIMD instructions for parallel operations of a "pure" function
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: ada
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bauh...@futureapps.de

The Ada compiler of 4.8.0 on Intel will either
- produce 2 DIVSD and also 1 DIVPD (redundancy), or
- produce 2 DIVSD (no measurable parallelism)
for the following function F, which divides two pairs of 64-bit floating-point values and returns one pair. The expectation is just 1 DIVPD.

Other versions of the Ada compiler, and the C compiler, and the C++ compiler of the same version produce 1 DIVPD instruction, as expected.

The line marked --! below switches between the two cases listed above.

package Autovect is
   pragma Pure (Autovect);

   type Fpt is new Long_Float;
   type Vec is array (0 .. 1) of Fpt;
   --pragma Machine_Attribute (Vec, "vector_type");  --!
   --pragma Machine_Attribute (Vec, "may_alias");

   function F (X0, X1, Y0, Y1 : Fpt) return Vec;

private
   pragma Assert (Fpt'Size = 64);
   pragma Assert (Vec'Alignment = 16);
end Autovect;

package body Autovect is
   function F (X0, X1, Y0, Y1 : Fpt) return Vec is
   begin
      return (X0 / Y0, X1 / Y1);
   end F;
end Autovect;

Result (source as shown, with the "vector_type" line commented out):

0000000000000010 <autovect__f>:
  10: 66 0f 28 e8             movapd   %xmm0,%xmm5
  14: 66 0f 28 f2             movapd   %xmm2,%xmm6
  18: 66 0f 14 e9             unpcklpd %xmm1,%xmm5
  1c: 66 0f 14 f3             unpcklpd %xmm3,%xmm6
  20: f2 0f 5e c2             divsd    %xmm2,%xmm0
  24: 66 0f 28 e5             movapd   %xmm5,%xmm4
  28: 66 0f 5e e6             divpd    %xmm6,%xmm4
  2c: f2 0f 5e cb             divsd    %xmm3,%xmm1
  30: 66 0f 29 64 24 e8       movapd   %xmm4,-0x18(%rsp)
  36: f2 0f 10 44 24 e8       movsd    -0x18(%rsp),%xmm0
  3c: f2 0f 10 4c 24 f0       movsd    -0x10(%rsp),%xmm1
  42: c3                      retq
  43: 66 66 66 66 2e 0f 1f    data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
  4a: 84 00 00 00 00 00

With "vector_type" machine attribute:

0000000000000010 <autovect__f>:
  10: f2 0f 5e c2             divsd    %xmm2,%xmm0
  14: f2 0f 5e cb             divsd    %xmm3,%xmm1
  18: 66 0f 14 c1             unpcklpd %xmm1,%xmm0
  1c: c3                      retq
  1d: 0f 1f 00                nopl     (%rax)

Build commands and output:

$ gnatchop -r -w autovect.ada && gnatmake -gnatwa -W -O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2 -gnatp -gnata -v autovect.adb
splitting autovect.ada into:
   autovect.ads
   autovect.adb

GNATMAKE 4.8.0 20120525 (experimental)
Copyright (C) 1995-2012, Free Software Foundation, Inc.
  "autovect.ali" being checked ...
  -> "autovect.ads" time stamp mismatch
gcc -c -gnatwa -W -O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2 -gnatp -gnata autovect.adb
End of compilation

System:

Linux newnews 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Intel E6750

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/bauhaus/mine/libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /home/bauhaus/src/gcc/trunk/configure --prefix=/home/bauhaus/mine --disable-nls --disable-libstdcxx-pch --enable-languages=c,ada,c++
Thread model: posix
gcc version 4.8.0 20120525 (experimental) (GCC)
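
For comparison with the C compiler result mentioned above, the following is a minimal sketch of what a C counterpart of F might look like. It is an assumption, not the code that was actually compiled for this report: the type name vec2d is hypothetical, and using GCC's vector_size attribute to obtain a 16-byte pair of doubles is only one possible way to mirror Vec.

```c
/* Hypothetical C counterpart of Autovect.F; not taken from the report. */

/* Two doubles packed into 16 bytes, declared with GCC's vector_size
   type attribute (a GNU extension, assumed here to stand in for Vec). */
typedef double vec2d __attribute__((vector_size(16)));

/* Divide two pairs of doubles and return the pair of quotients,
   written element by element like the Ada aggregate (X0 / Y0, X1 / Y1). */
vec2d f(double x0, double x1, double y0, double y1)
{
    vec2d r = { x0 / y0, x1 / y1 };
    return r;
}
```

Compiling such a file with the non-Ada options from the gnatmake line (-O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2) and disassembling the object with objdump -d would be one way to check how many division instructions are emitted. No particular output is claimed for this sketch.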