http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53590

             Bug #: 53590
           Summary: new compiler generates both SISD and SIMD instructions
                    for parallel operations of a "pure" function
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: ada
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: bauh...@futureapps.de


The Ada compiler of GCC 4.8.0 on Intel will either
- produce 2 DIVSD and also 1 DIVPD (redundancy), or
- produce 2 DIVSD (no measurable parallelism)
for the following function F, which divides two pairs
of 64-bit floating-point values and returns one pair.
The expected result is just 1 DIVPD.
Other versions of the Ada compiler, and the C and C++
compilers of the same version, produce
1 DIVPD instruction, as expected.
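
For comparison, a C counterpart of F could look roughly like the
sketch below. The report does not include the C source that was
actually compiled, so the struct layout and the names vec and f are
assumptions made for illustration only.

```c
/* Hypothetical C counterpart of Autovect.F (not taken from the report). */

/* Two 64-bit floating-point values, mirroring the Ada type Vec. */
typedef struct {
    double v[2];
} vec;

/* Divide two pairs element-wise and return the resulting pair. */
vec f(double x0, double x1, double y0, double y1)
{
    vec r = { { x0 / y0, x1 / y1 } };
    return r;
}
```

Compiling this with the same code generation options as the Ada unit
(-O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2) and
disassembling the object file should allow a side-by-side check of
which divide instructions are emitted.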

Uncommenting the line marked --! below switches from the first
case listed above to the second.

package Autovect is

   pragma Pure (Autovect);

   type Fpt is new Long_Float;

   type Vec is array (0 .. 1) of Fpt;
   --pragma Machine_Attribute (Vec, "vector_type");  --!
   --pragma Machine_Attribute (Vec, "may_alias");  

   function F (X0, X1, Y0, Y1 : Fpt) return Vec;

private
   pragma Assert (Fpt'Size = 64);
   pragma Assert (Vec'Alignment = 16);
end Autovect;

package body Autovect is

   function F (X0, X1, Y0, Y1 : Fpt) return Vec is
   begin
      return (X0 / Y0, X1 / Y1);
   end F;

end Autovect;

Result (without the "vector_type" machine attribute):

0000000000000010 <autovect__f>:
  10:    66 0f 28 e8              movapd %xmm0,%xmm5
  14:    66 0f 28 f2              movapd %xmm2,%xmm6
  18:    66 0f 14 e9              unpcklpd %xmm1,%xmm5
  1c:    66 0f 14 f3              unpcklpd %xmm3,%xmm6
  20:    f2 0f 5e c2              divsd  %xmm2,%xmm0
  24:    66 0f 28 e5              movapd %xmm5,%xmm4
  28:    66 0f 5e e6              divpd  %xmm6,%xmm4
  2c:    f2 0f 5e cb              divsd  %xmm3,%xmm1
  30:    66 0f 29 64 24 e8        movapd %xmm4,-0x18(%rsp)
  36:    f2 0f 10 44 24 e8        movsd  -0x18(%rsp),%xmm0
  3c:    f2 0f 10 4c 24 f0        movsd  -0x10(%rsp),%xmm1
  42:    c3                       retq   
  43:    66 66 66 66 2e 0f 1f     data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
  4a:    84 00 00 00 00 00 

With "vector_type" machine attribute:

0000000000000010 <autovect__f>:
  10:    f2 0f 5e c2              divsd  %xmm2,%xmm0
  14:    f2 0f 5e cb              divsd  %xmm3,%xmm1
  18:    66 0f 14 c1              unpcklpd %xmm1,%xmm0
  1c:    c3                       retq   
  1d:    0f 1f 00                 nopl   (%rax)
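
In C, an explicitly vector-typed version of the same computation can
be written with GCC's vector_size attribute, which is loosely
comparable to marking Vec with "vector_type" in Ada. That
correspondence is an assumption, not something stated in this report,
and the names v2df and f_vec are hypothetical.

```c
/* Sketch using the GCC vector extension; v2df and f_vec are
   illustrative names, not taken from the report. */

/* A 16-byte vector holding two doubles. */
typedef double v2df __attribute__((vector_size(16)));

/* Build both operand vectors, then divide element-wise. */
v2df f_vec(double x0, double x1, double y0, double y1)
{
    v2df x = { x0, x1 };
    v2df y = { y0, y1 };
    return x / y;
}
```

Here the division is expressed on whole vectors in the source, so the
result does not depend on the autovectorizer recognizing two scalar
divisions as one packed operation.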


$ gnatchop -r -w autovect.ada && gnatmake -gnatwa -W -O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2 -gnatp -gnata -v autovect.adb
splitting autovect.ada into:
   autovect.ads
   autovect.adb

GNATMAKE 4.8.0 20120525 (experimental)
Copyright (C) 1995-2012, Free Software Foundation, Inc.
  "autovect.ali" being checked ...
  -> "autovect.ads" time stamp mismatch
gcc -c -gnatwa -W -O3 -fno-inline -fomit-frame-pointer -msse3 -march=core2 -gnatp -gnata autovect.adb
End of compilation

Linux newnews 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Intel E6750

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/bauhaus/mine/libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /home/bauhaus/src/gcc/trunk/configure --prefix=/home/bauhaus/mine --disable-nls --disable-libstdcxx-pch --enable-languages=c,ada,c++
Thread model: posix
gcc version 4.8.0 20120525 (experimental) (GCC)
