[Bug target/64897] New: Floating-point "and" not optimized on x86-64

schnetter at gmail dot com Sun, 01 Feb 2015 11:47:35 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64897


            Bug ID: 64897
           Summary: Floating-point "and" not optimized on x86-64
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: schnetter at gmail dot com

I notice that gcc does not generate "vandpd" for floating-point "and"
operations. Here is example code that demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
  unsigned long ix;
  memcpy(&ix, &x, 8);
  ix &= 0x7fffffffffffffffUL;
  memcpy(&x, &ix, 8);
  return x;
}
double fand2(double x)
{
  return fabs(x);
}
}}}

When I compile this via:
{{{
gcc-mp-4.9 -O3 -march=native -S fand.c -o fand-gcc-4.9.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:
    movabsq    $9223372036854775807, %rax
    vmovd    %xmm0, %rdx
    andq    %rdx, %rax
    vmovd    %rax, %xmm0
    ret
_fand2:
    vmovsd    LC1(%rip), %xmm1
    vandpd    %xmm1, %xmm0, %xmm0
    ret
}}}

This shows that (a) gcc performs the bitwise "and" in an integer register,
which costs two moves between the SSE and integer register files and is
probably slower, and (b) the implementors of "fabs" assume that the "vandpd"
instruction is faster.
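
For comparison, here is a minimal sketch of how the same mask could be
requested explicitly through SSE2 intrinsics, so that the value never leaves
the SSE register file at the source level. The name "fand3" is hypothetical
and not part of the test case above. I would expect this to map to an
"andpd"/"vandpd" instruction, but the generated assembly for this variant was
not checked.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; SSE2 is baseline on x86-64 */
#include <stdint.h>

/* Hypothetical variant (not in the original test case): clear the sign
   bit of x with a packed-double "and" instead of an integer "and". */
double fand3(double x)
{
  /* All bits set except the sign bit, reinterpreted as packed doubles. */
  const __m128d mask =
    _mm_castsi128_pd(_mm_set1_epi64x(INT64_C(0x7fffffffffffffff)));
  __m128d v = _mm_set_sd(x);   /* x in the low lane, 0.0 in the high lane */
  v = _mm_and_pd(v, mask);     /* bitwise "and" on the SSE register */
  return _mm_cvtsd_f64(v);     /* extract the low lane */
}
```

This should compute the same result as fand1 and fand2. It only illustrates
the operation that fand1 could ideally be compiled to.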
