http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52034
Bug #: 52034
Summary: __builtin_copysign optimization suboptimal
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: [email protected]
ReportedBy: [email protected]
Even the most trivial use of __builtin_copysign is compiled to suboptimal code:
double f(double a, double b)
{
  return __builtin_copysign(a,b);
}
With gcc 4.6.2 on x86-64 this gets compiled to:
movapd %xmm1, %xmm2
andpd .LC0(%rip), %xmm0
andpd .LC1(%rip), %xmm2
orpd %xmm2, %xmm0
ret
There is no reason to copy %xmm1 to %xmm2. %xmm1 only holds the argument b, which is not needed after the and, so it can be overwritten in place. This is sufficient:
andpd .LC0(%rip), %xmm0
andpd .LC1(%rip), %xmm1
orpd %xmm1, %xmm0
ret
The same redundant copy shows up in more complicated code sequences.
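For reference, the and/and/or sequence above can be sketched in portable C. This is only an illustration of what the three instructions compute, not GCC's implementation. The report does not show the contents of .LC0 and .LC1; the sketch assumes .LC0 is the mask of all bits except the sign bit and .LC1 is the sign-bit mask. The name copysign_bits is a hypothetical helper introduced here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): copysign written out with
   integer masks, mirroring the andpd/andpd/orpd sequence. */
static double copysign_bits(double a, double b)
{
    /* Assumed constants: the report does not show .LC0/.LC1. */
    const uint64_t sign_mask = UINT64_C(0x8000000000000000); /* assumed .LC1 */
    const uint64_t mag_mask  = ~sign_mask;                    /* assumed .LC0 */
    uint64_t ua, ub, ur;
    double r;

    /* Reinterpret the doubles as 64-bit integers without aliasing issues. */
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    /* Magnitude of a combined with the sign of b. */
    ur = (ua & mag_mask) | (ub & sign_mask);

    memcpy(&r, &ur, sizeof r);
    return r;
}
```

Only two mask operations and one or are involved, and b is consumed by a single and, which is why the extra register copy in the generated code is unnecessary.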