See also: http://gcc.gnu.org/ml/fortran/2006-12/msg00000.html
Tim Prince's analysis of why ifort needs 8.77s and gfortran 12.16s for one
program:

  "My copy of gfortran makes separate scalar calls to sin and cos, where
  ifort makes a vector sincos call. 2 seconds to be gained there"

In the Fortran program below, sin and cos are both called with the same
argument. The dump produced by

  gfortran -msse3 -march=opteron -ffast-math -O3 -c -fdump-tree-optimized

shows

  pretmp.56 = __builtin_sinf (pretmp.80);
  pretmp.86 = __builtin_cosf (pretmp.80);

Using sincos should be faster.

Fortran program:

subroutine test(number_of_sample_points,radius,coefficient,radius_of_curvature,spin_frequency,time,tmp)
  implicit none
  integer :: number_of_sample_points
  real, parameter :: twopi = 6.28319
  integer :: n
  real :: radius(number_of_sample_points), coefficient, radius_of_curvature, &
          spin_frequency,time,tmp(2,number_of_sample_points)
  do n = 1, number_of_sample_points
     coefficient = (radius(n) / radius_of_curvature) * sin(twopi * &
                   spin_frequency * time)
     tmp(1,n) = coefficient
     coefficient = twopi * spin_frequency * (radius(n) / &
                   radius_of_curvature) * cos(twopi * &
                   spin_frequency * time)
     tmp(2,n) = coefficient
  end do
end subroutine


-- 
           Summary: Call to sin(x), cos(x) should be transformed to sincos(x)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30038