On 09/04/2009 05:04 PM, FX wrote:
> -- it's unarguably a glibc issue: if exp() is fast and expf() is
> slow, why doesn't glibc implement expf() by calling exp()? (yes, there
> can be other issues like rounding or so, but they can also be dealt
> with separately)

If I recall correctly, it is mostly an x86-64 problem. AMD has some math
patches for GLIBC which speed things up a lot. I think those are used in
openSUSE/SLES but not in Fedora. On the other hand, the AMD patches have
a problem with signaling NaN, which is being fixed [1,4].

Some older timings (from PR 34128) on openSUSE (!) -- for "sin", but exp
has the same problem:

                 g77      gfortran
-m32 real(4)   0.408s    0.421s
-m64 real(4)   1.040s    0.589s   ! sinf on x86-64: 40% faster!
-m32 real(8)   0.411s    0.408s
-m64 real(8)   0.976s    0.968s   ! sin on x86-64

As this is a math-library problem, one cannot do much from the
GCC/gfortran side. You could consider using the AMD Core Math Library
(ACML) [2], which implements fast versions of the trigonometric functions
and exp [3]. Those functions are not fully IEEE compliant, but that might
not be needed in your case [3,4]. (See the ACML manual [3] for the
details.) Intel's MKL should have something similar if you are on Intel
hardware and happen to have the library.

Switching to SUSE or applying the patches oneself is another possibility.
(I do not know why the patches are not included in the upstream version
of glibc. There must be some (somewhat) well-founded reason.)

> -- a similar bug was already reported a year and a half ago, and no
> activity was recorded on that front
> (http://sources.redhat.com/bugzilla/show_bug.cgi?id=5997);

Well, it is assigned to someone @suse and, as written, it is not an issue
on openSUSE. It might also be related to the AMD patches and the reason
why they are not included in GLIBC.

Tobias
(who uses openSUSE [11.1/Factory] at home, Fedora [version 6 (!)] at work)

[1] For sNaN, see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39314 and
https://bugzilla.novell.com/show_bug.cgi?id=487576

[2] ACML (free as in free beer): http://www.amd.com/acml

[3] http://developer.amd.com/cpu/Libraries/acml/onlinehelp/Documents/Simple.html ;
the functions are prefixed by "fast", but if you include the library
before the math library ("-lm"), the fast version is used instead of the
libm version. "-lm" is automatically appended (internally) at the end of
the command line when using "gfortran", thus simply adding "-lacml_mv"
(or was it -lacml ?) to the command line should be sufficient.

[4] Note, the GLIBC patches of AMD are supposed to be fully IEEE
compliant, while the fastexp etc. of ACML are not (esp. regarding
denormal numbers and signaling NaN).