Hi all,

I built the latest LAPACK package with the latest gfortran and profiled it to find the hotspots. The final binary spends a lot of time in libm-2.7.so. After a binary search, I narrowed it down to the following source code (I attached the .f file):

(1) Source code snippet, compiled with "gfortran -O3 zlange.f -S":

      DOUBLE PRECISION FUNCTION ZLANGE( NORM, M, N, A, LDA, WORK )
      DO 40 J = 1, N
         SUM = ZERO
         DO 30 I = 1, M
            SUM = SUM + ABS( A( I, J ) )
   30    CONTINUE
         VALUE = MAX( VALUE, SUM )
   40 CONTINUE

(2) Assembly code generated by gfortran:

    [EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ gfortran -v
    Using built-in specs.
    Target: x86_64-unknown-linux-gnu
    Configured with: ../trunk/configure --prefix=/home/tianwei/gcc/trunk-install --enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap
    Thread model: posix
    gcc version 4.4.0 20081106 (experimental) (GCC)

    .L10:
            movsd   (%rbx), %xmm0
            movsd   8(%rbx), %xmm1
            movsd   %xmm2, 16(%rsp)
            call    cabs

(3) The cabs definition in libm-2.7.so:

    [EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ readelf -s /lib/libm-2.7.so | grep cabs
        72: 0000000000030150    23 FUNC    WEAK   DEFAULT   12 cabsf@@GLIBC_2.2.5
        78: 0000000000038310     5 FUNC    WEAK   DEFAULT   12 cabsl@@GLIBC_2.2.5
       231: 0000000000025110     5 FUNC    WEAK   DEFAULT   12 cabs@@GLIBC_2.2.5

(4) The performance gap for this intrinsic is about 15% on my x86 Core 2 desktop compared to another compiler.

(5) I searched the web and found that gfortran should support this intrinsic. Can anyone give me some suggestions for this problem?

Thanks very much.

Tianwei

--
Sheng, Tianwei
Inst. of High Performance Computing
Dept. of Computer Sci. & Tech.
Tsinghua Univ.