Hi all,
  I built the latest LAPACK package with the latest gfortran and
profiled it to find the hotspots. The final binary spends a lot of
time in libm-2.7.so. After bisecting, I narrowed it down to the
following source code (I attached the .f file):
(1) Source code snippet, compiled with: gfortran -O3 zlange.f -S
DOUBLE PRECISION FUNCTION ZLANGE( NORM, M, N, A, LDA, WORK )
         DO 40 J = 1, N
            SUM = ZERO
            DO 30 I = 1, M
               SUM = SUM + ABS( A( I, J ) )
   30       CONTINUE
            VALUE = MAX( VALUE, SUM )
   40    CONTINUE

(2) Compiler version, and the assembly code generated by gfortran:
[EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ gfortran -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../trunk/configure
--prefix=/home/tianwei/gcc/trunk-install
--enable-languages=c,c++,fortran --disable-multilib
--disable-bootstrap
Thread model: posix
gcc version 4.4.0 20081106 (experimental) (GCC)
.L10:
        movsd   (%rbx), %xmm0
        movsd   8(%rbx), %xmm1
        movsd   %xmm2, 16(%rsp)
        call    cabs

(3) The cabs definition in libm-2.7.so:
[EMAIL PROTECTED]:~/math/lapack-gfortran/SRC$ readelf -s /lib/libm-2.7.so | grep cabs
    72: 0000000000030150    23 FUNC    WEAK   DEFAULT   12 cabsf@@GLIBC_2.2.5
    78: 0000000000038310     5 FUNC    WEAK   DEFAULT   12 cabsl@@GLIBC_2.2.5
   231: 0000000000025110     5 FUNC    WEAK   DEFAULT   12 cabs@@GLIBC_2.2.5

(4) The performance gap caused by this intrinsic is about 15% on my
x86 Core 2 desktop compared with another compiler.

(5) I searched the web and found that gfortran should support this
intrinsic. Can anyone give me some suggestions for this problem?


Thanks very much.

Tianwei


--
Sheng, Tianwei
Inst. of High Performance Computing
Dept. of Computer Sci. & Tech.
Tsinghua Univ.
