------- Comment #3 from jb at gcc dot gnu dot org 2007-04-29 20:19 -------
Hmm, try e.g. the following. With gfortran, the fixed-size arrays can use
builtin_memset, whereas the allocatable arrays are zeroed with a loop. Save it
as .F90 (capital F) to force preprocessing, which is needed to set the SZ macro.
! Test performance of different memset implementations
! compile with: gfortran -DSZ=36 memset.F90
! to test with array size 36 etc.
program testmemset
  implicit none
  ! Fixed-size arrays (size known at compile time)
  real :: r(SZ)
  complex :: c(SZ)
  integer :: i(SZ)
  ! Allocatable arrays (size known only at run time)
  real, allocatable :: ra(:)
  complex, allocatable :: ca(:)
  integer, allocatable :: ia(:)
  real :: tr, tra, tc, tca, ti, tia, te
  integer :: n, ni, k
  n = SZ
  ! Keep the total number of elements written roughly constant
  ni = 1000000000/SZ
  write (*,*) 'Array size = ', n, ', Doing ', ni, 'iterations.'
  allocate (ra(n), ca(n), ia(n))
  ! real, fixed size
  call cpu_time (tr)
  do k = 1, ni
    r = 0.0
  end do
  call cpu_time (te)
  tr = te - tr
  ! real, allocatable
  call cpu_time (tra)
  do k = 1, ni
    ra = 0.0
  end do
  call cpu_time (te)
  tra = te - tra
  ! complex, fixed size
  call cpu_time (tc)
  do k = 1, ni
    c = (0.0, 0.0)
  end do
  call cpu_time (te)
  tc = te - tc
  ! complex, allocatable
  call cpu_time (tca)
  do k = 1, ni
    ca = (0.0, 0.0)
  end do
  call cpu_time (te)
  tca = te - tca
  ! integer, fixed size
  call cpu_time (ti)
  do k = 1, ni
    i = 0
  end do
  call cpu_time (te)
  ti = te - ti
  ! integer, allocatable
  call cpu_time (tia)
  do k = 1, ni
    ia = 0
  end do
  call cpu_time (te)
  tia = te - tia
  write (*,*) 'real complex int'
  write (*,*) '======================'
  write (*,*) 'With memset:'
  write (*,'(3(F15.8))') tr, tc, ti
  write (*,*) 'With loop:'
  write (*,'(3(F15.8))') tra, tca, tia
end program testmemset
On my Athlon 64, with the options "-O3 -funroll-loops -march=athlon64 -mfpmath=sse
-ftree-vectorize -ffast-math -fdefault-real-8":
Array size = 36 , Doing 27777777 iterations.
real complex int
======================
With memset:
1.65610300 1.53209600 0.61603800
With loop:
0.78804900 1.78411100 0.52403300
Without -fdefault-real-8, i.e. single precision:
Array size = 36 , Doing 27777777 iterations.
real complex int
======================
With memset:
1.12006998 0.92005789 0.61603904
With loop:
0.49203002 1.43208909 0.50803185
(Yeah, there should be more iterations to get better timings.)
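One way to get steadier timings, along the lines of the remark above, might be
to repeat each measurement a few times and keep the smallest value. The
following is only a sketch under assumptions: the program name, nrep, niter and
the hard-coded array size of 36 are made up for illustration and are not part of
the test above, and only the real case is shown. As with the original test, an
optimizing compiler may still simplify the repeated stores, so the numbers
should be read with care.

```fortran
! Sketch only: best-of-several timing for the two zeroing forms.
! besttime, nrep and niter are illustrative names, not from the test above.
program besttime
  implicit none
  integer, parameter :: sz = 36           ! array size, as with -DSZ=36
  integer, parameter :: niter = 10000000  ! zeroing operations per repetition
  integer, parameter :: nrep = 5          ! repetitions; the minimum is kept
  real :: r(sz)
  real, allocatable :: ra(:)
  real :: t0, t1, best_fixed, best_alloc
  integer :: k, rep

  allocate (ra(sz))
  best_fixed = huge(best_fixed)
  best_alloc = huge(best_alloc)

  do rep = 1, nrep
    ! Fixed-size array
    call cpu_time (t0)
    do k = 1, niter
      r = 0.0
    end do
    call cpu_time (t1)
    best_fixed = min(best_fixed, t1 - t0)

    ! Allocatable array
    call cpu_time (t0)
    do k = 1, niter
      ra = 0.0
    end do
    call cpu_time (t1)
    best_alloc = min(best_alloc, t1 - t0)
  end do

  write (*,'(A,F15.8)') 'fixed-size,  best of runs: ', best_fixed
  write (*,'(A,F15.8)') 'allocatable, best of runs: ', best_alloc
  ! Reference the arrays so the stores are not obviously dead.
  write (*,*) sum(r), sum(ra)
end program besttime
```

Taking the minimum rather than the mean is a design choice here: it tends to
discard repetitions disturbed by other activity on the machine.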
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31750