[Bug fortran/42118] Slow forall

ebay.20.tedlap at spamgourmet dot com Tue, 08 Oct 2013 06:12:35 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42118


Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebay.20.tedlap@spamgourmet.
                   |                            |com

--- Comment #6 from Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> ---
There is also the problem of the order of indices in a forall. I guess this is
closely related to the comparison of do and forall. Consider the following
test program:

program test_forall
  implicit none
  integer, parameter:: n = 1000
  integer i, j, k
  double precision S(n, n, n)
  forall (i = 1: n, j = 1: n, k = 1: n) S(i, j, k) = i * j * k
  print *, "ijk, sum(s) = ", sum(s)
end program test_forall

According to the Fortran standard, the order of the indices in the forall
header is of no consequence. So, in the above program, we should be able to
write equivalently:

  forall (k = 1: n, j = 1: n, i = 1: n) S(i, j, k) = i * j * k
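
For comparison, here is a minimal sketch of the same assignment written with
explicit do loops. It is not part of the original test: the program name
test_do, the reduced size n = 200 and the allocatable array are assumptions
made to keep memory use modest. With do loops the programmer has to choose
the nesting order, and since Fortran stores arrays in column-major order the
usual choice is to make the first index the innermost loop:

```fortran
program test_do
  implicit none
  ! Assumption: n reduced from 1000 to keep the array at about 64 MB
  integer, parameter:: n = 200
  integer i, j, k
  double precision, allocatable:: S(:, :, :)

  allocate(S(n, n, n))
  ! Column-major storage: the first index varies fastest in memory,
  ! so i is the innermost loop and k the outermost.
  do k = 1, n
     do j = 1, n
        do i = 1, n
           S(i, j, k) = i * j * k
        end do
     end do
  end do
  print *, "do kji, sum(s) = ", sum(S)
end program test_do
```

A forall header expresses no such nesting, which is why both forall versions
above are expected to behave the same.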

There is no way for the writer of the program to predict which of the two
versions will be faster. It is interesting to note that, with gfortran, the
forall with kji is much slower, while the reverse is true with the NAG compiler
(version 5.3). I think the two versions should have the same run time. I have
actually tested the two versions of the program with four compilers:

-- gfortran 4.4.6 with -O3

 kji, sum(s) =   1.253753751250046E+017

real    1m32.511s
user    1m22.342s
sys    0m8.368s

 ijk, sum(s) =   1.253753751250046E+017

real    0m12.962s
user    0m7.416s
sys    0m5.427s


-- nagfor 5.3 with -O4

 kji, sum(s) =    1.2537537512500458E+17

real    0m13.396s
user    0m6.833s
sys    0m6.054s

 ijk, sum(s) =    1.2537537512500458E+17

real    2m37.943s
user    2m27.723s
sys    0m7.873s


-- pgf95 11.10 with -fast 

 kji, sum(s) =    1.2537537512499998E+017

real    0m12.119s
user    0m6.051s
sys    0m5.910s

 ijk, sum(s) =    1.2537537512499998E+017

real    0m11.979s
user    0m5.854s
sys    0m5.939s

-- ifort 12.1 with -O3

 kji, sum(s) =   1.253753751250000E+017

real    0m5.210s
user    0m3.028s
sys    0m2.150s

 ijk, sum(s) =   1.253753751250000E+017

real    0m5.114s
user    0m2.981s
sys    0m2.115s

So we see that PG Fortran and Intel Fortran behave well: the two versions take
about the same time. Also, Intel Fortran is much faster than the other
compilers on this test (about 5 s of real time against 12 s or more).
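
The timings above were taken by running each version as a separate program.
As a rough sketch of how both orderings could be timed from inside a single
program, one could use the standard system_clock intrinsic. The program name
time_forall, the variable names and the reduced size n = 200 are assumptions,
not part of the original test, and the numbers would not be directly
comparable to those reported above:

```fortran
program time_forall
  implicit none
  ! Assumption: n reduced from 1000 to keep the array at about 64 MB
  integer, parameter:: n = 200
  integer i, j, k
  integer t0, t1, rate
  double precision, allocatable:: S(:, :, :)

  allocate(S(n, n, n))

  ! Version with the header written in the order i, j, k
  call system_clock(t0, rate)
  forall (i = 1: n, j = 1: n, k = 1: n) S(i, j, k) = i * j * k
  call system_clock(t1)
  print *, "ijk, sum(s) = ", sum(S)
  ! rate is the number of clock counts per second
  print *, "ijk, elapsed (s) = ", dble(t1 - t0) / rate

  ! Version with the header written in the order k, j, i
  call system_clock(t0)
  forall (k = 1: n, j = 1: n, i = 1: n) S(i, j, k) = i * j * k
  call system_clock(t1)
  print *, "kji, sum(s) = ", sum(S)
  print *, "kji, elapsed (s) = ", dble(t1 - t0) / rate
end program time_forall
```

Printing sum(S) after each forall keeps the result in use, so the assignment
is less likely to be optimized away.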

I would also like to comment on the use of the forall. Tobias Burnus says that
improving the forall in gfortran is not worth the effort. I think the forall is
useful. It is an elegant way to write some assignments. There is no idea of
time sequence in a forall, and the forall can only contain assignments while,
as you know, the do construct can contain calls to subroutines, input-output,
recursive computations, anything. So when one reads a program and sees a
forall, it is much quicker to understand what is going on than when one reads
a do loop. Also, the fact that the assignments are independent (comment of
Harald Anlauf) should make it easier for the compiler to produce fast code.
