http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42118
Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebay.20.tedlap@spamgourmet.
                   |                            |com

--- Comment #6 from Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> ---
There is also the problem of the order of indices in a forall. I guess this is
in close relation to the comparison of do and forall. Consider the following
test program:

program test_forall

  implicit none

  integer, parameter:: n = 1000
  integer i, j, k
  double precision S(n, n, n)

  forall (i = 1: n, j = 1: n, k = 1: n) S(i, j, k) = i * j * k
  print *, "ijk, sum(s) = ", sum(s)

end program test_forall

According to the Fortran standard, the order of indices in the forall header
is of no consequence. So, in the above program, we should be able to write
equivalently:

  forall (k = 1: n, j = 1: n, i = 1: n) S(i, j, k) = i * j * k

There is no way for the writer of the program to predict which of the two
versions should be faster. It is interesting to note that, with gfortran, the
forall with kji is much slower, while the inverse is true with the NAG
compiler (version 5.3). I think the two versions should have the same run
time.
I have actually tested the two versions of the program with four compilers:

-- gfortran 4.4.6 with -O3

 kji, sum(s) =   1.253753751250046E+017
real    1m32.511s
user    1m22.342s
sys     0m8.368s

 ijk, sum(s) =   1.253753751250046E+017
real    0m12.962s
user    0m7.416s
sys     0m5.427s

-- nagfor 5.3 with -O4

 kji, sum(s) =   1.2537537512500458E+17
real    0m13.396s
user    0m6.833s
sys     0m6.054s

 ijk, sum(s) =   1.2537537512500458E+17
real    2m37.943s
user    2m27.723s
sys     0m7.873s

-- pgf95 11.10 with -fast

 kji, sum(s) =   1.2537537512499998E+017
real    0m12.119s
user    0m6.051s
sys     0m5.910s

 ijk, sum(s) =   1.2537537512499998E+017
real    0m11.979s
user    0m5.854s
sys     0m5.939s

-- ifort 12.1 with -O3

 kji, sum(s) =   1.253753751250000E+017
real    0m5.210s
user    0m3.028s
sys     0m2.150s

 ijk, sum(s) =   1.253753751250000E+017
real    0m5.114s
user    0m2.981s
sys     0m2.115s

So we see that PG Fortran and Intel Fortran behave well: the two versions take
about the same time. Also, Intel Fortran is much faster than the other
compilers on this test (about 5 s of real time, against about 12 s for the
best case of each of the others).

I would also like to comment on the use of the forall. Tobias Burnus says that
improving the forall in gfortran is not worth the effort. I think the forall
is useful. It is an elegant way to write some assignments. There is no idea of
time sequence in a forall, and the forall can only contain an assignment
while, as you know, the do construct could contain calls to subroutines,
input-output, recursive computations, anything. So when one reads a program
and sees a forall, it is much quicker to understand what is going on than when
one reads a do loop. Also, the fact that the assignments are independent
(comment of Harald Anlauf) should make it easier for the compiler to produce
fast code.
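
To illustrate the point about index order, here is a sketch of the explicit do
nest that I would expect either forall header to be equivalent to. It is only
an illustration, not what any of the compilers above actually generates. The
program name test_loop_order, the array names s_forall and s_do, and the
reduced size n = 100 are my own choices for a quick run (the test above used
n = 1000). Since Fortran stores arrays in column-major order, the nest puts
the first index i in the innermost loop so that memory is traversed
contiguously.

```fortran
program test_loop_order

  implicit none

  ! Assumption: n is reduced from 1000 to 100 so the check runs quickly
  integer, parameter:: n = 100
  integer i, j, k
  double precision, allocatable:: s_forall(:, :, :), s_do(:, :, :)

  allocate(s_forall(n, n, n), s_do(n, n, n))

  ! forall with the "kji" header order from the test above
  forall (k = 1: n, j = 1: n, i = 1: n) s_forall(i, j, k) = i * j * k

  ! Explicit nest: k outermost, i innermost, so that consecutive
  ! iterations touch consecutive memory locations (column-major storage)
  do k = 1, n
     do j = 1, n
        do i = 1, n
           s_do(i, j, k) = i * j * k
        end do
     end do
  end do

  ! Both forms must define exactly the same array
  if (any(s_forall /= s_do)) error stop "forall and do results differ"
  print *, "sum(s) = ", sum(s_do)

end program test_loop_order
```

Because the assignments in the forall are independent, a compiler is free to
choose this traversal order whatever the order of indices in the header.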