------- Comment #2 from burnus at gcc dot gnu dot org  2009-11-20 14:20 -------
(In reply to comment #0)
> I think that ‘forall’ statement must be at least as fast as equivalent
> ‘do-…-end do’ construction.

The Fortran standardization committee thought likewise. In practice, however,
it turned out that it is sometimes not trivial for the compiler to determine
whether the RHS (right-hand side) depends on the LHS, so it may use a temporary
array even when none is needed, and temporary arrays are slow (and memory
hungry).

Thus, a DO loop should always be at least as fast as a FORALL (assignment)
statement (unless one does something really stupid in the DO loop).
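
To illustrate the dependence issue described above, here is a minimal sketch
(in Python rather than Fortran, and not what gfortran actually emits; the
function names are made up for this example). It mimics
FORALL (i = 2:n) a(i) = src(i-1) + 1, where every RHS value must be computed
before any element is assigned, and compares it with the plain DO loop.

```python
def forall_with_temporary(a, src):
    """FORALL-like semantics: evaluate every RHS first, then assign."""
    # Hypothetical temporary array holding all RHS values.
    tmp = [src[i - 1] + 1 for i in range(1, len(a))]
    for i in range(1, len(a)):
        a[i] = tmp[i - 1]


def do_loop(a, src):
    """DO-loop semantics: evaluate and assign one element at a time."""
    for i in range(1, len(a)):
        a[i] = src[i - 1] + 1


# Case 1: the RHS reads the array being written (a real dependence).
x = [10, 20, 30, 40]
y = [10, 20, 30, 40]
forall_with_temporary(x, x)
do_loop(y, y)
print(x)  # [10, 11, 21, 31]  old values are used, so the temporary matters
print(y)  # [10, 11, 12, 13]  each step sees the element just written

# Case 2: the RHS reads a different array (no dependence).
b = [1, 2, 3, 4]
p = [0, 0, 0, 0]
q = [0, 0, 0, 0]
forall_with_temporary(p, b)
do_loop(q, b)
print(p == q)  # True: the temporary was pure overhead here
```

In the second case the temporary buys nothing, but a compiler can only drop it
if it can prove that the RHS never reads what the LHS writes.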

[At least that is what I gathered from the comments on comp.lang.fortran, and
it matches my knowledge of how it is done in gfortran.]

Having said that, gfortran should still try to make your program as fast with
FORALL as it is with the DO loop.

> But the next program (variant of LU-decomposition) shows that fragment
> containing ‘forall’ statement is approximately 2.5(!) times slower than
> fragment with ‘do-end do’.

You could use -fdump-tree-original to check how the two versions are handled;
my guess is that the FORALL version uses a temporary array.
(-fdump-tree-original creates a <file.f90>.004* file, which contains a dump of
the internal representation of your code and looks similar to C.)

It seems Richard has already looked at the dump and confirmed my suspicion.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42118
