Toon Moene wrote:
REAL, ALLOCATABLE :: A(:,:), B(:,:), C(:,:), D(:,:), E(:,:), F(:,:)
! ... READ IN EXTENT OF ARRAYS ...
READ*,N
! ... ALLOCATE ARRAYS
ALLOCATE(A(N,N),B(N,N),C(N,N),D(N,N),E(N,N),F(N,N))
! ... READ IN ARRAYS
READ*,A,B
C = A + B
D = A * C
E = B * EXP(D)
F = C * LOG(E)
where the four assignments all have the structure of loops like:
DO I = 1, N
   DO J = 1, N
      X(J,I) = OP(A(J,I), B(J,I))
   ENDDO
ENDDO
Obviously, this could benefit from loop fusion, by combining the four
assignments in one loop.
That holds provided that suitable portions of the fused loop can still be
vectorized, or that N is known to be large enough that the gain in cache
locality outweighs the loss of vectorization.
This raises the question of how far vector math functions have progressed
(EXP and LOG would need vector versions here), as well as the question of
the relative alignment of the arrays (or of ignoring it, in view of recent
CPU designs).
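
For concreteness, here is a rough sketch of what the fused form of the four assignments might look like when written by hand. It is only an illustration, not what any particular compiler generates: the fixed N, the made-up input values, the extra array F_REF and the final comparison are all assumptions added so that the example is self-contained.

```fortran
PROGRAM FUSION_SKETCH
  IMPLICIT NONE
  ! Assumption: N is fixed here; the original reads it at run time.
  INTEGER, PARAMETER :: N = 4
  REAL, ALLOCATABLE :: A(:,:), B(:,:), C(:,:), D(:,:), E(:,:), F(:,:)
  REAL, ALLOCATABLE :: F_REF(:,:)   ! hypothetical: holds the unfused result
  INTEGER :: I, J

  ALLOCATE(A(N,N), B(N,N), C(N,N), D(N,N), E(N,N), F(N,N), F_REF(N,N))

  ! Made-up positive inputs, so that the argument of LOG stays positive.
  DO I = 1, N
     DO J = 1, N
        A(J,I) = 0.1 * REAL(J)
        B(J,I) = 0.2 * REAL(I)
     ENDDO
  ENDDO

  ! Unfused reference: the four array assignments, four passes over memory.
  C = A + B
  D = A * C
  E = B * EXP(D)
  F_REF = C * LOG(E)

  ! Fused by hand: one pass, each element's intermediates are reused
  ! immediately instead of being reloaded in a later loop.
  DO I = 1, N
     DO J = 1, N
        C(J,I) = A(J,I) + B(J,I)
        D(J,I) = A(J,I) * C(J,I)
        E(J,I) = B(J,I) * EXP(D(J,I))
        F(J,I) = C(J,I) * LOG(E(J,I))
     ENDDO
  ENDDO

  PRINT *, 'MAX ABS DIFFERENCE, FUSED VS. UNFUSED:', MAXVAL(ABS(F - F_REF))
END PROGRAM FUSION_SKETCH
```

Note that the inner loop body now contains calls to EXP and LOG, so whether it still vectorizes depends on vector versions of those functions being available, which is exactly the trade-off at issue.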