------- Comment #4 from spop at gcc dot gnu dot org 2010-01-15 01:20 -------

The problem here is that the loop invariant motion moves rt(i,j) into a temporary outside the innermost loop:

      real*8 rt(6,6),r(6,6),rtt(6,6)
      do i=1,6
        do j=1,6
          t = rt(i,j)
          do ia=1,6
            rtt(i,ia)=t*r(j,ia)+rtt(i,ia)
          end do
        end do
      end do

and then the cleanup before graphite translates this temporary into an array:

      do i=1,6
        do j=1,6
          cross_bb[0] = rt(i,j)
          do ia=1,6
            rtt(i,ia)=cross_bb[0]*r(j,ia)+rtt(i,ia)
          end do
        end do
      end do

Then the loop interchange would ask for loop distribution when it considers the loops 'j' and 'ia', and from the original LST we get:

original_lst (
(root
  0 (loop
    0 (loop
      0 stmt_4
      1 (loop
        0 stmt_5)))))

transformed_lst (
(root
  0 (loop 1
    0 (loop 2
      0 stmt_4)
    1 (loop 3
      0 (loop 4
        0 stmt_5)))))

that is then validated as "legal" by graphite_legal_transform.

The problem seems to be in build_lexicographically_gt_constraint, which does not add the information "first instance of stmt_5 is executed after the last instance of stmt_4 in loop 2".

We would then have a write into cross_bb[0] for all the iterations of loop 2:

cross_bb[0] = rt(i,0)
cross_bb[0] = rt(i,1)
cross_bb[0] = rt(i,2)
cross_bb[0] = rt(i,3)
cross_bb[0] = rt(i,4)
cross_bb[0] = rt(i,5)

and only then would we read the value of cross_bb[0] in stmt_5:

 = cross_bb[0] * ...
 = cross_bb[0] * ...
 = cross_bb[0] * ...
 = cross_bb[0] * ...
 = cross_bb[0] * ...
 = cross_bb[0] * ...

In the original program we would have had these writes and reads interleaved like this:

cross_bb[0] = rt(i,0)
 = cross_bb[0] * ...
cross_bb[0] = rt(i,1)
 = cross_bb[0] * ...
cross_bb[0] = rt(i,2)
 = cross_bb[0] * ...
cross_bb[0] = rt(i,3)
 = cross_bb[0] * ...
cross_bb[0] = rt(i,4)
 = cross_bb[0] * ...
cross_bb[0] = rt(i,5)
 = cross_bb[0] * ...

Konrad, could you have a look at build_lexicographically_gt_constraint?

Thanks,
Sebastian

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42637
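
For illustration only, here is a small Python sketch of the two schedules described above. It is not GCC code: the function names `original` and `distributed`, the 0-based indexing, and the array contents are assumptions chosen so the difference is easy to see (the report gives no data). `r` is taken to be the identity matrix, so the original schedule should simply reproduce `rt`, while the distributed schedule only ever sees the last value written into `cross_bb[0]`.

```python
N = 6  # loop bounds from the report, 0-based here


def original(rt, r):
    """Original schedule: each write to cross_bb[0] is followed by its reads."""
    rtt = [[0.0] * N for _ in range(N)]
    cross_bb = [0.0]
    for i in range(N):
        for j in range(N):
            cross_bb[0] = rt[i][j]                    # stmt_4
            for ia in range(N):
                rtt[i][ia] += cross_bb[0] * r[j][ia]  # stmt_5
    return rtt


def distributed(rt, r):
    """Distributed schedule: all writes (loop 2) run before any read (loops 3, 4)."""
    rtt = [[0.0] * N for _ in range(N)]
    cross_bb = [0.0]
    for i in range(N):
        for j in range(N):                            # loop 2
            cross_bb[0] = rt[i][j]                    # stmt_4
        for j in range(N):                            # loop 3
            for ia in range(N):                       # loop 4
                rtt[i][ia] += cross_bb[0] * r[j][ia]  # stmt_5
    return rtt


# Hypothetical test data: distinct values in rt, identity matrix in r.
rt = [[float(i * N + j + 1) for j in range(N)] for i in range(N)]
r = [[float(i == j) for j in range(N)] for i in range(N)]

same = original(rt, r) == distributed(rt, r)
print("schedules agree:", same)
```

With this data the two functions disagree: every entry of row i in the distributed result equals rt[i][5], which is the stale value left in `cross_bb[0]` after loop 2 finishes. That is the flow dependence on `cross_bb[0]` that the legality check would need to reject.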