465.tonto in one of its hot loops does essentially what the following reduced testcase does:

subroutine make_esss(esss,Ix,Iyz,e_x,ii_ivec)
  real(kind=kind(1.0d0)), dimension(:), intent(inout) :: esss
  real(kind=kind(1.0d0)), dimension(:,:), pointer :: Ix,Iyz
  integer(kind=kind(1)), dimension(:), pointer :: e_x,ii_ivec
  esss(:) = esss(:) + sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1)
end subroutine

This is scalarized by the frontend to

  atmp4 = e_x
  atmp6 = ii_ivec
  atmp8 = Ix(:,atmp4) * Iyz(:,atmp6)
  atmp11 = sum (atmp8, 1)
  esss = esss + atmp11

where the sum is not inline-expanded.

1) the temporaries for e_x and ii_ivec are not necessary
2) the sum can easily be inline-expanded as the shape of atmp8 is
   well-defined
3) we can avoid atmp8 by expanding sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1)
   together like (for each e)

     atmp11(e) = 0
     do z=1,size(Ix,1)
       atmp11(e) = atmp11(e) + Ix(z,e_x(e)) * Iyz(z,ii_ivec(e))
     end do

   or even avoid atmp11 altogether and expand to

     do e=1,size(esss,1)
       tem = 0
       do z=1,size(Ix,1)
         tem = tem + Ix(z,e_x(e)) * Iyz(z,ii_ivec(e))
       end do
       esss(e) = esss(e) + tem
     end do

   given that esss does not have the target attribute and thus cannot
   be aliased by e_x or ii_ivec.

--
           Summary: Scalarization of reductions
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829
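
For anyone who wants to compare the two forms side by side, the following is a minimal sketch that places the reduced testcase next to a hand-written version of the fully expanded loop nest. The module name esss_sketch, the parameter dp and the routine name make_esss_loops are made up for illustration and are not part of 465.tonto or of the frontend. The loop version assumes all arrays have lower bound 1 and that size(esss) equals size(e_x) and size(ii_ivec), as in the testcase.

```fortran
module esss_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! hypothetical shorthand
contains

  ! Reference: the array expression from the reduced testcase.
  subroutine make_esss(esss, Ix, Iyz, e_x, ii_ivec)
    real(kind=dp), dimension(:), intent(inout) :: esss
    real(kind=dp), dimension(:,:), pointer :: Ix, Iyz
    integer(kind=kind(1)), dimension(:), pointer :: e_x, ii_ivec
    esss(:) = esss(:) + sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1)
  end subroutine make_esss

  ! Hand-expanded form: no array temporaries, only a scalar accumulator.
  ! Assumes lower bound 1 on every array (an assumption of this sketch).
  subroutine make_esss_loops(esss, Ix, Iyz, e_x, ii_ivec)
    real(kind=dp), dimension(:), intent(inout) :: esss
    real(kind=dp), dimension(:,:), pointer :: Ix, Iyz
    integer(kind=kind(1)), dimension(:), pointer :: e_x, ii_ivec
    real(kind=dp) :: tem
    integer :: e, z
    do e = 1, size(esss, 1)
      tem = 0.0_dp
      do z = 1, size(Ix, 1)            ! reduction over dimension 1
        tem = tem + Ix(z, e_x(e)) * Iyz(z, ii_ivec(e))
      end do
      esss(e) = esss(e) + tem
    end do
  end subroutine make_esss_loops

end module esss_sketch
```

Calling both routines with the same Ix, Iyz, e_x and ii_ivec and identical initial esss should give the same result up to floating-point rounding, since the inline loop may accumulate in a different order than the library sum.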