465.tonto in one of its hot loops does essentially what the following reduced
testcase does:

subroutine make_esss(esss,Ix,Iyz,e_x,ii_ivec)
  real(kind=kind(1.0d0)), dimension(:), intent(inout) :: esss
  real(kind=kind(1.0d0)), dimension(:,:), pointer :: Ix,Iyz
  integer(kind=kind(1)), dimension(:), pointer  :: e_x,ii_ivec

  esss(:) = esss(:) + sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1)

end subroutine

This is scalarized by the frontend to

  atmp4 = e_x
  atmp6 = ii_ivec
  atmp8 = Ix(:,atmp4) * Iyz(:,atmp6)
  atmp11 = sum (atmp8, 1)
  esss = esss + atmp11

where the sum is not inline-expanded.

1) the temporaries for e_x and ii_ivec are not necessary
2) the sum can easily be inline-expanded as the shape of atmp8 is well-defined
3) we can avoid atmp8 by expanding sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1) together
   like

                do e=1,size(esss,1)
                  atmp11(e) = 0
                  do z=1,size(Ix,1)
                    atmp11(e) = atmp11(e) + Ix(z,e_x(e)) * Iyz(z,ii_ivec(e))
                  end do
                end do

   or even avoid atmp11 altogether and expand to

              do e=1,size(esss,1)
                tem = 0
                do z=1,size(Ix,1)
                  tem = tem + Ix(z,e_x(e)) * Iyz(z,ii_ivec(e))
                end do
                esss(e) = esss(e) + tem
              end do

   given that esss does not have the target attribute and thus cannot
   be aliased by e_x or ii_ivec.
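
For anyone who wants to convince themselves that the fully expanded loop nest
computes the same thing as the array expression, here is a minimal
self-checking sketch. The program name, the extents (nz, ne), the index
values and the tolerance are made up for illustration and are not taken from
465.tonto; only the two forms being compared come from the testcase above.

```fortran
! Sketch only: compares the array expression from the reduced testcase
! with the hand-expanded loop nest. Extents and data are hypothetical.
program check_esss
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nz = 4, ne = 3          ! made-up extents
  real(dp), dimension(:,:), pointer :: Ix, Iyz
  integer, dimension(:), pointer :: e_x, ii_ivec
  real(dp) :: ref(ne), esss(ne), tem
  integer :: e, z

  allocate(Ix(nz,5), Iyz(nz,5), e_x(ne), ii_ivec(ne))
  call random_number(Ix)
  call random_number(Iyz)
  e_x     = (/ 2, 5, 1 /)                       ! arbitrary column indices
  ii_ivec = (/ 4, 4, 3 /)

  ref  = 1.0_dp
  esss = 1.0_dp

  ! Original form: vector subscripts plus SUM along dimension 1.
  ref(:) = ref(:) + sum(Ix(:,e_x) * Iyz(:,ii_ivec), 1)

  ! Expanded form: no array temporaries, one scalar accumulator.
  do e = 1, size(esss,1)
    tem = 0
    do z = 1, size(Ix,1)
      tem = tem + Ix(z,e_x(e)) * Iyz(z,ii_ivec(e))
    end do
    esss(e) = esss(e) + tem
  end do

  ! Allow for a different summation order inside SUM.
  if (maxval(abs(esss - ref)) > 1.0e-12_dp) stop 1
  print *, 'expanded loop matches array expression'

  deallocate(Ix, Iyz, e_x, ii_ivec)
end program check_esss
```

The comparison uses a small tolerance rather than exact equality because the
compiler is free to reassociate the additions inside SUM.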


-- 
           Summary: Scalarization of reductions
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43829
