On 20.11.19 at 23:18, Janne Blomqvist wrote:
On Wed, Nov 20, 2019 at 11:35 PM Thomas König <t...@tkoenig.net> wrote:

On 20.11.19 at 21:45, Janne Blomqvist wrote:
BTW, since this is done for the purpose of optimization, have you
tested it on a suitable benchmark suite such as Polyhedron, to see
whether it a) generates any different code and b) makes it go faster?

I haven't run any actual benchmarks.

However, there is a simple example which shows its advantages.
Consider

        subroutine foo(n,m)
        m = 0
        do 100 i=1,100
          call bar
          m = m + n
   100  continue
        end

(I used old-style DO loops just because :-)

Without the optimization, the inner loop is translated to

.L2:
          xorl    %eax, %eax
          call    bar_
          movl    (%r12), %eax
          addl    %eax, 0(%rbp)
          subl    $1, %ebx
          jne     .L2

and with the optimization to

.L2:
          xorl    %eax, %eax
          call    bar_
          addl    %r12d, 0(%rbp)
          subl    $1, %ebx
          jne     .L2

so the load of n through its address is gone from the loop.  (Why do
we zero %eax before each call? It should not be a variadic call, right?)

Not sure. Maybe some belt and suspenders thing? I guess someone better
versed in ABI minutiae knows better. It's not Fortran-specific though,
the C frontend does the same when calling a void function.

OK, so considering your other e-mail, this is a separate issue that
we can fix another time.

AFAIK on reasonably current OoO CPUs xor'ing a register with itself
is handled by the renamer and doesn't consume an execute slot, so it's
in effect a zero-cycle instruction. Still bloats the code slightly,
though.

Of course, Fortran language rules specify that the call to bar
cannot do anything to n

Hmm, does it? What about the following modification to your testcase:

module nmod
   integer :: n
end module nmod

subroutine foo(n,m)
   m = 0
   do 100 i=1,100
      call bar
      m = m + n
100  continue
end subroutine foo

subroutine bar()
   use nmod
   n = 0
end subroutine bar

program main
   use nmod
   implicit none
   integer :: m
   n = 1
   m = 0
   call foo(n, m)
   print *, m
end program main

That is not allowed:

# 15.5.2.13  Restrictions on entities associated with dummy arguments

[...]

# (3) Action that affects the value of the entity or any subobject of it
# shall be taken only through the dummy argument unless

[none of the exceptions apply].


So, a copy in / copy out for variables where we cannot be sure that
no value is assigned?  Does anybody see a downside to that?

In principle sounds good, unless my concerns above are real and affect
this case too.
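
For illustration, here is a hand-written, source-level sketch of what
copy in / copy out would amount to for the example above. It is not what
the patch emits. The module name copy_sketch, the counter calls, and the
locals n_val and m_val are made up for this sketch, and bar is only a
stand-in for an opaque external procedure.

```fortran
module copy_sketch
   implicit none
   integer :: calls = 0          ! hypothetical state touched by bar
contains
   ! Stand-in for an opaque procedure; it never touches n or m.
   subroutine bar()
      calls = calls + 1
   end subroutine bar

   subroutine foo(n, m)
      integer :: n, m
      integer :: n_val, m_val, i
      n_val = n                  ! copy in: read the dummy once
      m_val = 0
      do i = 1, 100
         call bar
         m_val = m_val + n_val   ! loop works on locals only
      end do
      m = m_val                  ! copy out: store the result once
   end subroutine foo
end module copy_sketch

program main
   use copy_sketch
   implicit none
   integer :: n, m
   n = 3
   call foo(n, m)
   ! 100 iterations, each adding n = 3
   if (m /= 300) error stop "unexpected value of m"
   print *, m
end program main
```

The rewrite is only valid because, under the rule quoted above, a
conforming bar cannot change n or m behind foo's back; with the nmod
testcase the two versions could give different results.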

So, how to proceed?  Commit the patch with the maximum length for a
mangled symbol, and then maybe try for the copy-out variant in a
follow-up patch?

I agree with Tobias that dealing with this in the middle end is probably
the right thing to do in the long run (especially since we could also
handle arrays and structs this way). Until we get around to doing this
(GCC 11 at the earliest), we could still profit somewhat from this
optimization.

Regards

        Thomas

