https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111844

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We do not optimize the code at all at the GIMPLE level; we expand from

  <bb 2> [local count: 1073741824]:
  memcpy (&p, buf_5(D), 88);
  _1 = p.x;
  inc.0_2 = (unsigned int) inc_7(D);
  _3 = _1 + inc.0_2;
  p.x = _3;
  memcpy (buf_5(D), &p, 88);
  p ={v} {CLOBBER(eol)};
  return;

where, when expanding memcpy inline during RTL expansion, we seem to be able
to clean up afterwards.

It seems to me this is a task for SRA (again...), which should be more
forgiving of select stmts requiring address-taking of locals, but only
when they are not rewritten, and which should also analyze memcpy, memset
(and other select builtins) as to their effect.

SRA handles the following by means of totally scalarizing 'p':

void foo(P* buf, int inc) {
    P p;
    p = *buf;
    p.x += inc;
    *buf = p;
}

and you get

_Z3fooP1Pi:
.LFB16:
        .cfi_startproc
        addl    %esi, (%rdi)
        ret

with or without the call to bar ().  One could argue that more aggressive
"inline expansion" of memcpy (to char[] = char[] in this case) is called
for, but I think this might confuse SRA, and I'm not sure we would apply
the same costing as to whether to inline-expand the "memcpy" as we do
at RTL expansion time.
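
For side-by-side comparison, here is a minimal sketch of the two forms
discussed above.  The layout of P is an assumption (only the 88-byte size
and the unsigned first member 'x' can be read off the GIMPLE dump), the
names foo_memcpy and foo_assign are made up for illustration, and the call
to bar () is left out because it is not shown in this comment.

```cpp
#include <cstring>

// Hypothetical stand-in for the reporter's type: 88 bytes with an
// unsigned first member, as suggested by the GIMPLE dump.
struct P {
    unsigned x;
    unsigned pad[21];
};
static_assert(sizeof(P) == 88, "assumed to match the 88-byte memcpy");

// memcpy form: takes the address of the local 'p', which is what
// currently keeps SRA from scalarizing it.
void foo_memcpy(P* buf, int inc) {
    P p;
    std::memcpy(&p, buf, sizeof(P));
    p.x += inc;
    std::memcpy(buf, &p, sizeof(P));
}

// Aggregate-copy form: the variant that SRA totally scalarizes.
void foo_assign(P* buf, int inc) {
    P p;
    p = *buf;
    p.x += inc;
    *buf = p;
}
```

Both functions should have the same observable effect on *buf; compiling
them with something like g++ -O2 -S lets you compare the code generated
for each, though the result will depend on the GCC version.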
