https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96966

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #4)
> Even
> extern char a[32];
> 
> void f (const void *s)
> {
>   char *p = (char*)__builtin_memcpy (a, s, 16);
>   __builtin_memcpy (p, s, 16);
> }
> 
> 
> void g (const void *s)
> {
>   __builtin_memcpy (a, s, 16);
>   __builtin_memcpy (a, s, 16);
> }
> used to be optimized just in 8.1/8.2 and not in earlier or later GCC
> versions.
> Perhaps delaying the lowering of memcpy a tiny bit and trying to optimize it
> when it is still not lowered?

Lowering early is quite important since it enables better initial
into-SSA rewriting and early inline costing.  Note that even w/o lowering,
FRE would not optimize this.  Even the strlen pass doesn't:

extern char a[32];
void __GIMPLE (ssa,startwith("fre1")) g (const void *s)
 {
__BB(2):
   __builtin_memcpy (&a[0], s_1(D), _Literal (__SIZE_TYPE__) 16);
   __builtin_memcpy (&a[0], s_1(D), _Literal (__SIZE_TYPE__) 16);
   return;
 }

Here the duplicate memcpy survives until .fab if you do
-O2 -fno-tree-forwprop -fno-tree-vrp.
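
For reference, a minimal sketch of why the second call is redundant at the
source level: memcpy with overlapping source and destination is undefined,
so s cannot overlap the first 16 bytes of a, the first copy leaves the source
unchanged, and the second copy stores the same 16 bytes again.  The names
g_twice, g_once and check_equivalence are hypothetical and only for
illustration; the check says nothing about which pass should do the removal.

```c
#include <assert.h>
#include <string.h>

char a[32];

/* Same shape as g() from comment #4: two identical copies. */
void g_twice (const void *s)
{
  __builtin_memcpy (a, s, 16);
  __builtin_memcpy (a, s, 16);
}

/* Hypothetical result after removing the redundant second call. */
void g_once (const void *s)
{
  __builtin_memcpy (a, s, 16);
}

/* Run both variants on the same input and compare the contents of a.
   Returns 1 when the two variants leave a in the same state.  */
int check_equivalence (void)
{
  char src[16], r1[32], r2[32];

  for (int i = 0; i < 16; i++)
    src[i] = (char) (i * 7 + 1);

  memset (a, 0x55, sizeof a);
  g_twice (src);
  memcpy (r1, a, sizeof a);

  memset (a, 0x55, sizeof a);
  g_once (src);
  memcpy (r2, a, sizeof a);

  assert (memcmp (r1, r2, sizeof a) == 0);
  return 1;
}
```

Calling check_equivalence () from main is enough to exercise it; comparing
the two functions in the optimized dumps shows whether the duplicate call
was actually removed.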
