> Hi!
> 
> The following testcase ICEs, because we determine only in late pure const
> pass that bar is const (the content of the function loses a store to a
> global var during dse3 and read from it during cddce2) and local-pure-const2
> makes it const.  The cgraph ordering is that post IPA (in late IPA simd
> clones are created) bar is processed first, then foo as its caller, then
> foo.simdclone* and finally bar.simdclone*.  Conceptually I think that is the
> right ordering which allows for static simd clones to be removed.
> 
> The reason for the ICE is that because bar was marked const, the call to
> it lost vops before vectorization, and when we in foo.simdclone* try to
> vectorize the call to bar, we replace it with bar.simdclone* which hasn't
> been marked const and so needs vops, which we don't add.
> 
> Now, because the simd clones are created from the same IL, just in a loop
> with different argument/return value passing, I think generally if the base
> function is determined to be const or pure, the simd clones should be too,
> unless e.g. the vectorization causes different optimization decisions, but
> then still the global memory reads if any shouldn't affect what the function
> does and global memory stores shouldn't be reachable at runtime.

My understanding of simd clones is bit limited, but I think you are
right that they should have the same semantics as their caller.

I think const may be one that makes compiler to ICE, but
there are many other places where function body is analyzed and all its
aliases/thunks and other variants should be updated too.  For exmaple
set_pure_flag, nothrow, noreturn and analysis done by modref,
ipa-refernece etc.

I wonder if we want to update them all and hide that in some
abstraction?  Next stage 1 I can work on inventing iterators for those
kind of things as current approach combinindg direct walkters and
function wrappers has become bit hard to maintain in cases like this.

Honza
> 
> So, the following patch changes set_{const,pure}_flag to mark also simd
> clones.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2023-02-07  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/106433
>       * cgraph.cc (set_const_flag_1): Recurse on simd clones too.
>       (cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.
> 
>       * gcc.c-torture/compile/pr106433.c: New test.
> 
> --- gcc/cgraph.cc.jj  2023-02-02 10:54:44.327473492 +0100
> +++ gcc/cgraph.cc     2023-02-06 12:28:22.040593063 +0100
> @@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, boo
>        if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE)
>       set_const_flag_1 (alias, set_const, looping, changed);
>      }
> +  for (struct cgraph_node *n = node->simd_clones; n != NULL;
> +       n = n->simdclone->next_clone)
> +    set_const_flag_1 (n, set_const, looping, changed);
>    for (cgraph_edge *e = node->callers; e; e = e->next_caller)
>      if (e->caller->thunk
>       && (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE))
> @@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, b
>  {
>    struct set_pure_flag_info info = {pure, looping, false};
>    call_for_symbol_thunks_and_aliases (set_pure_flag_1, &info, !pure, true);
> +  for (struct cgraph_node *n = simd_clones; n != NULL;
> +       n = n->simdclone->next_clone)
> +    set_pure_flag_1 (n, &info);
>    return info.changed;
>  }
>  
> --- gcc/testsuite/gcc.c-torture/compile/pr106433.c.jj 2023-02-06 
> 12:37:26.963748811 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr106433.c    2023-02-06 
> 12:37:06.631041918 +0100
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/106433 */
> +
> +int m, *p;
> +
> +__attribute__ ((simd)) int
> +bar (int x)
> +{
> +  if (x)
> +    {
> +      if (m < 1)
> +        for (m = 0; m < 1; ++m)
> +          ++x;
> +      p = &x;
> +      for (;;)
> +        ++m;
> +    }
> +  return 0;
> +}
> +
> +__attribute__ ((simd)) int
> +foo (int x)
> +{
> +  return bar (x);
> +}
> 
>       Jakub
> 

Reply via email to