[Bug ipa/81877] [7/8 Regression] Incorrect results with lto and -fipa-cp and -fipa-cp-clone

rguenther at suse dot de Fri, 18 Aug 2017 06:27:37 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81877


--- Comment #12 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 18 Aug 2017, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81877
> 
> --- Comment #11 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #10)
> > Now - for refs that have an invariant address in such loop the interleaving
> > effectively means that they are independent even in the same iteration. 
> 
> Not if there's no "other iteration", i.e. runtime iteration count is 1.

Not sure about this; I interpret 'ivdep' as asserting that there is no
dependence for any number of iterations.

> [...]
> > for example.  I suppose it's not their intent.  So maybe there's an
> > additional restriction on the interleaving?  Preserve iteration order of
> > individual stmts?  That would prevent autopar in face of just ivdep for
> > example.
> > 
> > Note that any "handle must-defs 'correctly'" writing is inherently fishy.
> 
> I think you're saying that pragma-ivdep and do-concurrent are too hand-wavy
> about how the compiler may or must privatize variables, whether it must detect
> and handle reductions/inductions etc. But note that LIM is keying on 'simdlen',
> and simdlen is also set by OpenMP-SIMD which is more rigorous in that regard,
> i.e. privatization is explicit in GIMPLE. And there I believe LIM does not have
> the license to disregard may-alias relations *unless* it verifies that loop
> iterates at least twice and repeated writes are UB. On this example:

Yeah, the middle-end uses safelen which is also used for simdlen.  It has
to adhere to the most conservative definition.

> void g(int p, int *out)
> {
>   int x, y = 0, *r = p ? &x : &y;
>   unsigned n = 0;
>   asm("" : "+r" (n));
> #pragma omp simd
>   for (int i = 0; i <= n; i++)
>     {
> //#pragma omp ordered simd
>       x = 42;
>       out[i] = *r;
>     }
> }
> 
> I believe LSM is wrong for n=0, and for any n if the pragma-ordered is
> uncommented.
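
(To spell out the n=0 case above: a hand-written sketch, not actual GCC
output. The names g_ref and g_moved are hypothetical, n is passed as a
parameter instead of going through the asm, and x gets a sentinel value of -1
so that the sketch has defined behavior. g_moved models the store to x being
sunk below the loop while the load through r stays where it is.)

```c
/* Reference: the loop as written, executed in plain source order. */
void g_ref(int p, int *out, unsigned n)
{
  int x, y = 0, *r = p ? &x : &y;
  for (unsigned i = 0; i <= n; i++)
    {
      x = 42;
      out[i] = *r;
    }
}

/* Sketch of store motion that treats "x = 42" and "*r" as independent.
   Assumption: x starts at -1 here only to keep the example well defined;
   in the original testcase x is uninitialized.  */
void g_moved(int p, int *out, unsigned n)
{
  int x = -1, y = 0, *r = p ? &x : &y;
  for (unsigned i = 0; i <= n; i++)
    out[i] = *r;   /* when p != 0 this reads x before 42 is stored */
  x = 42;          /* store sunk below the loop */
}
```

With p != 0 and n == 0 there is exactly one iteration, so there is no other
iteration to be independent of, yet g_moved stores the stale value of x to
out[0] where g_ref stores 42.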

I see.  I wonder if we handle pragma-ordered correctly in vectorization
for, say,

#pragma omp simd
  for (int i = 0; i <= n; i++)
    {
#pragma omp ordered simd
      out[i+2] = 0;
      out[i+1] = 1;
      out[i] = 2;
    }

I believe we vectorize this with SLP and unrolling with VF 12 as

   out[i..i+3] = {2, 1, 0, 2};
   out[i+4..i+7] = {1, 0, 2, 1};
   out[i+8..i+11] = {0, 2, 1, 0};

I guess "at the same time" fulfils 'ordered' but does splitting
like above do?  That moves out[i+3] store before out[i+5].
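
(Why the order matters for this loop at all: the three stores overlap across
iterations, so the final contents of out depend on which iteration writes
last. A minimal sketch with hypothetical helper names; the reversed loop is
only a stand-in for "some other order" and is not a claim about what the
vectorizer emits.)

```c
/* Iterations 0..n in source order, as 'ordered' would require. */
void stores_forward(int *out, int n)
{
  for (int i = 0; i <= n; i++)
    {
      out[i + 2] = 0;
      out[i + 1] = 1;
      out[i] = 2;
    }
}

/* The same stores with iterations visited from n down to 0. */
void stores_reverse(int *out, int n)
{
  for (int i = n; i >= 0; i--)
    {
      out[i + 2] = 0;
      out[i + 1] = 1;
      out[i] = 2;
    }
}
```

For n = 3 and a six-element array, source order leaves {2, 2, 2, 2, 1, 0}
while the reversed order leaves {2, 1, 0, 0, 0, 0}, so any reordering of the
stores across iterations has to be checked against the source-order result.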

The safest thing would be to remove safelen handling from invariant
motion.

More rigorously defining the semantics of loop->safelen (the
middle-end term) is necessary nevertheless.  I believe omp ordered
doesn't have any middle-end representation?

