[Bug c++/85057] GCC fails to vectorize code unless dummy loop is added

mokreutzer at gmail dot com Tue, 27 Mar 2018 05:37:39 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85057


--- Comment #7 from Moritz Kreutzer <mokreutzer at gmail dot com> ---
(In reply to Richard Biener from comment #6)
> I didn't try to see why but I guess "bad luck" ;)  It probably makes
> the first access a pointer one as well.

Okay, in that case I'd rather call it "good luck" :)

> OK, so looking closer we have after early optimization:
> 
>   <bb 6> [99.00%]:
>   _2 = &a[i_5];
>   _13 = MEM[(const Type_t &)_2].v[0];
>   _14 = _13 * 5.0e-1;
>   MEM[(double &)_2] = _14;
> 
> but then later forwprop is "lucky" to propagate _2 into just one of the
> dereferences.  Note that propagating into both wouldn't help because
> the accesses do not have a similar structure -- one accesses a[i].v[0]
> while the other accesses a[i] as if it were a 'double'.  That seems to be
> 
>   _2 = &a[i_7];
>   D.39137 = 5.0e-1;
>   D.39483 = operator*<double, 1, double> (&D.39137, _2); [return slot
> optimization]
> 
> vs.
> 
>   _3 = &a[i_7];
>   Vector<1,
> double>::operator=<Pete::Expression<Pete::BinaryNode<Pete::OpMultiply,
> Pete::Scalar<double>, Pete::Reference<Vector<1, double> > > > > (_3,
> &D.39483);
> 
> so somehow LHS vs. RHS evaluation goes a different path.  Not sure if that's
> avoidable (it's been some time since I worked with PETE).

I'll try to have a look into PETE to see whether this can be avoided.
Otherwise, I'll just keep the dummy loop: it helps GCC to vectorize the code,
and it should otherwise simply be ignored by any compiler. So I guess it should
at least do no harm.
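
For illustration, the structural mismatch described in the quoted analysis
(one access shaped like a[i].v[0], the other treating a[i] as a plain
'double') can be sketched roughly as below. This is a hypothetical reduction:
Vec1, elem, scale_uniform and scale_mixed are made-up names, not PETE code,
and the snippet is not claimed to reproduce the missed vectorization. It only
shows two source-level shapes that compute the same result while reaching the
stored element by different paths.

```cpp
#include <cstddef>

// Hypothetical stand-in for Vector<1, double>; not the real PETE-based type.
struct Vec1 {
    double v[1];
    // Accessor returning a plain reference to an element (assumed for the sketch).
    double& elem(std::size_t k) { return v[k]; }
};

// Load and store both go through a[i].v[0], so both accesses have the
// same structure.
void scale_uniform(Vec1* a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        a[i].v[0] = a[i].v[0] * 0.5;
}

// The load goes through p->v[0], while the store goes through a bare
// double& obtained from an accessor on the same object.
void scale_mixed(Vec1* a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        Vec1* p = &a[i];
        double& dst = p->elem(0);
        dst = p->v[0] * 0.5;
    }
}
```

Whether GCC actually vectorizes either loop can be checked with
-O3 -fopt-info-vec-missed, which reports the loops it gave up on.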

> Yeah, that one looks like the same issue.  Whether it's easy or not easy
> to fix remains to be seen - it's mostly a matter of priority...

Okay, I'll stay in the loop. Thanks for your prompt reply and for your help!
