[Bug rtl-optimization/69710] performance issue with SP Linpack with Autovectorization

amker at gcc dot gnu.org Sun, 14 Feb 2016 11:54:07 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710


--- Comment #11 from amker at gcc dot gnu.org ---
(In reply to Doug Gilmore from comment #10)
> Created attachment 37681 [details]
> prototype fix
> 
> > 1) we failed to recognize that uses 0 and 2 are identical to each other.
> > This is because vectorizer generates redundant setup code in loop
> > pre-header.  There are two possible fixes here.  One is to make
> > expand_simple_operations more aggressive in expanding (used by
> > ivopts) in tree-ssa-loop-niter.c.  But I don't think this is a good
> > idea in all cases, because expanding complicated expressions makes the
> > ivo transform and niter analysis harder.
> Or something along the lines of the attached patch, tested only on
> the problem at hand.  As it stands it is probably too heavy
> handed to consider as a possible review candidate.
Yes, I proposed some cleanup passes after vectorization, but richi thinks it's
generally expensive.  So what's the implementation complexity of pass_dominator?


> > The other is to fix vectorizer
> > to generate clean code.  Richard's suggestion is to use gimple_build
> > for that.
> ISTM to be the reasonable approach but I haven't yet investigated
> what's involved.
> > Also the problem exists only for arm because it doesn't support
> > [base+index] addressing mode for vect load/store.  I guess mips
> > doesn't either.
> > 
> Right, MIPS MSA doesn't support [base+index] mode.
> 
> BTW, the reason why IVOPTS works for DP but not SP on MIPS MSA is
> that the code in the pre-header is simpler for DP:
> 
>   <bb 6>:
>   vect_cst__52 = {da_6(D), da_6(D)};
> 
>   <bb 7>:
>   # vectp_dy.8_46 = PHI <dy_9(D)(6), vectp_dy.8_47(12)>
>   # vectp_dx.11_49 = PHI <dx_13(D)(6), vectp_dx.11_50(12)>
>   # vectp_dy.16_55 = PHI <dy_9(D)(6), vectp_dy.16_56(12)>
>   # ivtmp_58 = PHI <0(6), ivtmp_59(12)>
> ...
> which IVOPTS can handle.
Ah, so IMHO the code should be refined before IVO; we shouldn't put too much
pressure on the ivo transform for things that are not directly ivo related.
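
For illustration only, here is a minimal C sketch of the kind of loop under
discussion.  It assumes the kernel is the usual Linpack saxpy (the names da,
dx and dy come from the dump above; the function names and signatures are
hypothetical).  The first form indexes both arrays with i, which maps
naturally onto [base+index] addressing.  The second is a hand-written pointer
form in which the load and the store of dy share one pointer, roughly what one
would hope to get on a target without [base+index] once the two dy uses are
recognized as identical.  It is not actual compiler output.

```c
#include <stddef.h>

/* Assumed shape of the SP Linpack kernel: dy[i] += da * dx[i].
   dy is both loaded and stored, so it yields two memory uses
   with the same base and step.  */
void saxpy_indexed(size_t n, float da, const float *dx, float *dy)
{
    for (size_t i = 0; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}

/* Hand-written pointer form (hypothetical): the load and the store
   of dy go through the single pointer py, and no index register is
   needed for addressing.  */
void saxpy_pointer(size_t n, float da, const float *dx, float *dy)
{
    const float *px = dx;
    float *py = dy;
    float *end = dy + n;

    for (; py != end; py++, px++)
        *py = *py + da * *px;
}
```

Both functions compute the same result; they differ only in how the addresses
are formed.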
