On Thu, Jun 29, 2017 at 12:32 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > Richard Biener <richard.guent...@gmail.com> writes: >> On Wed, Jun 28, 2017 at 3:36 PM, Richard Sandiford >> <richard.sandif...@linaro.org> wrote: >>> dr_analyze_innermost had a "struct loop *nest" parameter that acted >>> like a boolean. This was added in r179161, with the idea that a >>> null nest selected BB-level analysis rather than loop analysis. >>> >>> The handling seemed strange though. If the DR was part of a loop, >>> we still tried to express the base and offset values as IVs, potentially >>> giving a nonzero step. If that failed for any reason, we'd revert to >>> using the original base and offset, just as we would if we hadn't asked >>> for an IV in the first place. >>> >>> It seems more natural to use the !in_loop handling whenever nest is null >>> and always set the step to zero. This actually enables one more SLP >>> opportunity in bb-slp-pr65935.c. >>> >>> I checked out r179161 and tried the patch there. The test case added >>> in that revision still passes, so I don't think there was any particular >>> need to check simple_iv. >>> >>> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? >> >> I have a few additional comments for consideration. I remembered code >> in vect_compute_data_ref_alignment explicitely looking at DR_STEP in >> BB mode: >> >> /* Similarly we can only use base and misalignment information relative to >> an innermost loop if the misalignment stays the same throughout the >> execution of the loop. As above, this is the case if the stride of >> the dataref evenly divides by the vector size. */ >> else >> { >> tree step = DR_STEP (dr); >> unsigned vf = loop ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) : 1; >> >> if (tree_fits_shwi_p (step) >> && ((tree_to_shwi (step) * vf) >> % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0)) >> { >> if (dump_enabled_p ()) >> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >> "step doesn't divide the vector-size.\n"); >> misalign = NULL_TREE; >> } >> } >> >> so I guess with the change we may end up with worse (misalign is >> always NULL?) or bogus (with DR_STEP == 0 the test is always false) >> alignment analysis results for BB vectorization? I guess worse >> or we had the issue before if DR_STEP was just not analyzable. > > DR_STEP will always be zero for bb vectorisation, so the results > shouldn't get worse. But the value that was previously a nonzero > step is now part of the base or offset instead (the choice between > the two being the same as it would be for get_inner_reference). > We still take the alignments of those into account, so I think > we should be safe (at least after DR_BASE_ALIGNMENT).
I think for DR_ALIGNED_TO (dr) = size_int (highest_pow2_factor (offset_iv.base)); we have a harder time extracting alignment from sth like (sizetype)i * 4 + 8 than if the base is zero. But yes, in principle we can do that and hopefully SCEV analysis is not much more powerful than highest_pow2_factor. > Like you say, we previously had the same situation for bases > that weren't simple IVs, or for bases that were simple IVs but > had an invariant rather than constant step. Yes. >> The testcase that now gets vectorized tried to use the asm to prevent >> vectorization but that of course doesn't work for BB vectorization. >> Using volatile stores might. I guess the chance that we miscompile >> both loops in a way that we still pass the testcase (even the compare >> loop could be BB vectorized I guess) is unlikely. > > Putting an extra volatile asm between the real and imag stores stops > the vectorisation, if you'd prefer that. No, your patch is fine. >> The patch is ok but I guess we need to keep an eye on BB vectorization >> results for targets w/o unaligned vector loads/stores for the above >> alignment issue? Still keep an eye on SCEV vs. highest_pow2_factor from testsuite fallout. Patch still ok, Richard. > Thanks, > Richard