On Tue, May 22, 2018 at 2:11 PM Richard Sandiford < richard.sandif...@linaro.org> wrote:
> Richard Biener <richard.guent...@gmail.com> writes: > > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote: > > > >> Hi, > >> As reported in PR85804, bump step is wrongly computed for vector(1) load > > of > >> single-element group access. This patch fixes the issue by correcting > > bump > >> step computation for the specific VMAT_CONTIGUOUS case. > > > >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK? > > > > To me it looks like the classification as VMAT_CONTIGUOUS is bogus. > > We'd fall into the grouped_load case otherwise which should handle > > the situation correctly? > > > > Richard? > Yeah, I agree. I mentioned to Bin privately that that was probably > a misstep and that we should instead continue to treat them as > VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector > from the array of loaded vectors, instead of doing an actual permute. I'd classify them as VMAT_ELEMENTWISE instead. CONTIGUOUS should be only for no-gap vectors. How do we classify single-element interleaving? That would be another classification choice. > (Note that VMAT_CONTIGUOUS is OK for stores, since we don't allow > gaps there. But it might be easiest to handle both loads and stores > in the same way.) > Although it still seems weird to "vectorise" stuff to one element. > Why not leave the original scalar code in place, and put the onus on > whatever wants to produce or consume a V1 to do the appropriate > conversion? Yeah, V1 is somewhat awkward to deal with but I think the way we're doing it right now is OK. Note if we leave the original scalar code in place we still have to deal with larger unrolling factors thus we'd have to duplicate the scalar code and have different IVs anyways. Richard. > Thanks, > Richard