On Wed, May 23, 2018 at 1:10 PM Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Wed, May 23, 2018 at 12:01 PM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: > > "Bin.Cheng" <amker.ch...@gmail.com> writes: > >> On Wed, May 23, 2018 at 11:19 AM, Richard Biener > >> <richard.guent...@gmail.com> wrote: > >>> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford < > >>> richard.sandif...@linaro.org> wrote: > >>> > >>>> Richard Biener <richard.guent...@gmail.com> writes: > >>>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote: > >>>> > > >>>> >> Hi, > >>>> >> As reported in PR85804, bump step is wrongly computed for vector(1) > >>> load > >>>> > of > >>>> >> single-element group access. This patch fixes the issue by correcting > >>>> > bump > >>>> >> step computation for the specific VMAT_CONTIGUOUS case. > >>>> > > >>>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK? > >>>> > > >>>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus. > >>>> > We'd fall into the grouped_load case otherwise which should handle > >>>> > the situation correctly? > >>>> > > >>>> > Richard? > >>> > >>>> Yeah, I agree. I mentioned to Bin privately that that was probably > >>>> a misstep and that we should instead continue to treat them as > >>>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector > >>>> from the array of loaded vectors, instead of doing an actual permute. > >>> > >>> I'd classify them as VMAT_ELEMENTWISE instead. CONTIGUOUS > >>> should be only for no-gap vectors. How do we classify single-element > >>> interleaving? That would be another classification choice. > >> Yes, I suggested this offline too, but Richard may have more to say about this. > >> One thing worth noting is classifying it as VMAT_ELEMENTWISE would > >> disable vectorization in this case because of cost model issue as > >> commented at the end of get_load_store_type. > > > > Yeah, that's the problem. Using VMAT_ELEMENTWISE also means that we > > use a scalar load and then insert it into a vector, whereas all we want > > (and all we currently generate) is a single vector load. > > > > So if we classify them as VMAT_ELEMENTWISE, they'll be a special case > > for both costing (vector load rather than scalar load and vector > > construct) and code-generation. > Looks to me it will be a special case for VMAT_ELEMENTWISE or > VMAT_CONTIGUOUS* anyway, probably VMAT_CONTIGUOUS is the easiest one? The question is why we do if (memory_access_type == VMAT_GATHER_SCATTER || (!slp && memory_access_type == VMAT_CONTIGUOUS)) grouped_load = false; I think for the particular testcase removing this would fix the issue as well? Richard. > Thanks, > bin > > > > Thanks, > > Richard