"Bin.Cheng" <amker.ch...@gmail.com> writes: > On Wed, May 23, 2018 at 11:19 AM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford < >> richard.sandif...@linaro.org> wrote: >> >>> Richard Biener <richard.guent...@gmail.com> writes: >>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote: >>> > >>> >> Hi, >>> >> As reported in PR85804, bump step is wrongly computed for vector(1) >> load >>> > of >>> >> single-element group access. This patch fixes the issue by correcting >>> > bump >>> >> step computation for the specific VMAT_CONTIGUOUS case. >>> > >>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK? >>> > >>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus. >>> > We'd fall into the grouped_load case otherwise which should handle >>> > the situation correctly? >>> > >>> > Richard? >> >>> Yeah, I agree. I mentioned to Bin privately that that was probably >>> a misstep and that we should instead continue to treat them as >>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector >>> from the array of loaded vectors, instead of doing an actual permute. >> >> I'd classify them as VMAT_ELEMENTWISE instead. CONTIGUOUS >> should be only for no-gap vectors. How do we classify single-element >> interleaving? That would be another classification choice. > Yes, I suggested this offline too, but Richard may have more to say about > this. > One thing worth noting is classifying it as VMAT_ELEMENTWISE would > disable vectorization in this case because of cost model issue as > commented at the end of get_load_store_type.
Yeah, that's the problem. Using VMAT_ELEMENTWISE also means that we use a scalar load and then insert it into a vector, whereas all we want (and all we currently generate) is a single vector load. So if we classify them as VMAT_ELEMENTWISE, they'll be a special case for both costing (vector load rather than scalar load and vector construct) and code-generation. Thanks, Richard