Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

Bin.Cheng Wed, 23 May 2018 03:53:39 -0700

On Wed, May 23, 2018 at 11:19 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Tue, May 22, 2018 at 2:11 PM Richard Sandiford <
> richard.sandif...@linaro.org> wrote:
>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On Mon, May 21, 2018 at 3:14 PM Bin Cheng <bin.ch...@arm.com> wrote:
>> >
>> >> Hi,
>> >> As reported in PR85804, bump step is wrongly computed for vector(1)
> load
>> > of
>> >> single-element group access.  This patch fixes the issue by correcting
>> > bump
>> >> step computation for the specific VMAT_CONTIGUOUS case.
>> >
>> >> Bootstrap and test on x86_64 and AArch64 ongoing, is it OK?
>> >
>> > To me it looks like the classification as VMAT_CONTIGUOUS is bogus.
>> > We'd fall into the grouped_load case otherwise which should handle
>> > the situation correctly?
>> >
>> > Richard?
>
>> Yeah, I agree.  I mentioned to Bin privately that that was probably
>> a misstep and that we should instead continue to treat them as
>> VMAT_CONTIGUOUS_PERMUTE, but simply select the required vector
>> from the array of loaded vectors, instead of doing an actual permute.
>
> I'd classify them as VMAT_ELEMENTWISE instead.  CONTIGUOUS
> should be only for no-gap vectors.  How do we classify single-element
> interleaving?  That would be another classification choice.
Yes, I suggested this offline too, but Richard may have more to say about this.
One thing worth noting is classifying it as VMAT_ELEMENTWISE would
disable vectorization in this case because of cost model issue as
commented at the end of get_load_store_type.


Thanks,
bin
>
>> (Note that VMAT_CONTIGUOUS is OK for stores, since we don't allow
>> gaps there.  But it might be easiest to handle both loads and stores
>> in the same way.)
>
>> Although it still seems weird to "vectorise" stuff to one element.
>> Why not leave the original scalar code in place, and put the onus on
>> whatever wants to produce or consume a V1 to do the appropriate
>> conversion?
>
> Yeah, V1 is somewhat awkward to deal with but I think the way we're
> doing it right now is OK.  Note if we leave the original scalar code in
> place we still have to deal with larger unrolling factors thus we'd have
> to duplicate the scalar code and have different IVs anyways.
>
> Richard.
>
>> Thanks,
>> Richard

Re: [PATCH PR85804]Fix wrong code by correcting bump step computation in vector(1) load of single-element group access

Reply via email to