Re: [PATCH] x86: fix CVT{,T}PD2PI insns

Uros Bizjak Thu, 27 Jun 2019 05:17:32 -0700

On Thu, Jun 27, 2019 at 2:14 PM Jan Beulich <jbeul...@suse.com> wrote:
>
> >>> On 27.06.19 at 14:07, <ubiz...@gmail.com> wrote:
> > On Thu, Jun 27, 2019 at 1:31 PM Jan Beulich <jbeul...@suse.com> wrote:
> >>
> >> >>> On 27.06.19 at 13:02, <ubiz...@gmail.com> wrote:
> >> > On Thu, Jun 27, 2019 at 12:47 PM Jan Beulich <jbeul...@suse.com> wrote:
> >> >>
> >> >> >>> On 27.06.19 at 12:22, <ubiz...@gmail.com> wrote:
> >> >> > On Thu, Jun 27, 2019 at 11:10 AM Jan Beulich <jbeul...@suse.com> 
> >> >> > wrote:
> >> >> >>
> >> >> >> >>> On 27.06.19 at 11:03,  wrote:
> >> >> >> > With just an "m" constraint misaligned memory operands won't be 
> >> >> >> > forced
> >> >> >> > into a register, and hence cause #GP. So far this was guaranteed 
> >> >> >> > only
> >> >> >> > in the case that CVT{,T}PD2DQ were chosen (which looks to be the 
> >> >> >> > case on
> >> >> >> > x86-64 only).
> >> >> >> >
> >> >> >> > Instead of switching the second alternative to Bm, use just m on 
> >> >> >> > the
> >> >> >> > first and replace nonimmediate_operand by vector_operand.
> >> >> >>
> >> >> >> While doing this and the others where I'm also replacing Bm by uses 
> >> >> >> of
> >> >> >> vector_operand, I've started wondering whether Bm couldn't (and then
> >> >> >> shouldn't) be dropped altogether, replacing it everywhere by "m"
> >> >> >> combined with vector_operand (or vector_memory_operand when
> >> >> >> register operands aren't allowed anyway).
> >> >> >
> >> >> > No. Register allocator will propagate unaligned memory in non-AVX
> >> >> > case, which is not allowed with vector_operand.
> >> >>
> >> >> I'm afraid I don't understand: Unaligned SIMD memory accesses will
> >> >> generally fault in non-AVX mode, so such propagation would seem
> >> >> wrong to me and hence would seem to be correctly not allowed.
> >> >> Furthermore both vector_operand and Bm resolve to the same
> >> >> vector_memory_operand. The TARGET_AVX check actually is inside
> >> >> vector_memory_operand, i.e. affects both the same way.
> >> >
> >> > "Bm" *prevents* propagation of unaligned access for non-AVX targets.
> >> > As said, register allocator does not care for operand predicates (it
> >> > only looks at operand constraints), so it will propagate unaligned
> >> > access with "m" operand. To avoid propagation, "Bm" should and does
> >> > use vector_memory_operand constraint internally.
> >>
> >> Okay, I think I got it now (also because of your reply on the other
> >> thread). It means in the patch here I need to retain Bm rather than
> >> dropping it, too, and additionally use it on the other alternative.
> >
> > The correct solution is a bit more complicated. I don't know if these
> > instructions tolerate unaligned operand in non-AVX case.
>
> They don't.
>
> > If they
> > don't, then vector_operand should be used and the first alternative
> > should be split to avx and non-avx part, where non-avx part uses Bm
> > constraint.
>
> Why? Bm takes care to distinguish the AVX and non-AVX cases. That's
> how things work elsewhere too, afaict. The bug here really is that the
> (non-AVX-only) second alternative didn't also use Bm.


Ah, yes. So, please use vector_operand here and change the second
alternative to use Bm.

Patch is OK with this change.

Thanks,
Uros.

Re: [PATCH] x86: fix CVT{,T}PD2PI insns

Reply via email to