On Thu, Jun 27, 2019 at 1:31 PM Jan Beulich <jbeul...@suse.com> wrote:
>
> >>> On 27.06.19 at 13:02, <ubiz...@gmail.com> wrote:
> > On Thu, Jun 27, 2019 at 12:47 PM Jan Beulich <jbeul...@suse.com> wrote:
> >>
> >> >>> On 27.06.19 at 12:22, <ubiz...@gmail.com> wrote:
> >> > On Thu, Jun 27, 2019 at 11:10 AM Jan Beulich <jbeul...@suse.com> wrote:
> >> >>
> >> >> >>> On 27.06.19 at 11:03, wrote:
> >> >> > With just an "m" constraint misaligned memory operands won't be forced
> >> >> > into a register, and hence cause #GP. So far this was guaranteed only
> >> >> > in the case that CVT{,T}PD2DQ were chosen (which looks to be the case
> >> >> > on x86-64 only).
> >> >> >
> >> >> > Instead of switching the second alternative to Bm, use just m on the
> >> >> > first and replace nonimmediate_operand by vector_operand.
> >> >>
> >> >> While doing this and the others where I'm also replacing Bm by uses of
> >> >> vector_operand, I've started wondering whether Bm couldn't (and then
> >> >> shouldn't) be dropped altogether, replacing it everywhere by "m"
> >> >> combined with vector_operand (or vector_memory_operand when
> >> >> register operands aren't allowed anyway).
> >> >
> >> > No. Register allocator will propagate unaligned memory in non-AVX
> >> > case, which is not allowed with vector_operand.
> >>
> >> I'm afraid I don't understand: Unaligned SIMD memory accesses will
> >> generally fault in non-AVX mode, so such propagation would seem
> >> wrong to me and hence would seem to be correctly not allowed.
> >> Furthermore both vector_operand and Bm resolve to the same
> >> vector_memory_operand. The TARGET_AVX check actually is inside
> >> vector_memory_operand, i.e. affects both the same way.
> >
> > "Bm" *prevents* propagation of unaligned access for non-AVX targets.
> > As said, register allocator does not care for operand predicates (it
> > only looks at operand constraints), so it will propagate unaligned
> > access with "m" operand. To avoid propagation, "Bm" should and does
> > use vector_memory_operand constraint internally.
>
> Okay, I think I got it now (also because of your reply on the other
> thread). It means in the patch here I need to retain Bm rather than
> dropping it, too, and additionally use it on the other alternative.

The correct solution is a bit more complicated. I don't know whether
these instructions tolerate an unaligned operand in the non-AVX case.
If they don't, then vector_operand should be used and the first
alternative should be split into AVX and non-AVX parts, where the
non-AVX part uses the Bm constraint.

Uros.
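
For illustration only, the difference between "m" and "Bm" discussed
above can be modelled in a few lines of C. This is a sketch, not GCC
code: struct mem_operand, vector_memory_ok, constraint_m and
constraint_Bm are hypothetical names, and the alignment rule (operand
alignment must be at least the mode's alignment unless AVX is enabled)
is an assumption about what vector_memory_operand checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory operand: only its known alignment
   (in bytes) matters for this model. */
struct mem_operand {
    size_t align;
};

/* Models the check the thread attributes to vector_memory_operand:
   with AVX any memory operand is fine; otherwise the operand must be
   at least as aligned as the vector mode requires (assumed rule). */
static inline bool vector_memory_ok(struct mem_operand op,
                                    size_t mode_align, bool have_avx)
{
    /* mode_align is expected to be a nonzero power of two. */
    assert(mode_align != 0 && (mode_align & (mode_align - 1)) == 0);
    return have_avx || op.align >= mode_align;
}

/* Model of a plain "m" constraint: every memory operand is accepted,
   so nothing stops an unaligned one from being propagated. */
static inline bool constraint_m(struct mem_operand op,
                                size_t mode_align, bool have_avx)
{
    (void)op; (void)mode_align; (void)have_avx;
    return true;
}

/* Model of "Bm": the constraint itself applies the
   vector_memory_operand check. */
static inline bool constraint_Bm(struct mem_operand op,
                                 size_t mode_align, bool have_avx)
{
    return vector_memory_ok(op, mode_align, have_avx);
}
```

With a 16-byte mode and no AVX, an operand known to be only 8-byte
aligned passes constraint_m but fails constraint_Bm. Since the register
allocator consults constraints and not predicates, only the second
model keeps such an operand out of the instruction.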