>>> On 27.06.19 at 14:07, <ubiz...@gmail.com> wrote:
> On Thu, Jun 27, 2019 at 1:31 PM Jan Beulich <jbeul...@suse.com> wrote:
>>
>> >>> On 27.06.19 at 13:02, <ubiz...@gmail.com> wrote:
>> > On Thu, Jun 27, 2019 at 12:47 PM Jan Beulich <jbeul...@suse.com> wrote:
>> >>
>> >> >>> On 27.06.19 at 12:22, <ubiz...@gmail.com> wrote:
>> >> > On Thu, Jun 27, 2019 at 11:10 AM Jan Beulich <jbeul...@suse.com> wrote:
>> >> >>
>> >> >> >>> On 27.06.19 at 11:03, wrote:
>> >> >> > With just an "m" constraint misaligned memory operands won't be forced
>> >> >> > into a register, and hence cause #GP. So far this was guaranteed only
>> >> >> > in the case that CVT{,T}PD2DQ were chosen (which looks to be the case on
>> >> >> > x86-64 only).
>> >> >> >
>> >> >> > Instead of switching the second alternative to Bm, use just m on the
>> >> >> > first and replace nonimmediate_operand by vector_operand.
>> >> >>
>> >> >> While doing this and the others where I'm also replacing Bm by uses of
>> >> >> vector_operand, I've started wondering whether Bm couldn't (and then
>> >> >> shouldn't) be dropped altogether, replacing it everywhere by "m"
>> >> >> combined with vector_operand (or vector_memory_operand when
>> >> >> register operands aren't allowed anyway).
>> >> >
>> >> > No. Register allocator will propagate unaligned memory in non-AVX
>> >> > case, which is not allowed with vector_operand.
>> >>
>> >> I'm afraid I don't understand: Unaligned SIMD memory accesses will
>> >> generally fault in non-AVX mode, so such propagation would seem
>> >> wrong to me and hence would seem to be correctly not allowed.
>> >> Furthermore both vector_operand and Bm resolve to the same
>> >> vector_memory_operand. The TARGET_AVX check actually is inside
>> >> vector_memory_operand, i.e. affects both the same way.
>> >
>> > "Bm" *prevents* propagation of unaligned access for non-AVX targets.
>> > As said, register allocator does not care for operand predicates (it
>> > only looks at operand constraints), so it will propagate unaligned
>> > access with "m" operand. To avoid propagation, "Bm" should and does
>> > use vector_memory_operand constraint internally.
>>
>> Okay, I think I got it now (also because of your reply on the other
>> thread). It means in the patch here I need to retain Bm rather than
>> dropping it, too, and additionally use it on the other alternative.
>
> The correct solution is a bit more complicated. I don't know if these
> instructions tolerate unaligned operand in non-AVX case.

They don't.

> If they don't, then vector_operand should be used and the first alternative
> should be split to avx and non-avx part, where non-avx part uses Bm
> constraint.

Why? Bm takes care to distinguish the AVX and non-AVX cases. That's
how things work elsewhere too, afaict. The bug here really is that
the (non-AVX-only) second alternative didn't also use Bm.

Jan
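
For readers following the thread, the distinction between "m" and "Bm"
can be sketched as a toy model. This is not GCC code: MemOperand,
constraint_m, constraint_Bm and allocator_may_substitute are hypothetical
names made up for illustration, and the only behaviour modelled is what
the thread states, namely that the allocator consults constraints rather
than predicates and that the AVX check applies to Bm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOperand:
    """Hypothetical stand-in for a memory operand (not a GCC type)."""
    align_bits: int       # alignment known for the memory reference
    mode_align_bits: int  # alignment the vector mode would require


def constraint_m(op: MemOperand, target_avx: bool) -> bool:
    # Plain "m": any memory operand is accepted, aligned or not.
    return True


def constraint_Bm(op: MemOperand, target_avx: bool) -> bool:
    # "Bm" as described in the thread: with AVX a misaligned operand is
    # tolerated, otherwise the operand must be sufficiently aligned.
    return target_avx or op.align_bits >= op.mode_align_bits


def allocator_may_substitute(constraint, op: MemOperand, target_avx: bool) -> bool:
    # Assumption taken from the thread: only the constraint is consulted
    # here, never the operand predicate.
    return constraint(op, target_avx)


# Illustrative operands for a mode assumed to need 128-bit alignment.
unaligned = MemOperand(align_bits=64, mode_align_bits=128)
aligned = MemOperand(align_bits=128, mode_align_bits=128)
```

In this model an alternative using "m" lets the unaligned operand through
on a non-AVX target, while the same alternative using "Bm" rejects it,
which is the behaviour wanted for the non-AVX-only second alternative.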