On Thu, Mar 7, 2019 at 4:09 PM Jakub Jelinek <ja...@redhat.com> wrote: > > On Thu, Mar 07, 2019 at 08:11:53AM +0100, Uros Bizjak wrote: > > > +(define_insn "*avx512f_load<mode>_mask" > > > + [(set (match_operand:<ssevecmode> 0 "register_operand" "=v") > > > + (vec_merge:<ssevecmode> > > > + (vec_merge:<ssevecmode> > > > + (vec_duplicate:<ssevecmode> > > > + (match_operand:MODEF 1 "memory_operand" "m")) > > > + (match_operand:<ssevecmode> 2 "nonimm_or_0_operand" "0C") > > > + (match_operand:QI 3 "nonmemory_operand" "Yk")) > > > > Is there a reason to have nonmemory_operand predicate here instead of > > register_operand? > > Thanks for catching that up, that was from my earlier attempt to have > Yk,n constraints and deal with that during output. For store it was > possible, for others there were some cases it couldn't handle but further > testing revealed that the combiner already handles most of the constant > mask cases right. > > Here is updated patch, I've changed this in two spots. It even improves the > constant 1 case (the only one that is still not optimized as much as it > should): > f4: > - movzbl .LC0(%rip), %eax > + movl $1, %eax > kmovw %eax, %k1 > vmovsd (%rsi), %xmm0{%k1}{z} > ret > Tested so far with make check-gcc RUNTESTFLAGS=i386.exp=avx512f-vmovs*.c > and compiling/eyeballing differences on the short testcase I've posted > in the description with also the u, -> 1, and u, -> 0, changes, appart > from the above f4 no differences. > > Ok for trunk if it passes another full bootstrap/regtest? > > 2019-03-07 Jakub Jelinek <ja...@redhat.com> > > PR target/89602 > * config/i386/sse.md (avx512f_mov<ssescalarmodelower>_mask, > *avx512f_load<mode>_mask, avx512f_store<mode>_mask): New define_insns. > (avx512f_load<mode>_mask): New define_expand. > * config/i386/i386-builtin.def (__builtin_ia32_loadsd_mask, > __builtin_ia32_loadss_mask, __builtin_ia32_storesd_mask, > __builtin_ia32_storess_mask, __builtin_ia32_movesd_mask, > __builtin_ia32_movess_mask): New builtins. > * config/i386/avx512fintrin.h (_mm_mask_load_ss, _mm_maskz_load_ss, > _mm_mask_load_sd, _mm_maskz_load_sd, _mm_mask_move_ss, > _mm_maskz_move_ss, _mm_mask_move_sd, _mm_maskz_move_sd, > _mm_mask_store_ss, _mm_mask_store_sd): New intrinsics. >
This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89630 This looks very strange since this patch only touched backend. -- H.J.