On Wed, Jun 14, 2023 at 1:55 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
> double precision floating values - is more appropriate to use here, and
> it can also result in shorter insn encodings when source is memory or
> %xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX
> prefix then instead of a 3-byte one).
>
> gcc/
>
>         * config/i386/sse.md (<avx512>_vec_dup<mode><mask_name>): Use
>         vmovddup.

Ok for trunk.

>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25724,9 +25724,9 @@
>    "TARGET_AVX512F"
>  {
>    /* There is no DF broadcast (in AVX-512*) to 128b register.
> -     Mimic it with integer variant.  */
> +     Mimic it with vmovddup, just like vec_dupv2df<mask_name> does.  */
>    if (<MODE>mode == V2DFmode)
> -    return "vpbroadcastq\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
> +    return "vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
>
>    return "v<sseintprefix>broadcast<bcstscalarsuff>\t{%1,
> %0<mask_operand2>|%0<mask_operand2>, %<iptr>1}";
>  }

--
BR,
Hongtao