On Wed, Jun 14, 2023 at 1:55 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
> double precision floating values - is more appropriate to use here, and
> it can also result in shorter insn encodings when source is memory or
> %xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX
> prefix then instead of a 3-byte one).
>
> gcc/
>
>         * config/i386/sse.md (<avx512>_vec_dup<mode><mask_name>): Use
>         vmovddup.
Ok for trunk.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25724,9 +25724,9 @@
>    "TARGET_AVX512F"
>  {
>    /*  There is no DF broadcast (in AVX-512*) to 128b register.
> -      Mimic it with integer variant.  */
> +      Mimic it with vmovddup, just like vec_dupv2df<mask_name> does.  */
>    if (<MODE>mode == V2DFmode)
> -    return "vpbroadcastq\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
> +    return "vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}";
>
>    return "v<sseintprefix>broadcast<bcstscalarsuff>\t{%1, %0<mask_operand2>|%0<mask_operand2>, %<iptr>1}";
>  }
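
For anyone following along, here is a rough C sketch of why the two
instructions are interchangeable for the V2DF case. It is not GCC code, and
the type and function names are made up for illustration. Both operations
just replicate the low 64 bits into both lanes of the 128-bit destination;
per the description above, the difference lies in the encoding and in
vmovddup being the double-precision form.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical two-lane vector of doubles standing in for an xmm register. */
typedef struct { double lane[2]; } v2df;

/* Models vmovddup with a 128-bit destination: low double copied to both lanes. */
static v2df dup_low_double(double x)
{
    v2df r = { { x, x } };
    return r;
}

/* Models vpbroadcastq: low 64 bits, viewed as an integer, copied to both lanes. */
static v2df broadcast_low_qword(double x)
{
    uint64_t bits;
    v2df r;
    memcpy(&bits, &x, sizeof bits);
    memcpy(&r.lane[0], &bits, sizeof bits);
    memcpy(&r.lane[1], &bits, sizeof bits);
    return r;
}

/* Returns 1 when both models produce bit-identical results for x. */
static int same_bits(double x)
{
    v2df a = dup_low_double(x);
    v2df b = broadcast_low_qword(x);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Calling same_bits() on any ordinary double (1.5, -0.0, ...) should return 1,
which is all the pattern relies on when it substitutes one mnemonic for the
other.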



-- 
BR,
Hongtao
