On Fri, Jul 14, 2023 at 5:40 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Introduce a new alternative permitting all 32 registers to be used as
> source without AVX512VL, by broadcasting to the full 512 bits in that
> case. (The insn would also permit all registers to be used as
> destination, but V2DFmode doesn't.)
The patch looks technically ok, but considering we don't have a real
CPU with only AVX512F but no AVX512VL, these optimisations for AVX512F
only don't make much sense, but rather increase the burden for
maintenance.
(For now, AVX512VL+AVX512CD+AVX512BW+AVX512DQ is a base set after
skylake-avx512, users are more likely to use -march=$PROCESSOR for
avx512.)
For this(and those previous AVX512F only) optimised patch, I think
it's helpful to help understand the pattern, so I'll approve of this
patch. But I hope we don't spend too much time on such optimisations
(unless there is an AVX512F only processor).
>
> gcc/
>
>         * config/i386/sse.md (vec_dupv2df<mask_name>): Add new AVX512F
>         alternative. Move AVX512VL part of condition to new "enabled"
>         attribute.
> ---
> Because of the V2DF restriction, in principle the new source constraint
> could also omit 'm'.
>
> Can't the latter two of the original alternatives be folded, by using
> Yvm instead of xm/vm?
I think yes.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -13761,18 +13761,27 @@
>     (set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])
>
>  (define_insn "vec_dupv2df<mask_name>"
> -  [(set (match_operand:V2DF 0 "register_operand"     "=x,x,v")
> +  [(set (match_operand:V2DF 0 "register_operand"     "=x,x,v,v")
>         (vec_duplicate:V2DF
> -         (match_operand:DF 1 "nonimmediate_operand" " 0,xm,vm")))]
> -  "TARGET_SSE2 && <mask_avx512vl_condition>"
> +         (match_operand:DF 1 "nonimmediate_operand" "0,xm,vm,vm")))]
> +  "TARGET_SSE2"
>    "@
>     unpcklpd\t%0, %0
>     %vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}
> -   vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"
> -  [(set_attr "isa" "noavx,sse3,avx512vl")
> -   (set_attr "type" "sselog1")
> -   (set_attr "prefix" "orig,maybe_vex,evex")
> -   (set_attr "mode" "V2DF,DF,DF")])
> +   vmovddup\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}
> +   vbroadcastsd\t{%1, }%g0<mask_operand2>{|, %1}"
> +  [(set_attr "isa" "noavx,sse3,avx512vl,*")
> +   (set_attr "type" "sselog1,ssemov,ssemov,ssemov")
> +   (set_attr "prefix" "orig,maybe_vex,evex,evex")
> +   (set_attr "mode" "V2DF,DF,DF,V8DF")
> +   (set (attr "enabled")
> +       (cond [(eq_attr "alternative" "3")
> +                (symbol_ref "TARGET_AVX512F && !TARGET_AVX512VL
> +                             && !TARGET_PREFER_AVX256")
> +              (match_test "<mask_avx512vl_condition>")
> +                (const_string "*")
> +             ]
> +             (symbol_ref "false")))])
>
>  (define_insn "vec_concatv2df"
>    [(set (match_operand:V2DF 0 "register_operand"     "=x,x,v,x,x, v,x,x")



-- 
BR,
Hongtao

Reply via email to