> From: Hongtao Liu <[email protected]>
> Sent: Wednesday, December 10, 2025 10:32 AM
>
> I didn't find the patch in my gmail box, so I'll reply there.
>
> >@@ -268,6 +269,7 @@ enum processor_features
> > FEATURE_USER_MSR,
> > FEATURE_AVX10_1 = 114,
> > FEATURE_AVX10_2 = 116,
> >+ FEATURE_AVX512BMM,
> > FEATURE_AMX_AVX512,
> > FEATURE_AMX_TF32,
>
> The new added feature should be inserted at the end(before
> CPU_FEATURE_MAX)
Also for the newly added processor, it should be inserted at the
end (before CPU_SUBTYPE_MAX).
>
> +extern __inline __m512i
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_maskz_vbitrevb_epi8 (__mmask64 __U, __m512i __A)
> +{
> + return (__m512i) __builtin_ia32_vbitrevb512_mask ((__v64qi) __A,
> + (__v64qi)
> + _mm512_setzero_epi8 (),
> + (__mmask64) __U);
> +}
> +
> +extern __inline __m512i
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_vbitrevb_epi8 (__m512i __A)
> +{
> + return (__m512i) __builtin_ia32_vbitrevb512_mask ((__v64qi) __A,
> + (__v64qi)
> + _mm512_undefined_epi8 (),
>
> You can directly use _mm512_undefined_epi32 and _mm512_setzero_epi32,
> and the definition of _mm512_setzero_epi8/_mm512_undefined_epi8 is not
> needed.
>
One more comment on intrin file. Typically we will omit "v" since
_mm[,256,512] has mentioned it is vector operations.
For bitrev, do we still need the "b" after bitrev? epi8 has implied byte.
It would be great if you could change the name to
_mm_bitrev_epi8
For bmac[or,xor]16x16x16, I will question the epi16 usage since
it is actually 16*16 bit matrix. The data here is not a word although
in machine description file we are using HI. epi16 could be misleading.
>
> extern __inline __m128i
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm128_maskz_vbitrevb_epi8 (__mmask16 __U, __m128i __A)
> +{
> + return (__m128i) __builtin_ia32_vbitrevb128_mask ((__v16qi) __A,
> + (__v16qi)
> + _mm128_setzero_epi8 (),
> + (__mmask16) __U);
> +}
> +
> +extern __inline __m128i
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm128_vbitrevb_epi8 (__m128i __A)
> +{
> + return (__m128i) __builtin_ia32_vbitrevb128_mask ((__v16qi) __A,
> + (__v16qi)
> + _mm128_undefined_epi8 (),
> + (__mmask16) -1);
> +}
> Similar here.
>
>
> +# AVX512BMM builtins
> +DEF_FUNCTION_TYPE (V16QI, V16QI, UHI)
> +DEF_FUNCTION_TYPE (V32QI, V32QI, USI)
> +DEF_FUNCTION_TYPE (V64QI, V64QI, UDI)
> +
>
> I didn't see any builtin defined with V16QI_FTYPE_V16QI_UHI, so it's
> redundant?
>
> + case V16QI_FTYPE_V16QI_UHI:
> + case V32QI_FTYPE_V32QI_USI:
> + case V64QI_FTYPE_V64QI_UDI:
> + nargs = 2;
> Ditto.
>
> +(define_mode_iterator VI1_AVX512BMM_HI
> + [V32HI (V16HI "TARGET_AVX512VL")])
>
> Better with name of VI2_256_512_AVX512VL
>
>
> +(define_mode_iterator VI1_AVX512BMM_QI
> + [V64QI (V32QI "TARGET_AVX512VL") (V16QI "TARGET_AVX512VL")])
> You can use existed VI1_AVX512VL, no need to define a new mode iterator.
>
>
> Other parts, the implementation looks ok.
Comment on i386.opt.urls: There should be an empty line at the end of
the file when generating it. Omitting that line might cause build bot report
errors.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> +@item znver6
> +AMD Family 1ah core based CPUs with x86-64 instruction set support. (This
> +supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED,
> +MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A,
> +SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID,
> +WBNOINVD, PKU, VPCLMULQDQ, VAES, AVX512F, AVX512DQ, AVX512IFMA, AVX512CD,
> +AVX512BW, AVX512VL, AVX512BF16, AVX512VBMI, AVX512VBMI2, AVX512VNNI,
> +AVX512BITALG, AVX512VPOPCNTDQ, GFNI, AVXVNNI, MOVDIRI, MOVDIR64B,
> +AVX512VP2INTERSECT, AVXNECONVERT, AVX512BMM, PREFETCHI and
> +64-bit instruction set extensions.)
Nits: You added AVXVNNIINT8, AVXIFMA, AVX512FP16 in implementation but not
adding them here.
Comment on tests: I am not sure if it could pass all the tests under i386.
Could you pass sse-2* tests?
Thx,
Haochen
> --
> BR,
> Hongtao