pengfei added inline comments.
================ Comment at: clang/include/clang/Driver/Options.td:4595-4596 def mno_avxvnniint8 : Flag<["-"], "mno-avxvnniint8">, Group<m_x86_Features_Group>; +def mavxneconvert : Flag<["-"], "mavxneconvert">, Group<m_x86_Features_Group>; +def mno_avxneconvert : Flag<["-"], "mno-avxneconvert">, Group<m_x86_Features_Group>; def mavxvnni : Flag<["-"], "mavxvnni">, Group<m_x86_Features_Group>; ---------------- Need to move it before `mavxvnniint8 `. ================ Comment at: clang/lib/Basic/Targets/X86.cpp:1033 .Case("avxvnniint8", HasAVXVNNIINT8) + .Case("avxneconvert", HasAVXNECONVERT) .Case("bmi", HasBMI) ---------------- Move it ahead. ================ Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164 +#define _mm_cvtneps_pbh(A) \ + ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A))) ---------------- pengfei wrote: > RKSimon wrote: > > Is there no way for __attribute__ to allow different attribute permutations? > > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention? > > Is there no way for attribute to allow different attribute permutations? > > We have discussed this problem with GCC folks. There are two problems here: > 1. Unlike builtins, function attributes are more generic. It may introduce a > lot of checks between callers and callees. I had a research to limit it to > `__always_inline__` functions only. However, Clang handles inlining in > middle-end, we don't have such information in the front-end. Besides, we > don't know how to merge different permutations if they are inlining to the > same function. > 2. We don't know how to put the permutations into IR's function attributes. > We need to preserve all permutations for inlining reference, but the backend > needs a determine feature list rather than selective. It's better to use `__builtin_ia32_cvtneps2bf16_128`. ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:106 +/// +/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction. +/// ---------------- VBCSTNESH2PS ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:139 +/// +/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction. +/// ---------------- VBCSTNESH2PS ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:207 +/// \param __A +/// A pointer to a 256-bit memory location containing 8 consecutive +/// BF16 (16-bit) floating-point values. ---------------- 16 ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:273 +/// \param __A +/// A pointer to a 256-bit memory location containing 8 consecutive +/// half-precision (16-bit) floating-point values. ---------------- 16 ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:339 +/// \param __A +/// A pointer to a 256-bit memory location containing 8 consecutive +/// BF16 (16-bit) floating-point values. ---------------- 16 ================ Comment at: clang/lib/Headers/avxneconvertintrin.h:405 +/// \param __A +/// A pointer to a 256-bit memory location containing 8 consecutive +/// half-precision (16-bit) floating-point values. ---------------- 16 ================ Comment at: clang/test/Preprocessor/x86_target_features.c:593-599 +// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mavxneconvert -x c -E -dM -o - %s | FileCheck -check-prefix=AVXNECONVERT %s +// AVXNECONVERT: #define __AVXNECONVERT__ 1 + +// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mno-avxneconvert -x c -E -dM -o - %s | FileCheck -check-prefix=NO-AVXNECONVERT %s +// RUN: %clang -target i686-unknown-linux-gnu -march=atom -mavxneconvert -mno-avx2 -x c -E -dM -o - %s | FileCheck -check-prefix=NO-AVXNECONVERT %s + +// NO-AVXNECONVERT-NOT: #define __AVXNECONVERT__ 1 ---------------- Should we check `__AVX2__` like we did for AVXVNNI? ================ Comment at: llvm/lib/Support/Host.cpp:1818 + Features["avxneconvert"] = HasLeaf7Subleaf1 && ((EDX >> 5) & 1) && HasAVXSave; + ---------------- Move it ahead and remove the blank line. ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:2186-2203 if (!Subtarget.useSoftFloat() && Subtarget.hasBF16()) { addRegisterClass(MVT::v8bf16, &X86::VR128XRegClass); addRegisterClass(MVT::v16bf16, &X86::VR256XRegClass); addRegisterClass(MVT::v32bf16, &X86::VR512RegClass); // We set the type action of bf16 to TypeSoftPromoteHalf, but we don't // provide the method to promote BUILD_VECTOR. Set the operation action // Custom to do the customization later. ---------------- How about merge it here? ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:8260-8261 +let Predicates = [HasAVXNECONVERT] in { + defm VBCSTNEBF162PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnebf162ps", i16mem, + i16mem>, T8XS; + defm VBCSTNESH2PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnesh2ps", f16mem, f16mem>, ---------------- This can be f16 mem now. ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:8264-8265 + T8PD; + defm VCVTNEEBF162PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneebf162ps", i128mem, + i256mem>, T8XS; + defm VCVTNEEPH2PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneeph2ps", f128mem, ---------------- f128mem, f256mem ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:8268-8269 + f256mem>, T8PD; + defm VCVTNEOBF162PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneobf162ps", i128mem, + i256mem>, T8XD; + defm VCVTNEOPH2PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneoph2ps", f128mem, ---------------- ditto. ================ Comment at: llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll:129-140 +declare <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %A) + +define <8 x bfloat> @test_mm256_cvtneps2bf16_256_variant(<8 x float> %A) local_unnamed_addr #2 { +; CHECK-LABEL: test_mm256_cvtneps2bf16_256_variant: +; CHECK: # %bb.0: # %entry +; CHECK-NEXT: vcvtneps2bf16 %ymm0, %xmm0 # encoding: [0x62,0xf2,0x7e,0x28,0x72,0xc0] +; CHECK-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77] ---------------- You don't need to add them here, just another RUN in below file should be enough, e.g., ``` ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16 ; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16 ``` ================ Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:2 +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=X64 +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=X86 ---------------- --check-prefixes=CHECK,X64 ================ Comment at: llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll:3 +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=X64 +; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=X86 + ---------------- --check-prefixes=CHECK,X86 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135930/new/ https://reviews.llvm.org/D135930 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits