[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

Freddy, Ye via Phabricator via cfe-commits Mon, 31 Oct 2022 06:56:24 -0700

FreddyYe marked an inline comment as done.
FreddyYe added inline comments.


================
Comment at: clang/lib/Headers/avx512vlbf16intrin.h:164
+#define _mm_cvtneps_pbh(A) \
+  ((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
 
----------------
pengfei wrote:
> FreddyYe wrote:
> > pengfei wrote:
> > > pengfei wrote:
> > > > RKSimon wrote:
> > > > > Is there no way for __attribute__ to allow different attribute 
> > > > > permutations?
> > > > > 
> > > > > Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming 
> > > > > convention?
> > > > > Is there no way for attribute to allow different attribute 
> > > > > permutations?
> > > > 
> > > > We have discussed this problem with GCC folks. There are two problems 
> > > > here:
> > > > 1. Unlike builtins, function attributes are more generic. It may 
> > > > introduce a lot of checks between callers and callees. I had a research 
> > > > to limit it to `__always_inline__` functions only. However, Clang 
> > > > handles inlining in middle-end, we don't have such information in the 
> > > > front-end. Besides, we don't know how to merge different permutations 
> > > > if they are inlining to the same function.
> > > > 2. We don't know how to put the permutations into IR's function 
> > > > attributes. We need to preserve all permutations for inlining 
> > > > reference, but the backend needs a determine feature list rather than 
> > > > selective.
> > > It's better to use `__builtin_ia32_cvtneps2bf16_128`.
> > I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name.
> > 
> > See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256...
> > 
> > And I admit naming conventions of clang builtins as well as LLVM IR 
> > builtins are confusing right now.
> The problem here is `16128` is a bit confusing, a `_` breaks it into 2 number.
> But I'm not insist on it :)
I did a try but found __builtin_ia32_cvtneps2bf16_256 existed for avx512bf16, 
and it's used for mask intrinsic lowering currently. What about not change this 
time? We can do a refine patch later for avx512bf16 builtins since they also 
have some redundant FE/codegen logics for 256/512 mask intrinsics.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135930/new/

https://reviews.llvm.org/D135930

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

Reply via email to