[PATCH] D158487: [PowerPC][altivec] Optimize codegen of vec_promote

Nemanja Ivanovic via Phabricator via cfe-commits Wed, 23 Aug 2023 14:04:22 -0700

nemanjai accepted this revision.
nemanjai added a comment.
This revision is now accepted and ready to land.


LGTM. This is a good idea and we should go ahead with it for anyone who uses 
`vec_promote`, but it might also be worth improving codegen for the insert 
itself, which might be the more common case.



================
Comment at: llvm/test/CodeGen/PowerPC/vec-promote.ll:43
+
+define noundef <4 x float> @vec_promote_float_zeroed(ptr nocapture noundef readonly %p) {
+; CHECK-BE-LABEL: vec_promote_float_zeroed:
----------------
This code is absolutely terrible. Not only is the `lfs` much slower than the 
`lfiwzx`/`lxsiwzx` that we actually want, but the two conversions and three 
permutes are also expensive.

I think the change to `altivec.h` to produce better code for something like 
that is a good thing, but I wonder if something like this might come up in 
other contexts.

At least on Power9 and up, we can do much better than this. We don't do 
particularly well regardless of whether we're using a zero vector input or an 
arbitrary vector: https://godbolt.org/z/79fx8nsdP
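
For reference, a minimal sketch of the two source patterns being compared: 
inserting a scalar into a zero vector versus into an arbitrary vector. This is 
not the contents of the godbolt link. It uses the generic GCC/Clang 
`vector_size` extension as a stand-in for `vector float` so that it builds on 
any target, and the helper names `promote_zeroed` and `insert_elem` are 
hypothetical. Masking the index with `& 3` is an assumption about how 
`altivec.h` keeps the element index in range for a four-element vector.

```c
/* Stand-in for `vector float`: four floats in a 16-byte vector.
   Uses the generic GCC/Clang vector extension, not altivec.h. */
typedef float v4f __attribute__((vector_size(16)));

/* Hypothetical model of the "zeroed" pattern: start from a zero
   vector and insert the scalar at a variable element index. */
static v4f promote_zeroed(float a, int idx) {
    v4f res = {0.0f, 0.0f, 0.0f, 0.0f};
    res[idx & 3] = a; /* assumed: index is wrapped to 0..3 */
    return res;
}

/* Hypothetical model of the general insert: same operation, but the
   remaining lanes come from an arbitrary input vector. */
static v4f insert_elem(v4f v, float a, int idx) {
    v[idx & 3] = a; /* v is a by-value copy, the caller's vector is untouched */
    return v;
}
```

Compiling both helpers for a PowerPC target (for example with 
`-mcpu=pwr9`) is one way to check whether the zero-vector and 
arbitrary-vector cases lower to similar instruction sequences.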


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158487/new/

https://reviews.llvm.org/D158487

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
