nemanjai accepted this revision.
nemanjai added a comment.
This revision is now accepted and ready to land.

LGTM. This is a good idea and we should go ahead with it for anyone who uses
`vec_promote`, but it might also be worth improving codegen for the general
insert, which might be more common.



================
Comment at: llvm/test/CodeGen/PowerPC/vec-promote.ll:43
+
+define noundef <4 x float> @vec_promote_float_zeroed(ptr nocapture noundef readonly %p) {
+; CHECK-BE-LABEL: vec_promote_float_zeroed:
----------------
This code is absolutely terrible. Not only is the `lfs` much slower than the
`lfiwzx`/`lxsiwzx` that we actually want, but the two conversions and three
permutes are expensive as well.

I think changing `altivec.h` to produce better code for this case is a good
thing, but I wonder whether the same pattern might come up in other contexts.

At least on Power9 and up, we can do much better than this. We don't do 
particularly well regardless of whether we're using a zero vector input or an 
arbitrary vector: https://godbolt.org/z/79fx8nsdP
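
For concreteness, the two cases look roughly like the sketch below. It uses
the generic GCC/Clang vector extension rather than `altivec.h`, the helper
names are made up for illustration, and it is not necessarily what the godbolt
link above contains:

```c
/* Generic 4 x float vector type (GCC/Clang extension); stands in for
 * `vector float` so the sketch does not depend on altivec.h. */
typedef float v4f32 __attribute__((vector_size(16)));

/* Hypothetical helper: load a scalar and insert it into a zeroed vector.
 * The index is reduced modulo the element count. */
v4f32 insert_into_zero(const float *p, unsigned idx) {
    v4f32 v = {0.0f, 0.0f, 0.0f, 0.0f};
    v[idx & 3u] = *p;
    return v;
}

/* Hypothetical helper: load a scalar and insert it into an arbitrary
 * vector, leaving the other three elements untouched. */
v4f32 insert_into_vec(v4f32 v, const float *p, unsigned idx) {
    v[idx & 3u] = *p;
    return v;
}
```

Compiling both helpers for a Power9 target and comparing the output should
show whether the load, conversion, and permute sequence differs between the
zeroed and the arbitrary input.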


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158487/new/

https://reviews.llvm.org/D158487

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
