Author: Simon Pilgrim
Date: 2021-01-22T13:19:35Z
New Revision: b1166e1317c54e9cfbb28b280af12313cf325a86
URL: https://github.com/llvm/llvm-project/commit/b1166e1317c54e9cfbb28b280af12313cf325a86
DIFF: https://github.com/llvm/llvm-project/commit/b1166e1317c54e9cfbb28b280af12313cf325a86.diff

LOG: [X86][AVX] combineX86ShufflesRecursively - attempt to constant fold before widening shuffle inputs

combineX86ShufflesConstants/canonicalizeShuffleMaskWithHorizOp can both handle (or early-out on) shuffles with inputs of different widths, so delay widening as late as possible to make it easier to match constant folds etc.

The plan is to eventually move the widening inside combineX86ShuffleChain so that we don't create any new nodes unless we successfully combine the shuffles.

Added:
    

Modified:
    llvm/lib/Target/X86/X86ISelLowering.cpp
    llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll

Removed:
    

################################################################################
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 895a02e5c98e..a293c48a824a 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -36610,6 +36610,17 @@ static SDValue combineX86ShufflesRecursively(
     }
   }
 
+  // Attempt to constant fold all of the constant source ops.
+  if (SDValue Cst = combineX86ShufflesConstants(
+          Ops, Mask, Root, HasVariableMask, DAG, Subtarget))
+    return Cst;
+
+  // Canonicalize the combined shuffle mask chain with horizontal ops.
+  // NOTE: This will update the Ops and Mask.
+  if (SDValue HOp = canonicalizeShuffleMaskWithHorizOp(
+          Ops, Mask, RootSizeInBits, SDLoc(Root), DAG, Subtarget))
+    return DAG.getBitcast(Root.getValueType(), HOp);
+
   // Widen any subvector shuffle inputs we've collected.
   if (any_of(Ops, [RootSizeInBits](SDValue Op) {
         return Op.getValueSizeInBits() < RootSizeInBits;
@@ -36622,17 +36633,6 @@
     resolveTargetShuffleInputsAndMask(Ops, Mask);
   }
 
-  // Attempt to constant fold all of the constant source ops.
-  if (SDValue Cst = combineX86ShufflesConstants(
-          Ops, Mask, Root, HasVariableMask, DAG, Subtarget))
-    return Cst;
-
-  // Canonicalize the combined shuffle mask chain with horizontal ops.
-  // NOTE: This will update the Ops and Mask.
-  if (SDValue HOp = canonicalizeShuffleMaskWithHorizOp(
-          Ops, Mask, RootSizeInBits, SDLoc(Root), DAG, Subtarget))
-    return DAG.getBitcast(Root.getValueType(), HOp);
-
   // We can only combine unary and binary shuffle mask cases.
   if (Ops.size() <= 2) {
     // Minor canonicalization of the accumulated shuffle mask to make it easier
 
diff --git a/llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll b/llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll
index 2c53579f7627..c358250305a7 100644
--- a/llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll
+++ b/llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll
@@ -108,13 +108,12 @@ define void @PR46178(i16* %0) {
 ; X86-NEXT:    vmovdqu (%eax), %ymm1
 ; X86-NEXT:    vpmovqw %ymm0, %xmm0
 ; X86-NEXT:    vpmovqw %ymm1, %xmm1
-; X86-NEXT:    vinserti128 $1, %xmm1, %ymm0, %ymm0
-; X86-NEXT:    vpsllw $8, %ymm0, %ymm0
-; X86-NEXT:    vpsraw $8, %ymm0, %ymm0
-; X86-NEXT:    vmovapd {{.*#+}} ymm1 = [0,0,2,0,4,0,4,0]
-; X86-NEXT:    vxorpd %xmm2, %xmm2, %xmm2
-; X86-NEXT:    vpermi2pd %ymm2, %ymm0, %ymm1
-; X86-NEXT:    vmovupd %ymm1, (%eax)
+; X86-NEXT:    vpsllw $8, %xmm1, %xmm1
+; X86-NEXT:    vpsraw $8, %xmm1, %xmm1
+; X86-NEXT:    vpsllw $8, %xmm0, %xmm0
+; X86-NEXT:    vpsraw $8, %xmm0, %xmm0
+; X86-NEXT:    vshufpd {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[2],ymm1[3]
+; X86-NEXT:    vmovupd %ymm0, (%eax)
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
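To make the reordering concrete, here is a minimal standalone sketch of the "fold first, widen only if folding fails" control flow the patch establishes. It uses toy types rather than LLVM's SelectionDAG API: Vec, Undef, RootElts, constantFold() and widenInputs() are all hypothetical names invented for this illustration.

    // Toy model (not LLVM code) of the reordered combine pipeline:
    // attempt constant folding over the still-narrow inputs first, and
    // only widen to the root width when the fold attempt fails.
    #include <cstdio>
    #include <optional>
    #include <vector>
    
    using Vec = std::vector<int>; // a "vector register" of constant lanes
    constexpr int Undef = -1;     // stand-in for an undef element
    
    // Mask index M selects Ops[M / RootElts][M % RootElts]; indexing past
    // the end of a narrower (not-yet-widened) input yields Undef, loosely
    // mirroring how the fold tolerates inputs of different widths. In the
    // real code folding fails when an input isn't constant; this toy only
    // models the all-constant case, so it always succeeds.
    static std::optional<Vec> constantFold(const std::vector<Vec> &Ops,
                                           const std::vector<int> &Mask,
                                           size_t RootElts) {
      Vec Folded;
      for (int M : Mask) {
        if (M < 0) {
          Folded.push_back(Undef);
          continue;
        }
        const Vec &Op = Ops[M / RootElts];
        size_t Elt = M % RootElts;
        Folded.push_back(Elt < Op.size() ? Op[Elt] : Undef);
      }
      return Folded;
    }
    
    // The widening step that now runs *after* the fold attempt: pad each
    // narrower input up to the root width with undef lanes.
    static void widenInputs(std::vector<Vec> &Ops, size_t RootElts) {
      for (Vec &Op : Ops)
        if (Op.size() < RootElts)
          Op.resize(RootElts, Undef);
    }
    
    int main() {
      const size_t RootElts = 4;
      // Two half-width constant inputs feeding a root-width shuffle.
      std::vector<Vec> Ops = {{10, 11}, {20, 21}};
      std::vector<int> Mask = {0, 4, 1, 5}; // interleave the two inputs
    
      // Fold before widening: on success, no widened inputs are created.
      if (auto Folded = constantFold(Ops, Mask, RootElts)) {
        for (int V : *Folded)
          std::printf("%d ", V);
        std::printf("\n"); // prints: 10 20 11 21
        return 0;
      }
      widenInputs(Ops, RootElts); // only reached if folding failed
      return 0;
    }

The design point mirrors the LOG message: widening materializes new nodes to pad each input up to the root width, so running it only on the failure path avoids creating nodes that a successful fold would immediately make dead.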