[llvm-branch-commits] [llvm] dda6003 - [AArch64] Attempt to sink mul operands
Author: Nicholas Guy
Date: 2021-01-13T15:23:36Z
New Revision: dda60035e9f0769c8907cdf6561489e0435c2275

URL: https://github.com/llvm/llvm-project/commit/dda60035e9f0769c8907cdf6561489e0435c2275
DIFF: https://github.com/llvm/llvm-project/commit/dda60035e9f0769c8907cdf6561489e0435c2275.diff

LOG: [AArch64] Attempt to sink mul operands

Following on from D91255, this patch is responsible for sinking relevant mul
operands to the same block so that umull/smull instructions can be correctly
generated by the mul combine implemented in the aforementioned patch.

Differential revision: https://reviews.llvm.org/D91271

Added:
    llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll

Modified:
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index b500cd534a1f..082fdf390786 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -10956,6 +10956,43 @@ bool AArch64TargetLowering::shouldSinkOperands(
     return true;
   }
+  case Instruction::Mul: {
+    bool IsProfitable = false;
+    for (auto &Op : I->operands()) {
+      // Make sure we are not already sinking this operand
+      if (any_of(Ops, [&](Use *U) { return U->get() == Op; }))
+        continue;
+
+      ShuffleVectorInst *Shuffle = dyn_cast<ShuffleVectorInst>(Op);
+      if (!Shuffle || !Shuffle->isZeroEltSplat())
+        continue;
+
+      Value *ShuffleOperand = Shuffle->getOperand(0);
+      InsertElementInst *Insert = dyn_cast<InsertElementInst>(ShuffleOperand);
+      if (!Insert)
+        continue;
+
+      Instruction *OperandInstr = dyn_cast<Instruction>(Insert->getOperand(1));
+      if (!OperandInstr)
+        continue;
+
+      ConstantInt *ElementConstant =
+          dyn_cast<ConstantInt>(Insert->getOperand(2));
+      // Check that the insertelement is inserting into element 0
+      if (!ElementConstant || ElementConstant->getZExtValue() != 0)
+        continue;
+
+      unsigned Opcode = OperandInstr->getOpcode();
+      if (Opcode != Instruction::SExt && Opcode != Instruction::ZExt)
+        continue;
+
+      Ops.push_back(&Shuffle->getOperandUse(0));
+      Ops.push_back(&Op);
+      IsProfitable = true;
+    }
+
+    return IsProfitable;
+  }
   default:
     return false;
   }

diff --git a/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
new file mode 100644
index ..966cf7b46daa
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
@@ -0,0 +1,186 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-none-linux-gnu < %s -o - | FileCheck %s
+
+define void @matrix_mul_unsigned(i32 %N, i32* nocapture %C, i16* nocapture readonly %A, i16 %val) {
+; CHECK-LABEL: matrix_mul_unsigned:
+; CHECK:       // %bb.0: // %vector.header
+; CHECK-NEXT:    and w9, w3, #0x
+; CHECK-NEXT:    // kill: def $w0 killed $w0 def $x0
+; CHECK-NEXT:    and x8, x0, #0xfff8
+; CHECK-NEXT:    dup v0.4h, w9
+; CHECK-NEXT:  .LBB0_1: // %vector.body
+; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    add x9, x2, w0, uxtw #1
+; CHECK-NEXT:    ldp d1, d2, [x9]
+; CHECK-NEXT:    add x9, x1, w0, uxtw #2
+; CHECK-NEXT:    subs x8, x8, #8 // =8
+; CHECK-NEXT:    add w0, w0, #8 // =8
+; CHECK-NEXT:    umull v1.4s, v0.4h, v1.4h
+; CHECK-NEXT:    umull v2.4s, v0.4h, v2.4h
+; CHECK-NEXT:    stp q1, q2, [x9]
+; CHECK-NEXT:    b.ne .LBB0_1
+; CHECK-NEXT:  // %bb.2: // %for.end12
+; CHECK-NEXT:    ret
+vector.header:
+  %conv4 = zext i16 %val to i32
+  %wide.trip.count = zext i32 %N to i64
+  %0 = add nsw i64 %wide.trip.count, -1
+  %min.iters.check = icmp ult i32 %N, 8
+  %1 = trunc i64 %0 to i32
+  %2 = icmp ugt i64 %0, 4294967295
+  %n.vec = and i64 %wide.trip.count, 4294967288
+  %broadcast.splatinsert = insertelement <4 x i32> undef, i32 %conv4, i32 0
+  %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer
+  %broadcast.splatinsert31 = insertelement <4 x i32> undef, i32 %conv4, i32 0
+  %broadcast.splat32 = shufflevector <4 x i32> %broadcast.splatinsert31, <4 x i32> undef, <4 x i32> zeroinitializer
+  %cmp.n = icmp eq i64 %n.vec, %wide.trip.count
+  br label %vector.body
+
+vector.body: ; preds = %vector.header, %vector.body
+  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.header ]
+  %3 = trunc i64 %index to i32
+  %4 = add i32 %N, %3
+  %5 = zext i32 %4 to i64
+  %6 = getelementptr inbounds i16, i16* %A, i64 %5
+  %7 = bitcast i16* %6 to <4 x i16>*
+  %wide.load = load <4 x i16>, <4 x i16>* %7, align 2
+  %8 = getelementptr inbounds i16, i16* %6, i64 4
+  %9 = bitcast i16* %8 to <4 x i16>*
+  %wide.load30 = load <4 x i16>, <4 x i16>* %9, align 2
+  %10 = zext <4 x i16> %wide.load to <4 x i32>
[llvm-branch-commits] [llvm] f5fcbe4 - [AArch64] Further restricts when a dup(*ext) can be rearranged
Author: Nicholas Guy
Date: 2021-01-18T16:00:21Z
New Revision: f5fcbe4e3c68584ef4858590a079f17593feabbd

URL: https://github.com/llvm/llvm-project/commit/f5fcbe4e3c68584ef4858590a079f17593feabbd
DIFF: https://github.com/llvm/llvm-project/commit/f5fcbe4e3c68584ef4858590a079f17593feabbd.diff

LOG: [AArch64] Further restricts when a dup(*ext) can be rearranged

In most cases, the dup(*ext) pattern can be rearranged to perform the
extension on the vector side, allowing for further vector-specific
optimisations to be made. However the initial checks for this conversion
were insufficient, allowing invalid encodings to be attempted (causing
compilation to fail).

Differential Revision: https://reviews.llvm.org/D94778

Added:
    llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll

Modified:
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6e4ac0f711dd..39c40ef0b36d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -11843,7 +11843,8 @@ static SDValue performCommonVectorExtendCombine(SDValue VectorShuffle,
   SDValue InsertVectorNode = DAG.getNode(
       InsertVectorElt.getOpcode(), DL, PreExtendVT, DAG.getUNDEF(PreExtendVT),
-      Extend.getOperand(0), DAG.getConstant(0, DL, MVT::i64));
+      DAG.getAnyExtOrTrunc(Extend.getOperand(0), DL, PreExtendType),
+      DAG.getConstant(0, DL, MVT::i64));

   std::vector<int> ShuffleMask(TargetType.getVectorElementCount().getValue());
@@ -11851,9 +11852,8 @@ static SDValue performCommonVectorExtendCombine(SDValue VectorShuffle,
       DAG.getVectorShuffle(PreExtendVT, DL, InsertVectorNode,
                            DAG.getUNDEF(PreExtendVT), ShuffleMask);

-  SDValue ExtendNode =
-      DAG.getNode(IsSExt ? ISD::SIGN_EXTEND : ISD::ZERO_EXTEND, DL, TargetType,
-                  VectorShuffleNode, DAG.getValueType(TargetType));
+  SDValue ExtendNode = DAG.getNode(IsSExt ? ISD::SIGN_EXTEND : ISD::ZERO_EXTEND,
+                                   DL, TargetType, VectorShuffleNode);

   return ExtendNode;
 }

diff --git a/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll
new file mode 100644
index ..51f91aa1b940
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -o - | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
+; This test covers a case where an AArch64 DUP instruction is generated with an
+; invalid encoding, resulting in a crash. We don't care about the specific output
+; here, only that this case no longer causes said crash.
+define dso_local i32 @dupext_crashtest(i32 %e) local_unnamed_addr {
+; CHECK-LABEL: dupext_crashtest:
+for.body.lr.ph:
+  %conv314 = zext i32 %e to i64
+  br label %vector.memcheck
+
+vector.memcheck: ; preds = %for.body.lr.ph
+  br label %vector.ph
+
+vector.ph: ; preds = %vector.memcheck
+  %broadcast.splatinsert = insertelement <2 x i64> poison, i64 %conv314, i32 0
+  %broadcast.splat = shufflevector <2 x i64> %broadcast.splatinsert, <2 x i64> poison, <2 x i32> zeroinitializer
+  br label %vector.body
+
+vector.body: ; preds = %vector.body, %vector.ph
+  %wide.load = load <2 x i32>, <2 x i32>* undef, align 4
+  %0 = zext <2 x i32> %wide.load to <2 x i64>
+  %1 = mul nuw <2 x i64> %broadcast.splat, %0
+  %2 = trunc <2 x i64> %1 to <2 x i32>
+  %3 = select <2 x i1> undef, <2 x i32> undef, <2 x i32> %2
+  %4 = bitcast i32* undef to <2 x i32>*
+  store <2 x i32> %3, <2 x i32>* %4, align 4
+  br label %vector.body
+}

_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 16bf02c - Reland "[AArch64] Attempt to sink mul operands"
Author: Nicholas Guy
Date: 2021-01-18T16:00:22Z
New Revision: 16bf02c3a19d4e1f4a19cb243de612e17f54f5a9

URL: https://github.com/llvm/llvm-project/commit/16bf02c3a19d4e1f4a19cb243de612e17f54f5a9
DIFF: https://github.com/llvm/llvm-project/commit/16bf02c3a19d4e1f4a19cb243de612e17f54f5a9.diff

LOG: Reland "[AArch64] Attempt to sink mul operands"

This relands dda60035e9f0769c8907cdf6561489e0435c2275, which was reverted
by dbaa6a1858a42f72b683f700d3bd7a9632f7a518

Added:
    llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll

Modified:
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 39c40ef0b36d6..cc64e0e03ad88 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -10965,6 +10965,43 @@ bool AArch64TargetLowering::shouldSinkOperands(
     return true;
   }
+  case Instruction::Mul: {
+    bool IsProfitable = false;
+    for (auto &Op : I->operands()) {
+      // Make sure we are not already sinking this operand
+      if (any_of(Ops, [&](Use *U) { return U->get() == Op; }))
+        continue;
+
+      ShuffleVectorInst *Shuffle = dyn_cast<ShuffleVectorInst>(Op);
+      if (!Shuffle || !Shuffle->isZeroEltSplat())
+        continue;
+
+      Value *ShuffleOperand = Shuffle->getOperand(0);
+      InsertElementInst *Insert = dyn_cast<InsertElementInst>(ShuffleOperand);
+      if (!Insert)
+        continue;
+
+      Instruction *OperandInstr = dyn_cast<Instruction>(Insert->getOperand(1));
+      if (!OperandInstr)
+        continue;
+
+      ConstantInt *ElementConstant =
+          dyn_cast<ConstantInt>(Insert->getOperand(2));
+      // Check that the insertelement is inserting into element 0
+      if (!ElementConstant || ElementConstant->getZExtValue() != 0)
+        continue;
+
+      unsigned Opcode = OperandInstr->getOpcode();
+      if (Opcode != Instruction::SExt && Opcode != Instruction::ZExt)
+        continue;
+
+      Ops.push_back(&Shuffle->getOperandUse(0));
+      Ops.push_back(&Op);
+      IsProfitable = true;
+    }
+
+    return IsProfitable;
+  }
   default:
     return false;
   }

diff --git a/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
new file mode 100644
index 0..966cf7b46daa5
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
@@ -0,0 +1,186 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-none-linux-gnu < %s -o - | FileCheck %s
+
+define void @matrix_mul_unsigned(i32 %N, i32* nocapture %C, i16* nocapture readonly %A, i16 %val) {
+; CHECK-LABEL: matrix_mul_unsigned:
+; CHECK:       // %bb.0: // %vector.header
+; CHECK-NEXT:    and w9, w3, #0x
+; CHECK-NEXT:    // kill: def $w0 killed $w0 def $x0
+; CHECK-NEXT:    and x8, x0, #0xfff8
+; CHECK-NEXT:    dup v0.4h, w9
+; CHECK-NEXT:  .LBB0_1: // %vector.body
+; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    add x9, x2, w0, uxtw #1
+; CHECK-NEXT:    ldp d1, d2, [x9]
+; CHECK-NEXT:    add x9, x1, w0, uxtw #2
+; CHECK-NEXT:    subs x8, x8, #8 // =8
+; CHECK-NEXT:    add w0, w0, #8 // =8
+; CHECK-NEXT:    umull v1.4s, v0.4h, v1.4h
+; CHECK-NEXT:    umull v2.4s, v0.4h, v2.4h
+; CHECK-NEXT:    stp q1, q2, [x9]
+; CHECK-NEXT:    b.ne .LBB0_1
+; CHECK-NEXT:  // %bb.2: // %for.end12
+; CHECK-NEXT:    ret
+vector.header:
+  %conv4 = zext i16 %val to i32
+  %wide.trip.count = zext i32 %N to i64
+  %0 = add nsw i64 %wide.trip.count, -1
+  %min.iters.check = icmp ult i32 %N, 8
+  %1 = trunc i64 %0 to i32
+  %2 = icmp ugt i64 %0, 4294967295
+  %n.vec = and i64 %wide.trip.count, 4294967288
+  %broadcast.splatinsert = insertelement <4 x i32> undef, i32 %conv4, i32 0
+  %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer
+  %broadcast.splatinsert31 = insertelement <4 x i32> undef, i32 %conv4, i32 0
+  %broadcast.splat32 = shufflevector <4 x i32> %broadcast.splatinsert31, <4 x i32> undef, <4 x i32> zeroinitializer
+  %cmp.n = icmp eq i64 %n.vec, %wide.trip.count
+  br label %vector.body
+
+vector.body: ; preds = %vector.header, %vector.body
+  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.header ]
+  %3 = trunc i64 %index to i32
+  %4 = add i32 %N, %3
+  %5 = zext i32 %4 to i64
+  %6 = getelementptr inbounds i16, i16* %A, i64 %5
+  %7 = bitcast i16* %6 to <4 x i16>*
+  %wide.load = load <4 x i16>, <4 x i16>* %7, align 2
+  %8 = getelementptr inbounds i16, i16* %6, i64 4
+  %9 = bitcast i16* %8 to <4 x i16>*
+  %wide.load30 = load <4 x i16>, <4 x i16>* %9, align 2
+  %10 = zext <4 x i16> %wide.load to <4 x i32>
+  %11 = zext <4 x i16> %wide.load30 to <4 x i32>
+  %12 = mul nuw nsw <4 x i32> %broadcast.splat, %10
+  %13 = mul
[llvm-branch-commits] [llvm] 350247a - [AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup))
Author: Nicholas Guy
Date: 2021-01-06T16:02:16Z
New Revision: 350247a93c07906300b79955ff882004a92ae368

URL: https://github.com/llvm/llvm-project/commit/350247a93c07906300b79955ff882004a92ae368
DIFF: https://github.com/llvm/llvm-project/commit/350247a93c07906300b79955ff882004a92ae368.diff

LOG: [AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup))

Performing this rearrangement allows for existing patterns to match cases
where the vector may be built after an extend, instead of before.

Differential Revision: https://reviews.llvm.org/D91255

Added:
    llvm/test/CodeGen/AArch64/aarch64-dup-ext-scalable.ll
    llvm/test/CodeGen/AArch64/aarch64-dup-ext.ll

Modified:
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 41dc285a368d..40435c12ca3b 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -11705,9 +11705,152 @@ static bool IsSVECntIntrinsic(SDValue S) {
   return false;
 }

+/// Calculates what the pre-extend type is, based on the extension
+/// operation node provided by \p Extend.
+///
+/// In the case that \p Extend is a SIGN_EXTEND or a ZERO_EXTEND, the
+/// pre-extend type is pulled directly from the operand, while other extend
+/// operations need a bit more inspection to get this information.
+///
+/// \param Extend The SDNode from the DAG that represents the extend operation
+/// \param DAG The SelectionDAG hosting the \p Extend node
+///
+/// \returns The type representing the \p Extend source type, or \p MVT::Other
+/// if no valid type can be determined
+static EVT calculatePreExtendType(SDValue Extend, SelectionDAG &DAG) {
+  switch (Extend.getOpcode()) {
+  case ISD::SIGN_EXTEND:
+  case ISD::ZERO_EXTEND:
+    return Extend.getOperand(0).getValueType();
+  case ISD::AssertSext:
+  case ISD::AssertZext:
+  case ISD::SIGN_EXTEND_INREG: {
+    VTSDNode *TypeNode = dyn_cast<VTSDNode>(Extend.getOperand(1));
+    if (!TypeNode)
+      return MVT::Other;
+    return TypeNode->getVT();
+  }
+  case ISD::AND: {
+    ConstantSDNode *Constant =
+        dyn_cast<ConstantSDNode>(Extend.getOperand(1).getNode());
+    if (!Constant)
+      return MVT::Other;
+
+    uint32_t Mask = Constant->getZExtValue();
+
+    if (Mask == UCHAR_MAX)
+      return MVT::i8;
+    else if (Mask == USHRT_MAX)
+      return MVT::i16;
+    else if (Mask == UINT_MAX)
+      return MVT::i32;
+
+    return MVT::Other;
+  }
+  default:
+    return MVT::Other;
+  }
+
+  llvm_unreachable("Code path unhandled in calculatePreExtendType!");
+}
+
+/// Combines a dup(sext/zext) node pattern into sext/zext(dup)
+/// making use of the vector SExt/ZExt rather than the scalar SExt/ZExt
+static SDValue performCommonVectorExtendCombine(SDValue VectorShuffle,
+                                                SelectionDAG &DAG) {
+
+  ShuffleVectorSDNode *ShuffleNode =
+      dyn_cast<ShuffleVectorSDNode>(VectorShuffle.getNode());
+  if (!ShuffleNode)
+    return SDValue();
+
+  // Ensuring the mask is zero before continuing
+  if (!ShuffleNode->isSplat() || ShuffleNode->getSplatIndex() != 0)
+    return SDValue();
+
+  SDValue InsertVectorElt = VectorShuffle.getOperand(0);
+
+  if (InsertVectorElt.getOpcode() != ISD::INSERT_VECTOR_ELT)
+    return SDValue();
+
+  SDValue InsertLane = InsertVectorElt.getOperand(2);
+  ConstantSDNode *Constant = dyn_cast<ConstantSDNode>(InsertLane.getNode());
+  // Ensures the insert is inserting into lane 0
+  if (!Constant || Constant->getZExtValue() != 0)
+    return SDValue();
+
+  SDValue Extend = InsertVectorElt.getOperand(1);
+  unsigned ExtendOpcode = Extend.getOpcode();
+
+  bool IsSExt = ExtendOpcode == ISD::SIGN_EXTEND ||
+                ExtendOpcode == ISD::SIGN_EXTEND_INREG ||
+                ExtendOpcode == ISD::AssertSext;
+  if (!IsSExt && ExtendOpcode != ISD::ZERO_EXTEND &&
+      ExtendOpcode != ISD::AssertZext && ExtendOpcode != ISD::AND)
+    return SDValue();
+
+  EVT TargetType = VectorShuffle.getValueType();
+  EVT PreExtendType = calculatePreExtendType(Extend, DAG);
+
+  if ((TargetType != MVT::v8i16 && TargetType != MVT::v4i32 &&
+       TargetType != MVT::v2i64) ||
+      (PreExtendType == MVT::Other))
+    return SDValue();
+
+  EVT PreExtendVT = TargetType.changeVectorElementType(PreExtendType);
+
+  if (PreExtendVT.getVectorElementCount() != TargetType.getVectorElementCount())
+    return SDValue();
+
+  if (TargetType.getScalarSizeInBits() != PreExtendVT.getScalarSizeInBits() * 2)
+    return SDValue();
+
+  SDLoc DL(VectorShuffle);
+
+  SDValue InsertVectorNode = DAG.getNode(
+      InsertVectorElt.getOpcode(), DL, PreExtendVT, DAG.getUNDEF(PreExtendVT),
+      Extend.getOperand(0), DAG.getConstant(0, DL, MVT::i64));
+
+  std::vector<int> ShuffleMask(TargetType.getVectorElementCount().getValue());
+
+  SDValue VectorShuffleNode =
+      DAG.getVector
[llvm-branch-commits] [llvm] ed23229 - [AArch64] Fix crash caused by invalid vector element type
Author: Nicholas Guy
Date: 2021-01-08T12:02:54Z
New Revision: ed23229a64aed5b9d6120d57138d475291ca3667

URL: https://github.com/llvm/llvm-project/commit/ed23229a64aed5b9d6120d57138d475291ca3667
DIFF: https://github.com/llvm/llvm-project/commit/ed23229a64aed5b9d6120d57138d475291ca3667.diff

LOG: [AArch64] Fix crash caused by invalid vector element type

Fixes a crash caused by D91255, when LLVMTy is null when calling
changeExtendedVectorElementType.

Differential Revision: https://reviews.llvm.org/D94234

Added:
    llvm/test/CodeGen/AArch64/aarch64-dup-ext-vectortype-crash.ll

Modified:
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 926d952425d0..80a203b9e7ef 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -11810,6 +11810,11 @@ static SDValue performCommonVectorExtendCombine(SDValue VectorShuffle,
       (PreExtendType == MVT::Other))
     return SDValue();

+  // Restrict valid pre-extend data type
+  if (PreExtendType != MVT::i8 && PreExtendType != MVT::i16 &&
+      PreExtendType != MVT::i32)
+    return SDValue();
+
   EVT PreExtendVT = TargetType.changeVectorElementType(PreExtendType);

   if (PreExtendVT.getVectorElementCount() != TargetType.getVectorElementCount())

diff --git a/llvm/test/CodeGen/AArch64/aarch64-dup-ext-vectortype-crash.ll b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-vectortype-crash.ll
new file mode 100644
index ..995d9a19e543
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-vectortype-crash.ll
@@ -0,0 +1,16 @@
+; RUN: llc < %s -mtriple aarch64-none-linux-gnu | FileCheck %s
+
+; This test covers a case where extended value types can't be converted to
+; vector types, resulting in a crash. We don't care about the specific output
+; here, only that this case no longer causes said crash.
+; See https://reviews.llvm.org/D91255#2484399 for context
+define <8 x i16> @extend_i7_v8i16(i7 %src, <8 x i8> %b) {
+; CHECK-LABEL: extend_i7_v8i16:
+entry:
+%in = sext i7 %src to i16
+%ext.b = sext <8 x i8> %b to <8 x i16>
+%broadcast.splatinsert = insertelement <8 x i16> undef, i16 %in, i16 0
+%broadcast.splat = shufflevector <8 x i16> %broadcast.splatinsert, <8 x i16> undef, <8 x i32> zeroinitializer
+%out = mul nsw <8 x i16> %broadcast.splat, %ext.b
+ret <8 x i16> %out
+}
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -5026,10 +5026,23 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef<ElementCount> VFs,
         // even in the scalar case.
         RegUsage[ClassID] += 1;
       } else {
+        // The output from scaled phis and scaled reductions actually have
+        // fewer lanes than the VF.
+        auto VF = VFs[J];
+        if (auto *ReductionR = dyn_cast<VPReductionPHIRecipe>(R))
+          VF = VF.divideCoefficientBy(ReductionR->getVFScaleFactor());
+        else if (auto *PartialReductionR =
+                     dyn_cast<VPPartialReductionRecipe>(R))
+          VF = VF.divideCoefficientBy(PartialReductionR->getScaleFactor());
+        if (VF != VFs[J])

NickGuy-Arm wrote:

Nit: If the condition is only used for debug output, can it be moved inside the LLVM_DEBUG?

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
https://github.com/NickGuy-Arm commented:

Looks generally good to me so far, with a few nitpicks.

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
NickGuy-Arm wrote:

Could you pre-commit this test, so we can see how the output changes before and after the changes in LoopVectorize.cpp?

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
https://github.com/NickGuy-Arm edited
https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -2031,17 +2033,19 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
 /// scalar value.
 class VPPartialReductionRecipe : public VPSingleDefRecipe {
   unsigned Opcode;
+  unsigned ScaleFactor;

NickGuy-Arm wrote:

Nit: Could this be `VFScaleFactor` to match the equivalent in `VPReductionPHIRecipe`?

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -5026,10 +5026,23 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef<ElementCount> VFs,
         // even in the scalar case.
         RegUsage[ClassID] += 1;
       } else {
+        // The output from scaled phis and scaled reductions actually have
+        // fewer lanes than the VF.
+        auto VF = VFs[J];
+        if (auto *ReductionR = dyn_cast<VPReductionPHIRecipe>(R))

NickGuy-Arm wrote:

[Idle thought, feel free to ignore] I wonder if there's precedent to add a `getVFScaleFactor` or equivalent to the base recipe class (or one of the other subclasses), and allow any recipe to override it instead of explicitly checking for every type that could scale the VF. Likely not yet, and almost certainly not in this patch, but maybe something to consider in the future?

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] release/20.x: [LV] Fix crash when building partial reductions using types that aren't known scale factors (#136680) (PR #136863)
NickGuy-Arm wrote:

I can verify that updating the test files doesn't impact the test itself. Looks to be some instruction reordering but no change to the functionality being tested, and this test passes on main without any further changes.

How do we go about updating the test on this branch, as I assume we don't have commit access to llvmbot's fork?

https://github.com/llvm/llvm-project/pull/136863