[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

I think this should be merged into the release branch, as the __arm_sc_memset 
routine it fixes cannot be run in streaming mode without this change.

https://github.com/llvm/llvm-project/pull/101938
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/101938
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80844 (PR #80846)

2024-02-07 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.

LGTM. This fix is a low-risk change which I think should be included in the 
release branch.

https://github.com/llvm/llvm-project/pull/80846
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

> @kmclaughlin-arm What do you think about merging this PR to the release 
> branch?

I think this should be merged into the release branch, as it fixes incorrect 
inlining of `__arm_locally_streaming` functions.

https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80441 (PR #80444)

2024-02-02 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

> @kmclaughlin-arm What do you think about merging this PR to the release 
> branch?

LGTM. This is a low-risk change which I think should be included in the release 
branch.

https://github.com/llvm/llvm-project/pull/80444
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80441 (PR #80444)

2024-02-02 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/80444
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 2170e0e - [SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates)

2021-01-13 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2021-01-13T12:24:54Z
New Revision: 2170e0ee60db638175a8c57230d46fbaafa06d4c

URL: 
https://github.com/llvm/llvm-project/commit/2170e0ee60db638175a8c57230d46fbaafa06d4c
DIFF: 
https://github.com/llvm/llvm-project/commit/2170e0ee60db638175a8c57230d46fbaafa06d4c.diff

LOG: [SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates)

Canonicalise the following operations in getNode() for predicate types:
 - CTLZ(Pred)  -> bitwise_NOT(Pred)
 - CTTZ(Pred)  -> bitwise_NOT(Pred)
 - CTPOP(Pred) -> Pred
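
For a single predicate bit b the identities are immediate: popcount(b) = b, and
both the leading- and trailing-zero counts of a 1-bit value are 1 - b, i.e.
NOT(b). A standalone check of this reasoning (plain C++, for illustration only;
it is not part of the patch):

  #include <cassert>

  int main() {
    for (unsigned b = 0; b <= 1; ++b) {
      unsigned ctpop = b;        // set bits in a 1-bit value
      unsigned ctlz = 1u - b;    // leading zeros of a 1-bit value
      unsigned cttz = 1u - b;    // trailing zeros of a 1-bit value
      assert(ctpop == b);        // CTPOP(Pred)  -> Pred
      assert(ctlz == (b ^ 1u));  // CTLZ(Pred)   -> bitwise_NOT(Pred)
      assert(cttz == (b ^ 1u));  // CTTZ(Pred)   -> bitwise_NOT(Pred)
    }
    return 0;
  }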

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D94428

Added: 
llvm/test/CodeGen/AArch64/sve-bit-counting-pred.ll

Modified: 
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index c4f6e89006c1..e080408bbe42 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -4796,6 +4796,15 @@ SDValue SelectionDAG::getNode(unsigned Opcode, const 
SDLoc &DL, EVT VT,
   case ISD::VSCALE:
 assert(VT == Operand.getValueType() && "Unexpected VT!");
 break;
+  case ISD::CTPOP:
+if (Operand.getValueType().getScalarType() == MVT::i1)
+  return Operand;
+break;
+  case ISD::CTLZ:
+  case ISD::CTTZ:
+if (Operand.getValueType().getScalarType() == MVT::i1)
+  return getNOT(DL, Operand, Operand.getValueType());
+break;
   case ISD::VECREDUCE_SMIN:
   case ISD::VECREDUCE_UMAX:
 if (Operand.getValueType().getScalarType() == MVT::i1)

diff  --git a/llvm/test/CodeGen/AArch64/sve-bit-counting-pred.ll 
b/llvm/test/CodeGen/AArch64/sve-bit-counting-pred.ll
new file mode 100644
index ..73c555d98943
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sve-bit-counting-pred.ll
@@ -0,0 +1,141 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
+; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
+
+; If this check fails please read test/CodeGen/AArch64/README for instructions 
on how to resolve it.
+; WARN-NOT: warning
+
+;
+; CTPOP
+;
+
+define <vscale x 16 x i1> @ctpop_nxv16i1(<vscale x 16 x i1> %a) {
+; CHECK-LABEL: ctpop_nxv16i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ret
+  %res = call <vscale x 16 x i1> @llvm.ctpop.nxv16i1(<vscale x 16 x i1> %a)
+  ret <vscale x 16 x i1> %res
+}
+
+define <vscale x 8 x i1> @ctpop_nxv8i1(<vscale x 8 x i1> %a) {
+; CHECK-LABEL: ctpop_nxv8i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ret
+  %res = call <vscale x 8 x i1> @llvm.ctpop.nxv8i1(<vscale x 8 x i1> %a)
+  ret <vscale x 8 x i1> %res
+}
+
+define <vscale x 4 x i1> @ctpop_nxv4i1(<vscale x 4 x i1> %a) {
+; CHECK-LABEL: ctpop_nxv4i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ret
+  %res = call <vscale x 4 x i1> @llvm.ctpop.nxv4i1(<vscale x 4 x i1> %a)
+  ret <vscale x 4 x i1> %res
+}
+
+define <vscale x 2 x i1> @ctpop_nxv2i1(<vscale x 2 x i1> %a) {
+; CHECK-LABEL: ctpop_nxv2i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ret
+  %res = call <vscale x 2 x i1> @llvm.ctpop.nxv2i1(<vscale x 2 x i1> %a)
+  ret <vscale x 2 x i1> %res
+}
+
+; CTLZ
+
+define <vscale x 16 x i1> @ctlz_nxv16i1(<vscale x 16 x i1> %a) {
+; CHECK-LABEL: ctlz_nxv16i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.b
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 16 x i1> @llvm.ctlz.nxv16i1(<vscale x 16 x i1> %a)
+  ret <vscale x 16 x i1> %res
+}
+
+define <vscale x 8 x i1> @ctlz_nxv8i1(<vscale x 8 x i1> %a) {
+; CHECK-LABEL: ctlz_nxv8i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.h
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 8 x i1> @llvm.ctlz.nxv8i1(<vscale x 8 x i1> %a)
+  ret <vscale x 8 x i1> %res
+}
+
+define <vscale x 4 x i1> @ctlz_nxv4i1(<vscale x 4 x i1> %a) {
+; CHECK-LABEL: ctlz_nxv4i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.s
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 4 x i1> @llvm.ctlz.nxv4i1(<vscale x 4 x i1> %a)
+  ret <vscale x 4 x i1> %res
+}
+
+define <vscale x 2 x i1> @ctlz_nxv2i1(<vscale x 2 x i1> %a) {
+; CHECK-LABEL: ctlz_nxv2i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.d
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 2 x i1> @llvm.ctlz.nxv2i1(<vscale x 2 x i1> %a)
+  ret <vscale x 2 x i1> %res
+}
+
+; CTTZ
+
+define <vscale x 16 x i1> @cttz_nxv16i1(<vscale x 16 x i1> %a) {
+; CHECK-LABEL: cttz_nxv16i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.b
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 16 x i1> @llvm.cttz.nxv16i1(<vscale x 16 x i1> %a)
+  ret <vscale x 16 x i1> %res
+}
+
+define <vscale x 8 x i1> @cttz_nxv8i1(<vscale x 8 x i1> %a) {
+; CHECK-LABEL: cttz_nxv8i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.h
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 8 x i1> @llvm.cttz.nxv8i1(<vscale x 8 x i1> %a)
+  ret <vscale x 8 x i1> %res
+}
+
+define <vscale x 4 x i1> @cttz_nxv4i1(<vscale x 4 x i1> %a) {
+; CHECK-LABEL: cttz_nxv4i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.s
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 4 x i1> @llvm.cttz.nxv4i1(<vscale x 4 x i1> %a)
+  ret <vscale x 4 x i1> %res
+}
+
+define <vscale x 2 x i1> @cttz_nxv2i1(<vscale x 2 x i1> %a) {
+; CHECK-LABEL: cttz_nxv2i1:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ptrue p1.d
+; CHECK-NEXT:    not p0.b, p1/z, p0.b
+; CHECK-NEXT:    ret
+  %res = call <vscale x 2 x i1> @llvm.cttz.nxv2i1(<vscale x 2 x i1> %a)
+  ret <vscale x 2 x i1> %res
+}
+
+declare <vscale x 16 x i1> @llvm.ctpop.nxv16i1(<vscale x 16 x i1>)
+declare <vscale x 8 x i1> @llvm.ctpop.nxv8i1(<vscale x 8 x i1>)
+declare <vscale x 4 x i1> @llvm.ctpop.nxv4i1(<vscale x 4 x i1>)
+declare <vscale x 2 x i1> @llvm.ctpop.nxv2i1(<vscale x 2 x i1>)
+
+declare <vscale x 16 x i1> @llvm.ctlz.nx

[llvm-branch-commits] [llvm] c37f68a - [SVE][CodeGen] Fix legalisation of floating-point masked gathers

2021-01-11 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2021-01-11T10:57:46Z
New Revision: c37f68a8885cf55e9a6603613a918c4e7474e9af

URL: 
https://github.com/llvm/llvm-project/commit/c37f68a8885cf55e9a6603613a918c4e7474e9af
DIFF: 
https://github.com/llvm/llvm-project/commit/c37f68a8885cf55e9a6603613a918c4e7474e9af.diff

LOG: [SVE][CodeGen] Fix legalisation of floating-point masked gathers

Changes in this patch:
- When lowering floating-point masked gathers, cast the result of the
  gather back to the original type with reinterpret_cast before returning.
- Added patterns for reinterpret_casts from integer to floating point, and
  concat_vector patterns for bfloat16.
- Tests for various legalisation scenarios with floating point types.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D94171

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 1dff16234bbd..b4cb62cd5348 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1167,6 +1167,7 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
 }
 
 for (auto VT : {MVT::nxv2bf16, MVT::nxv4bf16, MVT::nxv8bf16}) {
+  setOperationAction(ISD::CONCAT_VECTORS, VT, Custom);
   setOperationAction(ISD::MGATHER, VT, Custom);
   setOperationAction(ISD::MSCATTER, VT, Custom);
 }
@@ -3990,7 +3991,6 @@ SDValue AArch64TargetLowering::LowerMGATHER(SDValue Op,
 
   // Handle FP data
   if (VT.isFloatingPoint()) {
-VT = VT.changeVectorElementTypeToInteger();
 ElementCount EC = VT.getVectorElementCount();
 auto ScalarIntVT =
 MVT::getIntegerVT(AArch64::SVEBitsPerBlock / EC.getKnownMinValue());
@@ -4013,7 +4013,14 @@ SDValue AArch64TargetLowering::LowerMGATHER(SDValue Op,
 Opcode = getSignExtendedGatherOpcode(Opcode);
 
   SDValue Ops[] = {Chain, Mask, BasePtr, Index, InputVT, PassThru};
-  return DAG.getNode(Opcode, DL, VTs, Ops);
+  SDValue Gather = DAG.getNode(Opcode, DL, VTs, Ops);
+
+  if (VT.isFloatingPoint()) {
+SDValue Cast = DAG.getNode(AArch64ISD::REINTERPRET_CAST, DL, VT, Gather);
+return DAG.getMergeValues({Cast, Gather}, DL);
+  }
+
+  return Gather;
 }
 
 SDValue AArch64TargetLowering::LowerMSCATTER(SDValue Op,

diff  --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td 
b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
index f5ccbee0f232..50368199effb 100644
--- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1183,6 +1183,10 @@ let Predicates = [HasSVE] in {
 (UZP1_ZZZ_H $v1, $v2)>;
   def : Pat<(nxv4f32 (concat_vectors nxv2f32:$v1, nxv2f32:$v2)),
 (UZP1_ZZZ_S $v1, $v2)>;
+  def : Pat<(nxv4bf16 (concat_vectors nxv2bf16:$v1, nxv2bf16:$v2)),
+(UZP1_ZZZ_S $v1, $v2)>;
+  def : Pat<(nxv8bf16 (concat_vectors nxv4bf16:$v1, nxv4bf16:$v2)),
+(UZP1_ZZZ_H $v1, $v2)>;
 
   defm CMPHS_PPzZZ : sve_int_cmp_0<0b000, "cmphs", SETUGE, SETULE>;
   defm CMPHI_PPzZZ : sve_int_cmp_0<0b001, "cmphi", SETUGT, SETULT>;
@@ -1736,6 +1740,16 @@ let Predicates = [HasSVE] in {
   def : Pat<(nxv2i64 (reinterpret_cast (nxv2bf16 ZPR:$src))), 
(COPY_TO_REGCLASS ZPR:$src, ZPR)>;
   def : Pat<(nxv4i32 (reinterpret_cast (nxv4bf16 ZPR:$src))), 
(COPY_TO_REGCLASS ZPR:$src, ZPR)>;
 
+  def : Pat<(nxv2f16 (reinterpret_cast (nxv2i64 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv2f32 (reinterpret_cast (nxv2i64 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv2f64 (reinterpret_cast (nxv2i64 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv4f16 (reinterpret_cast (nxv4i32 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv4f32 (reinterpret_cast (nxv4i32 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv8f16 (reinterpret_cast (nxv8i16 ZPR:$src))), (COPY_TO_REGCLASS 
ZPR:$src, ZPR)>;
+  def : Pat<(nxv2bf16 (reinterpret_cast (nxv2i64 ZPR:$src))), 
(COPY_TO_REGCLASS ZPR:$src, ZPR)>;
+  def : Pat<(nxv4bf16 (reinterpret_cast (nxv4i32 ZPR:$src))), 
(COPY_TO_REGCLASS ZPR:$src, ZPR)>;
+  def : Pat<(nxv8bf16 (reinterpret_cast (nxv8i16 ZPR:$src))), 
(COPY_TO_REGCLASS ZPR:$src, ZPR)>;
+
   def : Pat<(nxv16i1 (and PPR:$Ps1, PPR:$Ps2)),
 (AND_PPzPP (PTRUE_B 31), PPR:$Ps1, PPR:$Ps2)>;
   def : Pat<(nxv8i1 (and PPR:$Ps1, PPR:$Ps2)),

diff  --git a/llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll
index 6b1dc031dbb2..e71ec8178034 100644
--- a/llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll
+++ b/llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll
@@ -71,6 +

[llvm-branch-commits] [llvm] 6d2a789 - [SVE][CodeGen] Add bfloat16 support to scalable masked gather

2020-12-17 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-17T11:08:15Z
New Revision: 6d2a78996bee74611dad55b6c42b828ce1ee0953

URL: 
https://github.com/llvm/llvm-project/commit/6d2a78996bee74611dad55b6c42b828ce1ee0953
DIFF: 
https://github.com/llvm/llvm-project/commit/6d2a78996bee74611dad55b6c42b828ce1ee0953.diff

LOG: [SVE][CodeGen] Add bfloat16 support to scalable masked gather

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D93307

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e4d1b514b776..9eeacc8df0bf 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1151,8 +1151,10 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
   setOperationAction(ISD::VECREDUCE_SEQ_FADD, VT, Custom);
 }
 
-for (auto VT : {MVT::nxv2bf16, MVT::nxv4bf16, MVT::nxv8bf16})
+for (auto VT : {MVT::nxv2bf16, MVT::nxv4bf16, MVT::nxv8bf16}) {
+  setOperationAction(ISD::MGATHER, VT, Custom);
   setOperationAction(ISD::MSCATTER, VT, Custom);
+}
 
 setOperationAction(ISD::SPLAT_VECTOR, MVT::nxv8bf16, Custom);
 

diff  --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td 
b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
index adbace24ee6c..fbe24460d51f 100644
--- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -1196,6 +1196,10 @@ let Predicates = [HasSVE] in {
   (UUNPKLO_ZZ_D ZPR:$Zs)>;
 def : Pat<(nxv2bf16 (extract_subvector (nxv4bf16 ZPR:$Zs), (i64 2))),
   (UUNPKHI_ZZ_D ZPR:$Zs)>;
+def : Pat<(nxv4bf16 (extract_subvector (nxv8bf16 ZPR:$Zs), (i64 0))),
+  (UUNPKLO_ZZ_S ZPR:$Zs)>;
+def : Pat<(nxv4bf16 (extract_subvector (nxv8bf16 ZPR:$Zs), (i64 4))),
+  (UUNPKHI_ZZ_S ZPR:$Zs)>;
   }
 
   def : Pat<(nxv4f16 (extract_subvector (nxv8f16 ZPR:$Zs), (i64 0))),

diff  --git a/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
index e6b89b0070d6..25d0a471c29a 100644
--- a/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
+++ b/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
@@ -48,6 +48,16 @@ define <vscale x 2 x half> @masked_gather_nxv2f16(half* %base, <vscale x 2 x i32> %offsets, <vscale x 2 x i1> %mask) {
   ret <vscale x 2 x half> %vals
 }
 
+define <vscale x 2 x bfloat> @masked_gather_nxv2bf16(bfloat* %base, <vscale x 2 x i32> %offsets, <vscale x 2 x i1> %mask) #0 {
+; CHECK-LABEL: masked_gather_nxv2bf16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ld1h { z0.d }, p0/z, [x0, z0.d, sxtw #1]
+; CHECK-NEXT:    ret
+  %ptrs = getelementptr bfloat, bfloat* %base, <vscale x 2 x i32> %offsets
+  %vals = call <vscale x 2 x bfloat> @llvm.masked.gather.nxv2bf16(<vscale x 2 x bfloat*> %ptrs, i32 2, <vscale x 2 x i1> %mask, <vscale x 2 x bfloat> undef)
+  ret <vscale x 2 x bfloat> %vals
+}
+
 define <vscale x 2 x float> @masked_gather_nxv2f32(float* %base, <vscale x 2 x i32> %offsets, <vscale x 2 x i1> %mask) {
 ; CHECK-LABEL: masked_gather_nxv2f32:
 ; CHECK:       // %bb.0:
@@ -125,6 +135,16 @@ define <vscale x 4 x half> @masked_gather_nxv4f16(half* %base, <vscale x 4 x i32> %offsets, <vscale x 4 x i1> %mask) {
   ret <vscale x 4 x half> %vals
 }
 
+define <vscale x 4 x bfloat> @masked_gather_nxv4bf16(bfloat* %base, <vscale x 4 x i32> %offsets, <vscale x 4 x i1> %mask) #0 {
+; CHECK-LABEL: masked_gather_nxv4bf16:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1]
+; CHECK-NEXT:    ret
+  %ptrs = getelementptr bfloat, bfloat* %base, <vscale x 4 x i32> %offsets
+  %vals = call <vscale x 4 x bfloat> @llvm.masked.gather.nxv4bf16(<vscale x 4 x bfloat*> %ptrs, i32 2, <vscale x 4 x i1> %mask, <vscale x 4 x bfloat> undef)
+  ret <vscale x 4 x bfloat> %vals
+}
+
 define <vscale x 4 x float> @masked_gather_nxv4f32(float* %base, <vscale x 4 x i32> %offsets, <vscale x 4 x i1> %mask) {
 ; CHECK-LABEL: masked_gather_nxv4f32:
 ; CHECK:       // %bb.0:
@@ -150,10 +170,13 @@ declare <vscale x 2 x i16> @llvm.masked.gather.nxv2i16(<vscale x 2 x i16*>, i32,
 declare <vscale x 2 x i32> @llvm.masked.gather.nxv2i32(<vscale x 2 x i32*>, i32, <vscale x 2 x i1>, <vscale x 2 x i32>)
 declare <vscale x 2 x i64> @llvm.masked.gather.nxv2i64(<vscale x 2 x i64*>, i32, <vscale x 2 x i1>, <vscale x 2 x i64>)
 declare <vscale x 2 x half> @llvm.masked.gather.nxv2f16(<vscale x 2 x half*>, i32, <vscale x 2 x i1>, <vscale x 2 x half>)
+declare <vscale x 2 x bfloat> @llvm.masked.gather.nxv2bf16(<vscale x 2 x bfloat*>, i32, <vscale x 2 x i1>, <vscale x 2 x bfloat>)
 declare <vscale x 2 x float> @llvm.masked.gather.nxv2f32(<vscale x 2 x float*>, i32, <vscale x 2 x i1>, <vscale x 2 x float>)
 declare <vscale x 2 x double> @llvm.masked.gather.nxv2f64(<vscale x 2 x double*>, i32, <vscale x 2 x i1>, <vscale x 2 x double>)
 
 declare <vscale x 4 x i16> @llvm.masked.gather.nxv4i16(<vscale x 4 x i16*>, i32, <vscale x 4 x i1>, <vscale x 4 x i16>)
 declare <vscale x 4 x i32> @llvm.masked.gather.nxv4i32(<vscale x 4 x i32*>, i32, <vscale x 4 x i1>, <vscale x 4 x i32>)
 declare <vscale x 4 x half> @llvm.masked.gather.nxv4f16(<vscale x 4 x half*>, i32, <vscale x 4 x i1>, <vscale x 4 x half>)
+declare <vscale x 4 x bfloat> @llvm.masked.gather.nxv4bf16(<vscale x 4 x bfloat*>, i32, <vscale x 4 x i1>, <vscale x 4 x bfloat>)
 declare <vscale x 4 x float> @llvm.masked.gather.nxv4f32(<vscale x 4 x float*>, i32, <vscale x 4 x i1>, <vscale x 4 x float>)
+attributes #0 = { "target-features"="+sve,+bf16" }

diff  --git 
a/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll
index 2d4ce50e8464..b9bf9049d46f

[llvm-branch-commits] [llvm] 7c504b6 - [AArch64] Renamed sve-masked-scatter-legalise.ll. NFC.

2020-12-17 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-17T11:40:09Z
New Revision: 7c504b6dd0638c4bad40440060fdebc726dc0c07

URL: 
https://github.com/llvm/llvm-project/commit/7c504b6dd0638c4bad40440060fdebc726dc0c07
DIFF: 
https://github.com/llvm/llvm-project/commit/7c504b6dd0638c4bad40440060fdebc726dc0c07.diff

LOG: [AArch64] Renamed sve-masked-scatter-legalise.ll. NFC.

Added: 
llvm/test/CodeGen/AArch64/sve-masked-scatter-legalize.ll

Modified: 


Removed: 
llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll



diff  --git a/llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-scatter-legalize.ll
similarity index 100%
rename from llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll
rename to llvm/test/CodeGen/AArch64/sve-masked-scatter-legalize.ll



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 52e4084 - [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter

2020-12-18 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-18T11:56:36Z
New Revision: 52e4084d9c3b15dbb73906f28f7f5aa45b835b64

URL: 
https://github.com/llvm/llvm-project/commit/52e4084d9c3b15dbb73906f28f7f5aa45b835b64
DIFF: 
https://github.com/llvm/llvm-project/commit/52e4084d9c3b15dbb73906f28f7f5aa45b835b64.diff

LOG: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter

This patch extends LowerMGATHER/MSCATTER to make use of the vector + 
reg/immediate
addressing modes for scalable masked gathers & scatters.

selectGatherScatterAddrMode checks if the base pointer is null, in which case
we can swap the base pointer and the index, e.g.
 getelementptr nullptr, <vscale x N x T> (splat(%offset)) + %indices)
  -> getelementptr %offset, <vscale x N x T> %indices
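
Modelled on scalars, the swap is plain reassociation of the address
arithmetic; a minimal sketch (plain C++, for illustration only, with invented
values):

  #include <cassert>
  #include <cstdint>

  int main() {
    uint64_t Offset = 0x1000;              // the splatted %offset
    uint64_t Indices[4] = {0, 8, 16, 24};  // %indices
    for (uint64_t Idx : Indices) {
      uint64_t Before = 0 + (Offset + Idx);  // gep nullptr, splat(%offset) + %indices
      uint64_t After = Offset + Idx;         // gep %offset, %indices
      assert(Before == After);               // same address either way
    }
    return 0;
  }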

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D93132

Added: 
llvm/test/CodeGen/AArch64/sve-masked-gather-vec-plus-imm.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-vec-plus-reg.ll
llvm/test/CodeGen/AArch64/sve-masked-gather.ll
llvm/test/CodeGen/AArch64/sve-masked-scatter-vec-plus-imm.ll
llvm/test/CodeGen/AArch64/sve-masked-scatter-vec-plus-reg.ll
llvm/test/CodeGen/AArch64/sve-masked-scatter.ll

Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 9eeacc8df0bf..43db745d6328 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -3812,6 +3812,8 @@ unsigned getSignExtendedGatherOpcode(unsigned Opcode) {
 return Opcode;
   case AArch64ISD::GLD1_MERGE_ZERO:
 return AArch64ISD::GLD1S_MERGE_ZERO;
+  case AArch64ISD::GLD1_IMM_MERGE_ZERO:
+return AArch64ISD::GLD1S_IMM_MERGE_ZERO;
   case AArch64ISD::GLD1_UXTW_MERGE_ZERO:
 return AArch64ISD::GLD1S_UXTW_MERGE_ZERO;
   case AArch64ISD::GLD1_SXTW_MERGE_ZERO:
@@ -3843,6 +3845,60 @@ bool getGatherScatterIndexIsExtended(SDValue Index) {
   return false;
 }
 
+// If the base pointer of a masked gather or scatter is null, we
+// may be able to swap BasePtr & Index and use the vector + register
+// or vector + immediate addressing mode, e.g.
+// VECTOR + REGISTER:
+//    getelementptr nullptr, <vscale x N x T> (splat(%offset)) + %indices)
+// -> getelementptr %offset, <vscale x N x T> %indices
+// VECTOR + IMMEDIATE:
+//    getelementptr nullptr, <vscale x N x T> (splat(#x)) + %indices)
+// -> getelementptr #x, <vscale x N x T> %indices
+void selectGatherScatterAddrMode(SDValue &BasePtr, SDValue &Index, EVT MemVT,
+ unsigned &Opcode, bool IsGather,
+ SelectionDAG &DAG) {
+  if (!isNullConstant(BasePtr))
+return;
+
+  ConstantSDNode *Offset = nullptr;
+  if (Index.getOpcode() == ISD::ADD)
+    if (auto SplatVal = DAG.getSplatValue(Index.getOperand(1))) {
+      if (isa<ConstantSDNode>(SplatVal))
+        Offset = cast<ConstantSDNode>(SplatVal);
+      else {
+        BasePtr = SplatVal;
+        Index = Index->getOperand(0);
+        return;
+      }
+    }
+
+  unsigned NewOp =
+  IsGather ? AArch64ISD::GLD1_IMM_MERGE_ZERO : AArch64ISD::SST1_IMM_PRED;
+
+  if (!Offset) {
+std::swap(BasePtr, Index);
+Opcode = NewOp;
+return;
+  }
+
+  uint64_t OffsetVal = Offset->getZExtValue();
+  unsigned ScalarSizeInBytes = MemVT.getScalarSizeInBits() / 8;
+  auto ConstOffset = DAG.getConstant(OffsetVal, SDLoc(Index), MVT::i64);
+
+  if (OffsetVal % ScalarSizeInBytes || OffsetVal / ScalarSizeInBytes > 31) {
+// Index is out of range for the immediate addressing mode
+BasePtr = ConstOffset;
+Index = Index->getOperand(0);
+return;
+  }
+
+  // Immediate is in range
+  Opcode = NewOp;
+  BasePtr = Index->getOperand(0);
+  Index = ConstOffset;
+  return;
+}
+
 SDValue AArch64TargetLowering::LowerMGATHER(SDValue Op,
 SelectionDAG &DAG) const {
   SDLoc DL(Op);
@@ -3892,6 +3948,9 @@ SDValue AArch64TargetLowering::LowerMGATHER(SDValue Op,
 Index = Index.getOperand(0);
 
   unsigned Opcode = getGatherVecOpcode(IsScaled, IsSigned, IdxNeedsExtend);
+  selectGatherScatterAddrMode(BasePtr, Index, MemVT, Opcode,
+  /*isGather=*/true, DAG);
+
   if (ResNeedsSignExtend)
 Opcode = getSignExtendedGatherOpcode(Opcode);
 
@@ -3944,9 +4003,12 @@ SDValue AArch64TargetLowering::LowerMSCATTER(SDValue Op,
   if (getGatherScatterIndexIsExtended(Index))
 Index = Index.getOperand(0);
 
+  unsigned Opcode = getScatterVecOpcode(IsScaled, IsSigned, NeedsExtend);
+  selectGatherScatterAddrMode(BasePtr, Index, MemVT, Opcode,
+  /*isGather=*/false, DAG);
+
   SDValue Ops[] = {Chain, StoreVal, Mask, BasePtr, Index, InputVT};
-  return DAG.getNode(getScatterVecOpcode(IsScaled, IsSigned, NeedsExtend), DL,
- VTs, Ops);
+  return DAG.getNode(Opcode, D

[llvm-branch-commits] [llvm] d3a0f9b - [APInt] Add the truncOrSelf resizing operator to APInt

2020-11-23 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-11-23T11:27:30Z
New Revision: d3a0f9b9ec88ce0737470652330262f8ed46daa7

URL: 
https://github.com/llvm/llvm-project/commit/d3a0f9b9ec88ce0737470652330262f8ed46daa7
DIFF: 
https://github.com/llvm/llvm-project/commit/d3a0f9b9ec88ce0737470652330262f8ed46daa7.diff

LOG: [APInt] Add the truncOrSelf resizing operator to APInt

Truncates the APInt if its bit width is greater than the specified width;
otherwise does nothing.
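
A usage sketch (values chosen only for illustration):

  #include "llvm/ADT/APInt.h"
  using llvm::APInt;

  void truncOrSelfExample() {
    APInt V(32, 0x1234ABCD);
    APInt T = V.truncOrSelf(16); // 32 > 16: truncated, T == 0xABCD
    APInt U = V.truncOrSelf(32); // equal width: value left alone
    APInt W = V.truncOrSelf(64); // 32 < 64: also left alone, still 32 bits wide
  }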

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D91445

Added: 


Modified: 
llvm/include/llvm/ADT/APInt.h
llvm/lib/Support/APInt.cpp
llvm/unittests/ADT/APIntTest.cpp

Removed: 




diff  --git a/llvm/include/llvm/ADT/APInt.h b/llvm/include/llvm/ADT/APInt.h
index f5860f6c7517..b97ea2cd9aee 100644
--- a/llvm/include/llvm/ADT/APInt.h
+++ b/llvm/include/llvm/ADT/APInt.h
@@ -1403,6 +1403,12 @@ class LLVM_NODISCARD APInt {
   /// extended, truncated, or left alone to make it that width.
   APInt zextOrTrunc(unsigned width) const;
 
+  /// Truncate to width
+  ///
+  /// Make this APInt have the bit width given by \p width. The value is
+  /// truncated or left alone to make it that width.
+  APInt truncOrSelf(unsigned width) const;
+
   /// Sign extend or truncate to width
   ///
   /// Make this APInt have the bit width given by \p width. The value is sign

diff  --git a/llvm/lib/Support/APInt.cpp b/llvm/lib/Support/APInt.cpp
index fc339de45af4..12ceb2df112e 100644
--- a/llvm/lib/Support/APInt.cpp
+++ b/llvm/lib/Support/APInt.cpp
@@ -961,6 +961,12 @@ APInt APInt::sextOrTrunc(unsigned width) const {
   return *this;
 }
 
+APInt APInt::truncOrSelf(unsigned width) const {
+  if (BitWidth > width)
+return trunc(width);
+  return *this;
+}
+
 APInt APInt::zextOrSelf(unsigned width) const {
   if (BitWidth < width)
 return zext(width);

diff  --git a/llvm/unittests/ADT/APIntTest.cpp 
b/llvm/unittests/ADT/APIntTest.cpp
index 673a2110af09..ef5423e332e1 100644
--- a/llvm/unittests/ADT/APIntTest.cpp
+++ b/llvm/unittests/ADT/APIntTest.cpp
@@ -2598,6 +2598,13 @@ TEST(APIntTest, sext) {
   EXPECT_EQ(63U, i32_neg1.countPopulation());
 }
 
+TEST(APIntTest, truncOrSelf) {
+  APInt val(32, 0xFFFFFFFF);
+  EXPECT_EQ(0xFFFF, val.truncOrSelf(16));
+  EXPECT_EQ(0xFFFFFFFF, val.truncOrSelf(32));
+  EXPECT_EQ(0xFFFFFFFF, val.truncOrSelf(64));
+}
+
 TEST(APIntTest, multiply) {
   APInt i64(64, 1234);
 



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 603d40d - [SVE][CodeGen] Add a DAG combine to extend mscatter indices

2020-11-25 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-11-25T11:18:22Z
New Revision: 603d40da9d532ab4706e32c07aba339e180ed865

URL: 
https://github.com/llvm/llvm-project/commit/603d40da9d532ab4706e32c07aba339e180ed865
DIFF: 
https://github.com/llvm/llvm-project/commit/603d40da9d532ab4706e32c07aba339e180ed865.diff

LOG: [SVE][CodeGen] Add a DAG combine to extend mscatter indices

This patch adds a target-specific DAG combine for mscatter to promote indices
with element types i8 or i16 before legalisation, plus various tests with 
illegal types.
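
The promotion itself is just a sign- or zero-extension of each index element,
chosen by the scatter's signedness. A scalar model of that step (plain C++,
for illustration only):

  #include <cassert>
  #include <cstdint>

  int main() {
    int8_t SIdx = -4;     // signed i8 index
    uint8_t UIdx = 0xFC;  // the same bits as an unsigned index
    int32_t SExt = SIdx;  // ISD::SIGN_EXTEND -> -4
    uint32_t ZExt = UIdx; // ISD::ZERO_EXTEND -> 252
    assert(SExt == -4 && ZExt == 252); // same bits, different element offset
    return 0;
  }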

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D90945

Added: 
llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll

Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index b92eb1d0e4f6..e4c20cc4e6e3 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -835,6 +835,8 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
   if (Subtarget->supportsAddressTopByteIgnored())
 setTargetDAGCombine(ISD::LOAD);
 
+  setTargetDAGCombine(ISD::MSCATTER);
+
   setTargetDAGCombine(ISD::MUL);
 
   setTargetDAGCombine(ISD::SELECT);
@@ -13944,6 +13946,44 @@ static SDValue performSTORECombine(SDNode *N,
   return SDValue();
 }
 
+static SDValue performMSCATTERCombine(SDNode *N,
+  TargetLowering::DAGCombinerInfo &DCI,
+  SelectionDAG &DAG) {
+  MaskedScatterSDNode *MSC = cast<MaskedScatterSDNode>(N);
+  assert(MSC && "Can only combine scatter store nodes");
+
+  SDLoc DL(MSC);
+  SDValue Chain = MSC->getChain();
+  SDValue Scale = MSC->getScale();
+  SDValue Index = MSC->getIndex();
+  SDValue Data = MSC->getValue();
+  SDValue Mask = MSC->getMask();
+  SDValue BasePtr = MSC->getBasePtr();
+  ISD::MemIndexType IndexType = MSC->getIndexType();
+
+  EVT IdxVT = Index.getValueType();
+
+  if (DCI.isBeforeLegalize()) {
+// SVE gather/scatter requires indices of i32/i64. Promote anything smaller
+// prior to legalisation so the result can be split if required.
+if ((IdxVT.getVectorElementType() == MVT::i8) ||
+(IdxVT.getVectorElementType() == MVT::i16)) {
+  EVT NewIdxVT = IdxVT.changeVectorElementType(MVT::i32);
+  if (MSC->isIndexSigned())
+Index = DAG.getNode(ISD::SIGN_EXTEND, DL, NewIdxVT, Index);
+  else
+Index = DAG.getNode(ISD::ZERO_EXTEND, DL, NewIdxVT, Index);
+
+  SDValue Ops[] = { Chain, Data, Mask, BasePtr, Index, Scale };
+  return DAG.getMaskedScatter(DAG.getVTList(MVT::Other),
+  MSC->getMemoryVT(), DL, Ops,
+  MSC->getMemOperand(), IndexType,
+  MSC->isTruncatingStore());
+}
+  }
+
+  return SDValue();
+}
 
 /// Target-specific DAG combine function for NEON load/store intrinsics
 /// to merge base address updates.
@@ -15136,6 +15176,8 @@ SDValue AArch64TargetLowering::PerformDAGCombine(SDNode 
*N,
 break;
   case ISD::STORE:
 return performSTORECombine(N, DCI, DAG, Subtarget);
+  case ISD::MSCATTER:
+return performMSCATTERCombine(N, DCI, DAG);
   case AArch64ISD::BRCOND:
 return performBRCONDCombine(N, DCI, DAG);
   case AArch64ISD::TBNZ:

diff  --git a/llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll
new file mode 100644
index ..c3746a61d875
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll
@@ -0,0 +1,59 @@
+; RUN: llc -mtriple=aarch64--linux-gnu -mattr=+sve < %s | FileCheck %s
+
+; Tests that exercise various type legalisation scenarios for ISD::MSCATTER.
+
+; Code generate the scenario where the offset vector type is illegal.
+define void @masked_scatter_nxv16i8(<vscale x 16 x i8> %data, i8* %base, <vscale x 16 x i8> %offsets, <vscale x 16 x i1> %mask) {
+; CHECK-LABEL: masked_scatter_nxv16i8:
+; CHECK-DAG: st1b { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw]
+; CHECK-DAG: st1b { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw]
+; CHECK-DAG: st1b { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw]
+; CHECK-DAG: st1b { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw]
+; CHECK: ret
+  %ptrs = getelementptr i8, i8* %base, <vscale x 16 x i8> %offsets
+  call void @llvm.masked.scatter.nxv16i8(<vscale x 16 x i8> %data, <vscale x 16 x i8*> %ptrs, i32 1, <vscale x 16 x i1> %mask)
+  ret void
+}
+
+define void @masked_scatter_nxv8i16(<vscale x 8 x i16> %data, i16* %base, <vscale x 8 x i16> %offsets, <vscale x 8 x i1> %mask) {
+; CHECK-LABEL: masked_scatter_nxv8i16
+; CHECK-DAG: st1h { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw #1]
+; CHECK-DAG: st1h { {{z[0-9]+}}.s }, {{p[0-9]+}}, [x0, {{z[0-9]+}}.s, sxtw #1]
+; CHECK: ret
+  %ptrs = getelementptr i16, i16* %base, <vscale x 8 x i16> %offsets
+  call void @llvm.masked.scatter.nxv8i16(<vscale x 8 x i16> %data, <vscale x 8 x i16*> %ptrs, i32 1, <vscale x 8 x i1> %mask)
+  ret void
+}
+
+def

[llvm-branch-commits] [llvm] 4bee319 - [SVE][CodeGen] Extend isConstantSplatValue to support ISD::SPLAT_VECTOR

2020-11-26 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-11-26T11:19:40Z
New Revision: 4bee3197f665a8c2336a6cdd4bf5c4575b9e5fe7

URL: 
https://github.com/llvm/llvm-project/commit/4bee3197f665a8c2336a6cdd4bf5c4575b9e5fe7
DIFF: 
https://github.com/llvm/llvm-project/commit/4bee3197f665a8c2336a6cdd4bf5c4575b9e5fe7.diff

LOG: [SVE][CodeGen] Extend isConstantSplatValue to support ISD::SPLAT_VECTOR

Updated the affected scalable_of_scalable tests in sve-gep.ll, as
isConstantSplatValue now returns true in DAGCombiner::visitMUL and folds
`(mul x, 1) -> x`.
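
The scalar operand of a SPLAT_VECTOR may be wider than the vector's element
type, which is why the new code truncates it with truncOrSelf before reporting
the splat value. A sketch of the resulting (mul x, splat(1)) test (hypothetical
helper, for illustration only):

  #include "llvm/ADT/APInt.h"
  using llvm::APInt;

  // Hypothetical helper: does this splat constant allow (mul x, splat) -> x?
  bool isSplatOfOne(const APInt &ScalarOp, unsigned EltBits) {
    return ScalarOp.truncOrSelf(EltBits) == 1; // compare at element width
  }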

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91363

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/test/CodeGen/AArch64/sve-gep.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index eee80cc4bc70..20e4ac590136 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -139,6 +139,15 @@ bool ConstantFPSDNode::isValueValidForType(EVT VT,
 
 //===----------------------------------------------------------------------===//
 
 bool ISD::isConstantSplatVector(const SDNode *N, APInt &SplatVal) {
+  if (N->getOpcode() == ISD::SPLAT_VECTOR) {
+    unsigned EltSize =
+        N->getValueType(0).getVectorElementType().getSizeInBits();
+    if (auto *Op0 = dyn_cast<ConstantSDNode>(N->getOperand(0))) {
+      SplatVal = Op0->getAPIntValue().truncOrSelf(EltSize);
+      return true;
+    }
+  }
+
   auto *BV = dyn_cast<BuildVectorSDNode>(N);
   if (!BV)
     return false;

diff  --git a/llvm/test/CodeGen/AArch64/sve-gep.ll 
b/llvm/test/CodeGen/AArch64/sve-gep.ll
index 8f68a38e2cd2..ffde9289a55d 100644
--- a/llvm/test/CodeGen/AArch64/sve-gep.ll
+++ b/llvm/test/CodeGen/AArch64/sve-gep.ll
@@ -105,11 +105,9 @@ define *> 
@scalable_of_scalable_1( insertelement ( 
undef, i64 1, i32 0),  zeroinitializer,  
zeroinitializer
   %d = getelementptr , * %base,  %idx
@@ -120,10 +118,8 @@ define *> 
@scalable_of_scalable_2( insertelement ( 
undef, i64 1, i32 0),  zeroinitializer,  
zeroinitializer
   %d = getelementptr , *> 
%base,  %idx



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] f6dd32f - [SVE][CodeGen] Lower scalable masked gathers

2020-12-07 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-07T12:20:41Z
New Revision: f6dd32fd3584380730a09b042cfbac852f36eb00

URL: 
https://github.com/llvm/llvm-project/commit/f6dd32fd3584380730a09b042cfbac852f36eb00
DIFF: 
https://github.com/llvm/llvm-project/commit/f6dd32fd3584380730a09b042cfbac852f36eb00.diff

LOG: [SVE][CodeGen] Lower scalable masked gathers

Lowers the llvm.masked.gather intrinsics (scalar plus vector addressing mode 
only)

Changes in this patch:
- Add custom lowering for MGATHER, using getGatherVecOpcode() to choose the 
appropriate
  gather load opcode to use.
- Improve codegen with refineIndexType/refineUniformBase, added in D90942
- Tests added for gather loads with 32 & 64-bit scaled & unscaled offsets.
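
For reference, a scalar-plus-vector gather loads, per lane, from base + offset
when the corresponding mask bit is set, and takes the passthru value otherwise.
A scalar model of one such gather (plain C++, for illustration only):

  #include <cassert>
  #include <cstdint>

  int main() {
    int16_t Data[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    int64_t Offsets[4] = {0, 2, 4, 6};         // the offset vector
    bool Mask[4] = {true, false, true, true};  // the predicate
    int16_t PassThru = -1;
    int16_t Result[4];
    for (int I = 0; I < 4; ++I)  // one lane of the gather load
      Result[I] = Mask[I] ? Data[Offsets[I]] : PassThru;
    assert(Result[0] == 10 && Result[1] == -1 && Result[2] == 14);
    return 0;
  }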

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91092

Added: 
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll

Modified: 
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.h

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp 
b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 552545b854d8..9a0925061105 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -1746,6 +1746,7 @@ void 
DAGTypeLegalizer::SplitVecRes_MGATHER(MaskedGatherSDNode *MGT,
   SDValue PassThru = MGT->getPassThru();
   SDValue Index = MGT->getIndex();
   SDValue Scale = MGT->getScale();
+  EVT MemoryVT = MGT->getMemoryVT();
   Align Alignment = MGT->getOriginalAlign();
 
   // Split Mask operand
@@ -1759,6 +1760,10 @@ void 
DAGTypeLegalizer::SplitVecRes_MGATHER(MaskedGatherSDNode *MGT,
   std::tie(MaskLo, MaskHi) = DAG.SplitVector(Mask, dl);
   }
 
+  EVT LoMemVT, HiMemVT;
+  // Split MemoryVT
+  std::tie(LoMemVT, HiMemVT) = DAG.GetSplitDestVTs(MemoryVT);
+
   SDValue PassThruLo, PassThruHi;
   if (getTypeAction(PassThru.getValueType()) == 
TargetLowering::TypeSplitVector)
 GetSplitVector(PassThru, PassThruLo, PassThruHi);
@@ -1777,11 +1782,11 @@ void 
DAGTypeLegalizer::SplitVecRes_MGATHER(MaskedGatherSDNode *MGT,
   MGT->getRanges());
 
   SDValue OpsLo[] = {Ch, PassThruLo, MaskLo, Ptr, IndexLo, Scale};
-  Lo = DAG.getMaskedGather(DAG.getVTList(LoVT, MVT::Other), LoVT, dl, OpsLo,
+  Lo = DAG.getMaskedGather(DAG.getVTList(LoVT, MVT::Other), LoMemVT, dl, OpsLo,
MMO, MGT->getIndexType());
 
   SDValue OpsHi[] = {Ch, PassThruHi, MaskHi, Ptr, IndexHi, Scale};
-  Hi = DAG.getMaskedGather(DAG.getVTList(HiVT, MVT::Other), HiVT, dl, OpsHi,
+  Hi = DAG.getMaskedGather(DAG.getVTList(HiVT, MVT::Other), HiMemVT, dl, OpsHi,
MMO, MGT->getIndexType());
 
   // Build a factor node to remember that this load is independent of the
@@ -2421,11 +2426,11 @@ SDValue 
DAGTypeLegalizer::SplitVecOp_MGATHER(MaskedGatherSDNode *MGT,
   MGT->getRanges());
 
   SDValue OpsLo[] = {Ch, PassThruLo, MaskLo, Ptr, IndexLo, Scale};
-  SDValue Lo = DAG.getMaskedGather(DAG.getVTList(LoVT, MVT::Other), LoVT, dl,
+  SDValue Lo = DAG.getMaskedGather(DAG.getVTList(LoVT, MVT::Other), LoMemVT, 
dl,
OpsLo, MMO, MGT->getIndexType());
 
   SDValue OpsHi[] = {Ch, PassThruHi, MaskHi, Ptr, IndexHi, Scale};
-  SDValue Hi = DAG.getMaskedGather(DAG.getVTList(HiVT, MVT::Other), HiVT, dl,
+  SDValue Hi = DAG.getMaskedGather(DAG.getVTList(HiVT, MVT::Other), HiMemVT, 
dl,
OpsHi, MMO, MGT->getIndexType());
 
   // Build a factor node to remember that this load is independent of the

diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index f6e131838a16..dd837d4d495f 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -7310,17 +7310,22 @@ SDValue SelectionDAG::getMaskedGather(SDVTList VTs, EVT 
VT, const SDLoc &dl,
 return SDValue(E, 0);
   }
 
+  IndexType = TLI->getCanonicalIndexType(IndexType, VT, Ops[4]);
   auto *N = newSDNode<MaskedGatherSDNode>(dl.getIROrder(), dl.getDebugLoc(),
                                           VTs, VT, MMO, IndexType);
   createOperands(N, Ops);
 
   assert(N->getPassThru().getValueType() == N->getValueType(0) &&
  "Incompatible type of the PassTh

[llvm-branch-commits] [llvm] 111f559 - [SVE][CodeGen] Call refineIndexType & refineUniformBase from visitMGATHER

2020-12-07 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-07T13:20:19Z
New Revision: 111f559bbd12c59b0ac450ea2feb8f6981705647

URL: 
https://github.com/llvm/llvm-project/commit/111f559bbd12c59b0ac450ea2feb8f6981705647
DIFF: 
https://github.com/llvm/llvm-project/commit/111f559bbd12c59b0ac450ea2feb8f6981705647.diff

LOG: [SVE][CodeGen] Call refineIndexType & refineUniformBase from visitMGATHER

The refineIndexType & refineUniformBase functions added by D90942 can also be 
used to
improve CodeGen of masked gathers.

These changes were split out from D91092

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92319

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
llvm/test/CodeGen/X86/masked_gather_scatter.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5481c52a5b12..96baaabdb813 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9410,13 +9410,13 @@ bool refineUniformBase(SDValue &BasePtr, SDValue 
&Index, SelectionDAG &DAG) {
 }
 
 // Fold sext/zext of index into index type.
-bool refineIndexType(MaskedScatterSDNode *MSC, SDValue &Index, bool Scaled,
- SelectionDAG &DAG) {
+bool refineIndexType(MaskedGatherScatterSDNode *MGS, SDValue &Index,
+ bool Scaled, SelectionDAG &DAG) {
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
 
   if (Index.getOpcode() == ISD::ZERO_EXTEND) {
 SDValue Op = Index.getOperand(0);
-MSC->setIndexType(Scaled ? ISD::UNSIGNED_SCALED : ISD::UNSIGNED_UNSCALED);
+MGS->setIndexType(Scaled ? ISD::UNSIGNED_SCALED : ISD::UNSIGNED_UNSCALED);
 if (TLI.shouldRemoveExtendFromGSIndex(Op.getValueType())) {
   Index = Op;
   return true;
@@ -9425,7 +9425,7 @@ bool refineIndexType(MaskedScatterSDNode *MSC, SDValue 
&Index, bool Scaled,
 
   if (Index.getOpcode() == ISD::SIGN_EXTEND) {
 SDValue Op = Index.getOperand(0);
-MSC->setIndexType(Scaled ? ISD::SIGNED_SCALED : ISD::SIGNED_UNSCALED);
+MGS->setIndexType(Scaled ? ISD::SIGNED_SCALED : ISD::SIGNED_UNSCALED);
 if (TLI.shouldRemoveExtendFromGSIndex(Op.getValueType())) {
   Index = Op;
   return true;
@@ -9494,11 +9494,30 @@ SDValue DAGCombiner::visitMSTORE(SDNode *N) {
 SDValue DAGCombiner::visitMGATHER(SDNode *N) {
   MaskedGatherSDNode *MGT = cast<MaskedGatherSDNode>(N);
   SDValue Mask = MGT->getMask();
+  SDValue Chain = MGT->getChain();
+  SDValue Index = MGT->getIndex();
+  SDValue Scale = MGT->getScale();
+  SDValue PassThru = MGT->getPassThru();
+  SDValue BasePtr = MGT->getBasePtr();
   SDLoc DL(N);
 
   // Zap gathers with a zero mask.
   if (ISD::isBuildVectorAllZeros(Mask.getNode()))
-return CombineTo(N, MGT->getPassThru(), MGT->getChain());
+return CombineTo(N, PassThru, MGT->getChain());
+
+  if (refineUniformBase(BasePtr, Index, DAG)) {
+SDValue Ops[] = {Chain, PassThru, Mask, BasePtr, Index, Scale};
+return DAG.getMaskedGather(DAG.getVTList(N->getValueType(0), MVT::Other),
+   PassThru.getValueType(), DL, Ops,
+   MGT->getMemOperand(), MGT->getIndexType());
+  }
+
+  if (refineIndexType(MGT, Index, MGT->isIndexScaled(), DAG)) {
+SDValue Ops[] = {Chain, PassThru, Mask, BasePtr, Index, Scale};
+return DAG.getMaskedGather(DAG.getVTList(N->getValueType(0), MVT::Other),
+   PassThru.getValueType(), DL, Ops,
+   MGT->getMemOperand(), MGT->getIndexType());
+  }
 
   return SDValue();
 }

diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index d729252c92d9..517f5e965157 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -3894,6 +3894,9 @@ SDValue AArch64TargetLowering::LowerMGATHER(SDValue Op,
 
   SDVTList VTs = DAG.getVTList(PassThru.getSimpleValueType(), MVT::Other);
 
+  if (getGatherScatterIndexIsExtended(Index))
+Index = Index.getOperand(0);
+
   SDValue Ops[] = {Chain, Mask, BasePtr, Index, InputVT, PassThru};
   return DAG.getNode(getGatherVecOpcode(IsScaled, IsSigned, IdxNeedsExtend), 
DL,
  VTs, Ops);

diff  --git a/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll 
b/llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
index 747468ae3cf4..32dca0d26cdc 100644
--- a/llvm/test/CodeGen/AArch64

[llvm-branch-commits] [llvm] 4519ff4 - [SVE][CodeGen] Add the ExtensionType flag to MGATHER

2020-12-09 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-09T11:19:08Z
New Revision: 4519ff4b6f02defcb69ea49bc11607cee09cde7b

URL: 
https://github.com/llvm/llvm-project/commit/4519ff4b6f02defcb69ea49bc11607cee09cde7b
DIFF: 
https://github.com/llvm/llvm-project/commit/4519ff4b6f02defcb69ea49bc11607cee09cde7b.diff

LOG: [SVE][CodeGen] Add the ExtensionType flag to MGATHER

Adds the ExtensionType flag, which reflects the LoadExtType of a 
MaskedGatherSDNode.
Also updated SelectionDAGDumper::print_details so that details of the gather
load (is signed, is scaled & extension type) are printed.
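
The flag takes the usual ISD::LoadExtType values; a sketch of what each one
means for a gathered element (plain C++, for illustration only):

  #include "llvm/CodeGen/ISDOpcodes.h"

  const char *describeExt(llvm::ISD::LoadExtType ETy) {
    switch (ETy) {
    case llvm::ISD::NON_EXTLOAD: return "loaded at the result element width";
    case llvm::ISD::EXTLOAD:     return "any-extended to the result width";
    case llvm::ISD::SEXTLOAD:    return "sign-extended to the result width";
    case llvm::ISD::ZEXTLOAD:    return "zero-extended to the result width";
    default:                     return "unknown extension type";
    }
  }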

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91084

Added: 


Modified: 
llvm/include/llvm/CodeGen/SelectionDAG.h
llvm/include/llvm/CodeGen/SelectionDAGNodes.h
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
llvm/lib/Target/X86/X86ISelLowering.cpp

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/SelectionDAG.h 
b/llvm/include/llvm/CodeGen/SelectionDAG.h
index d454c4ea8d9b..d73155aa2f2f 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1362,7 +1362,7 @@ class SelectionDAG {
 ISD::MemIndexedMode AM);
   SDValue getMaskedGather(SDVTList VTs, EVT VT, const SDLoc &dl,
   ArrayRef Ops, MachineMemOperand *MMO,
-  ISD::MemIndexType IndexType);
+  ISD::MemIndexType IndexType, ISD::LoadExtType ExtTy);
   SDValue getMaskedScatter(SDVTList VTs, EVT VT, const SDLoc &dl,
ArrayRef Ops, MachineMemOperand *MMO,
ISD::MemIndexType IndexType,

diff  --git a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h 
b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
index 1e71d110730e..aa81a31bf23a 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
@@ -512,6 +512,7 @@ BEGIN_TWO_BYTE_PACK()
   class LoadSDNodeBitfields {
 friend class LoadSDNode;
 friend class MaskedLoadSDNode;
+friend class MaskedGatherSDNode;
 
 uint16_t : NumLSBaseSDNodeBits;
 
@@ -2451,12 +2452,18 @@ class MaskedGatherSDNode : public 
MaskedGatherScatterSDNode {
 
   MaskedGatherSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
  EVT MemVT, MachineMemOperand *MMO,
- ISD::MemIndexType IndexType)
+ ISD::MemIndexType IndexType, ISD::LoadExtType ETy)
   : MaskedGatherScatterSDNode(ISD::MGATHER, Order, dl, VTs, MemVT, MMO,
-  IndexType) {}
+  IndexType) {
+LoadSDNodeBits.ExtTy = ETy;
+  }
 
   const SDValue &getPassThru() const { return getOperand(1); }
 
+  ISD::LoadExtType getExtensionType() const {
+return ISD::LoadExtType(LoadSDNodeBits.ExtTy);
+  }
+
   static bool classof(const SDNode *N) {
 return N->getOpcode() == ISD::MGATHER;
   }

diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 8f0c9542b3e7..ce4ee89103ce 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9499,14 +9499,16 @@ SDValue DAGCombiner::visitMGATHER(SDNode *N) {
 SDValue Ops[] = {Chain, PassThru, Mask, BasePtr, Index, Scale};
 return DAG.getMaskedGather(DAG.getVTList(N->getValueType(0), MVT::Other),
PassThru.getValueType(), DL, Ops,
-   MGT->getMemOperand(), MGT->getIndexType());
+   MGT->getMemOperand(), MGT->getIndexType(),
+   MGT->getExtensionType());
   }
 
   if (refineIndexType(MGT, Index, MGT->isIndexScaled(), DAG)) {
 SDValue Ops[] = {Chain, PassThru, Mask, BasePtr, Index, Scale};
 return DAG.getMaskedGather(DAG.getVTList(N->getValueType(0), MVT::Other),
PassThru.getValueType(), DL, Ops,
-   MGT->getMemOperand(), MGT->getIndexType());
+   MGT->getMemOperand(), MGT->getIndexType(),
+   MGT->getExtensionType());
   }
 
   return SDValue();

diff  --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp 
b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index 8468f51a922c..5c8a562ed9d7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -679,12 +679,17 @@ SDValue 
DAGTypeLegalizer::PromoteIntRes_MGATHER(MaskedGatherSDNode *N) {
   

[llvm-branch-commits] [llvm] 05edfc5 - [SVE][CodeGen] Add DAG combines for s/zext_masked_gather

2020-12-09 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-09T11:53:19Z
New Revision: 05edfc54750bd539f5caa30b0cd4344f68677b00

URL: 
https://github.com/llvm/llvm-project/commit/05edfc54750bd539f5caa30b0cd4344f68677b00
DIFF: 
https://github.com/llvm/llvm-project/commit/05edfc54750bd539f5caa30b0cd4344f68677b00.diff

LOG: [SVE][CodeGen] Add DAG combines for s/zext_masked_gather

This patch adds the following DAGCombines, which apply if 
isVectorLoadExtDesirable() returns true:
 - fold (and (masked_gather x)) -> (zext_masked_gather x)
 - fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x)

LowerMGATHER has also been updated to fetch the LoadExtType associated with the
gather and also use this value to determine the correct masked gather opcode to 
use.
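
At the scalar level the first fold is the usual and-mask/zero-extension
identity, applied through the gather. A scalar model (plain C++, for
illustration only):

  #include <cassert>
  #include <cstdint>

  int main() {
    // An i8 element (-128) any-extended into a wider lane; the bits above
    // bit 7 are unspecified, here all ones.
    uint64_t Loaded = 0xFFFFFFFFFFFFFF80;
    uint64_t Masked = Loaded & 0xFF; // (and (masked_gather x), splat(0xFF))
    assert(Masked == 0x80);          // == (zext_masked_gather x), per lane
    return 0;
  }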

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92230

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-signed-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-32b-unsigned-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-scaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index ce4ee89103ce..212e0a2ea988 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -932,6 +932,33 @@ bool DAGCombiner::isOneUseSetCC(SDValue N) const {
   return false;
 }
 
+static bool isConstantSplatVectorMaskForType(SDNode *N, EVT ScalarTy) {
+  if (!ScalarTy.isSimple())
+    return false;
+
+  uint64_t MaskForTy = 0ULL;
+  switch (ScalarTy.getSimpleVT().SimpleTy) {
+  case MVT::i8:
+    MaskForTy = 0xFFULL;
+    break;
+  case MVT::i16:
+    MaskForTy = 0xFFFFULL;
+    break;
+  case MVT::i32:
+    MaskForTy = 0xFFFFFFFFULL;
+    break;
+  default:
+    return false;
+    break;
+  }
+
+  APInt Val;
+  if (ISD::isConstantSplatVector(N, Val))
+    return Val.getLimitedValue() == MaskForTy;
+
+  return false;
+}
+
 // Returns the SDNode if it is a constant float BuildVector
 // or constant float.
 static SDNode *isConstantFPBuildVectorOrConstantFP(SDValue N) {
@@ -5622,6 +5649,28 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
 }
   }
 
+  // fold (and (masked_gather x)) -> (zext_masked_gather x)
+  if (auto *GN0 = dyn_cast<MaskedGatherSDNode>(N0)) {
+    EVT MemVT = GN0->getMemoryVT();
+    EVT ScalarVT = MemVT.getScalarType();
+
+    if (SDValue(GN0, 0).hasOneUse() &&
+        isConstantSplatVectorMaskForType(N1.getNode(), ScalarVT) &&
+        TLI.isVectorLoadExtDesirable(SDValue(SDValue(GN0, 0)))) {
+      SDValue Ops[] = {GN0->getChain(),   GN0->getPassThru(), GN0->getMask(),
+                       GN0->getBasePtr(), GN0->getIndex(),    GN0->getScale()};
+
+      SDValue ZExtLoad = DAG.getMaskedGather(
+          DAG.getVTList(VT, MVT::Other), MemVT, SDLoc(N), Ops,
+          GN0->getMemOperand(), GN0->getIndexType(), ISD::ZEXTLOAD);
+
+      CombineTo(N, ZExtLoad);
+      AddToWorklist(ZExtLoad.getNode());
+      // Avoid recheck of N.
+      return SDValue(N, 0);
+    }
+  }
+
   // fold (and (load x), 255) -> (zextload x, i8)
   // fold (and (extload x, i16), 255) -> (zextload x, i8)
   // fold (and (any_ext (extload x, i16)), 255) -> (zextload x, i8)
@@ -11597,6 +11646,25 @@ SDValue DAGCombiner::visitSIGN_EXTEND_INREG(SDNode *N) 
{
 }
   }
 
+  // fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x)
+  if (auto *GN0 = dyn_cast<MaskedGatherSDNode>(N0)) {
+    if (SDValue(GN0, 0).hasOneUse() &&
+        ExtVT == GN0->getMemoryVT() &&
+        TLI.isVectorLoadExtDesirable(SDValue(SDValue(GN0, 0)))) {
+      SDValue Ops[] = {GN0->getChain(),   GN0->getPassThru(), GN0->getMask(),
+                       GN0->getBasePtr(), GN0->getIndex(),    GN0->getScale()};
+
+      SDValue ExtLoad = DAG.getMaskedGather(
+          DAG.getVTList(VT, MVT::Other), ExtVT, SDLoc(N), Ops,
+          GN0->getMemOperand(), GN0->getIndexType(), ISD::SEXTLOAD);
+
+      CombineTo(N, ExtLoad);
+      CombineTo(N0.getNode(), ExtLoad, ExtLoad.getValue(1));
+      AddToWorklist(ExtLoad.getNode());
+      return SDValue(N, 0); // Return N so it doesn't get rechecked!
+    }
+  }
+
   // Form (sext_inreg (bswap >> 16)) or (sext_inreg (rotl (bswap) 16))
   if (ExtVTBits <= 16 && N0.getOpcode() == ISD::OR) {
 if (SDValue BSwap = MatchBSwapHWordLow(N0.getNode(), N0.getOperand(0),

diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 20f5ded99350..5d9c66e170ea 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

[llvm-branch-commits] [llvm] abe7775 - [SVE][CodeGen] Extend index of masked gathers

2020-12-10 Thread Kerry McLaughlin via llvm-branch-commits

Author: Kerry McLaughlin
Date: 2020-12-10T13:54:45Z
New Revision: abe7775f5a43e5a0d8ec237542274ba3e73937e4

URL: 
https://github.com/llvm/llvm-project/commit/abe7775f5a43e5a0d8ec237542274ba3e73937e4
DIFF: 
https://github.com/llvm/llvm-project/commit/abe7775f5a43e5a0d8ec237542274ba3e73937e4.diff

LOG: [SVE][CodeGen] Extend index of masked gathers

This patch changes performMSCATTERCombine to also promote the indices of
masked gathers where the element type is i8 or i16, and adds various tests
for gathers with illegal types.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91433

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 5d9c66e170eab..01301abf10e3d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -849,6 +849,7 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
   if (Subtarget->supportsAddressTopByteIgnored())
 setTargetDAGCombine(ISD::LOAD);
 
+  setTargetDAGCombine(ISD::MGATHER);
   setTargetDAGCombine(ISD::MSCATTER);
 
   setTargetDAGCombine(ISD::MUL);
@@ -14063,20 +14064,19 @@ static SDValue performSTORECombine(SDNode *N,
   return SDValue();
 }
 
-static SDValue performMSCATTERCombine(SDNode *N,
+static SDValue performMaskedGatherScatterCombine(SDNode *N,
   TargetLowering::DAGCombinerInfo &DCI,
   SelectionDAG &DAG) {
-  MaskedScatterSDNode *MSC = cast<MaskedScatterSDNode>(N);
-  assert(MSC && "Can only combine scatter store nodes");
+  MaskedGatherScatterSDNode *MGS = cast<MaskedGatherScatterSDNode>(N);
+  assert(MGS && "Can only combine gather load or scatter store nodes");
 
-  SDLoc DL(MSC);
-  SDValue Chain = MSC->getChain();
-  SDValue Scale = MSC->getScale();
-  SDValue Index = MSC->getIndex();
-  SDValue Data = MSC->getValue();
-  SDValue Mask = MSC->getMask();
-  SDValue BasePtr = MSC->getBasePtr();
-  ISD::MemIndexType IndexType = MSC->getIndexType();
+  SDLoc DL(MGS);
+  SDValue Chain = MGS->getChain();
+  SDValue Scale = MGS->getScale();
+  SDValue Index = MGS->getIndex();
+  SDValue Mask = MGS->getMask();
+  SDValue BasePtr = MGS->getBasePtr();
+  ISD::MemIndexType IndexType = MGS->getIndexType();
 
   EVT IdxVT = Index.getValueType();
 
@@ -14086,16 +14086,27 @@ static SDValue performMSCATTERCombine(SDNode *N,
 if ((IdxVT.getVectorElementType() == MVT::i8) ||
 (IdxVT.getVectorElementType() == MVT::i16)) {
   EVT NewIdxVT = IdxVT.changeVectorElementType(MVT::i32);
-  if (MSC->isIndexSigned())
+  if (MGS->isIndexSigned())
 Index = DAG.getNode(ISD::SIGN_EXTEND, DL, NewIdxVT, Index);
   else
 Index = DAG.getNode(ISD::ZERO_EXTEND, DL, NewIdxVT, Index);
 
-  SDValue Ops[] = { Chain, Data, Mask, BasePtr, Index, Scale };
-  return DAG.getMaskedScatter(DAG.getVTList(MVT::Other),
-  MSC->getMemoryVT(), DL, Ops,
-  MSC->getMemOperand(), IndexType,
-  MSC->isTruncatingStore());
+  if (auto *MGT = dyn_cast<MaskedGatherSDNode>(MGS)) {
+SDValue PassThru = MGT->getPassThru();
+SDValue Ops[] = { Chain, PassThru, Mask, BasePtr, Index, Scale };
+return DAG.getMaskedGather(DAG.getVTList(N->getValueType(0), 
MVT::Other),
+   PassThru.getValueType(), DL, Ops,
+   MGT->getMemOperand(),
+   MGT->getIndexType(), 
MGT->getExtensionType());
+  } else {
+auto *MSC = cast<MaskedScatterSDNode>(MGS);
+SDValue Data = MSC->getValue();
+SDValue Ops[] = { Chain, Data, Mask, BasePtr, Index, Scale };
+return DAG.getMaskedScatter(DAG.getVTList(MVT::Other),
+MSC->getMemoryVT(), DL, Ops,
+MSC->getMemOperand(), IndexType,
+MSC->isTruncatingStore());
+  }
 }
   }
 
@@ -15072,9 +15083,6 @@ static SDValue performGatherLoadCombine(SDNode *N, 
SelectionDAG &DAG,
 static SDValue
 performSignExtendInRegCombine(SDNode *N, TargetLowering::DAGCombinerInfo &DCI,
   SelectionDAG &DAG) {
-  if (DCI.isBeforeLegalizeOps())
-return SDValue();
-
   SDLoc DL(N);
   SDValue Src = N->getOperand(0);
   unsigned Opc = Src->getOpcode();
@@ -15109,6 +15117,9 @@ performSignExtendInRegCombine(SDNode *N, 
TargetLowering::DAGCombinerInfo &DCI,
 return DAG.getNode(SOpc, DL, N->getValueType(0), Ext);
   }
 
+  if (DCI.isBeforeLegalizeOps())
+return SDValue();
+
   if (!EnableCombineMGatherIntrinsics)
 return SDValue();
 
@@ -15296,8 +15307,9 @@ SDValue AArch64TargetLowering::PerformDAGCom

[llvm-branch-commits] [llvm] Cherry pick f314e12 into release/19.x (PR #117695)

2024-11-26 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

Please can you include 
https://github.com/llvm/llvm-project/commit/9211977d134d81cbc7a24f13e244334484c31b87
 with these commits? This was another fix using requiresSaveVG which I think 
should be cherry-picked.

https://github.com/llvm/llvm-project/pull/117695
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Cherry pick f314e12 into release/19.x (PR #117695)

2024-11-26 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/117695
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits