[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-05 Thread Ting Wang via Phabricator via cfe-commits
tingwang created this revision.
tingwang added reviewers: uweigand, wschmidt, PowerPC.
tingwang added a project: clang.
Herald added subscribers: shchenz, kbarton, nemanjai.
Herald added a project: All.
tingwang requested review of this revision.
Herald added a subscriber: cfe-commits.

This is an attempt to fix issue: 
https://github.com/llvm/llvm-project/issues/55900

The PPC64 SVR4 ABI code passes a by-value aggregate that fits in one register
as a coerced integer type:
https://github.com/llvm/llvm-project/blob/51d33afcbe0a81bb8508d5685f38dc9fdb2b60c9/clang/lib/CodeGen/TargetInfo.cpp#L5351

In the reported case, the aggregate is passed as an `i8` parameter. On
big-endian targets, once the register content is stored to memory, the char
sits at byte offset 7 of its slot. However, the current
`PPC64_SVR4_ABIInfo::EmitVAArg()` accesses the argument using the original
type, so the caller and the callee disagree on the type.
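
As a rough illustration of the byte placement described above, the following sketch models storing a 64-bit register image into an 8-byte parameter slot. The helper names `storeRegister` and `indexOfLowByte` are made up for this example and are not part of clang; the sketch only assumes that the coerced `i8` occupies the low-order bits of the register.

```cpp
#include <cstddef>
#include <cstdint>

// Store a 64-bit register image into an 8-byte slot in the given byte order.
// (Hypothetical helper, for illustration only.)
void storeRegister(std::uint64_t reg, bool bigEndian, unsigned char slot[8]) {
  for (std::size_t i = 0; i < 8; ++i) {
    // Big-endian puts the most significant byte first, little-endian last.
    std::size_t shift = bigEndian ? 8 * (7 - i) : 8 * i;
    slot[i] = static_cast<unsigned char>(reg >> shift);
  }
}

// Return the byte index inside the slot where a value held in the low 8 bits
// of the register ends up after the store.
std::size_t indexOfLowByte(bool bigEndian) {
  unsigned char slot[8];
  storeRegister(0x41, bigEndian, slot);
  for (std::size_t i = 0; i < 8; ++i)
    if (slot[i] == 0x41)
      return i;
  return 8; // not reached for the input above
}
```

With this model, `indexOfLowByte(true)` gives 7 and `indexOfLowByte(false)` gives 0, which matches the `i64 7` offset checked for the big-endian triple and the absence of any offset for the little-endian one.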

This patch teaches `PPC64_SVR4_ABIInfo::EmitVAArg()` about the type
coercion. I'm not sure whether this should be fixed in clang or in the backend,
but clang seems more likely, since it already has logic that handles arguments
smaller than a slot:
https://github.com/llvm/llvm-project/blob/51d33afcbe0a81bb8508d5685f38dc9fdb2b60c9/clang/lib/CodeGen/TargetInfo.cpp#L356

Please review and let me know if you have any comments. Thank you!


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 -no-opaque-pointers -target-feature +altivec -triple powerpc64-unknown-linux-gnu -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -no-opaque-pointers -target-feature +altivec -triple powerpc64le-unknown-linux-gnu -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-LE
 
 #include <stdarg.h>
 
@@ -9,6 +10,7 @@
 struct test5 { int x[17]; };
 struct test6 { int x[17]; } __attribute__((aligned (16)));
 struct test7 { int x[17]; } __attribute__((aligned (32)));
+struct test8 { char x; };
 
 // CHECK: define{{.*}} void @test1(i32 noundef signext %x, i64 %y.coerce)
 void test1 (int x, struct test1 y)
@@ -132,20 +134,17 @@
 // CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
 // CHECK: store i8* %[[NEXT]], i8** %ap
-// CHECK: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
+// CHECK: [[SRC:%.*]] = getelementptr inbounds i8, i8* %[[CUR]], i64 7
 // CHECK: [[DEST:%.*]] = bitcast %struct.test8* %[[AGG_RESULT]] to i8*
-// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 1 [[SRC]], i64 1, i1 false)
 
 // CHECK-LE: define{{.*}} i8 @test8va(i32 noundef signext %x, ...)
 // CHECK-LE: [[RETVAL:%.*]] = alloca %struct.test8
 // CHECK-LE: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK-LE: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
 // CHECK-LE: store i8* %[[NEXT]], i8** %ap
-// CHECK-LE: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
 // CHECK-LE: [[DEST:%.*]] = bitcast %struct.test8* [[RETVAL]] to i8*
-// CHECK-LE: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
-// CHECK-LE: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+// CHECK-LE: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 %[[CUR]], i64 1, i1 false)
 // CHECK-LE: [[COERCE:%.*]] = getelementptr inbounds %struct.test8, %struct.test8* [[RETVAL]], i32 0, i32 0
 // CHECK-LE: [[RET:%.*]] = load i8, i8* [[COERCE]], align 1
 // CHECK-LE: ret i8 [[RET]]
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -5451,6 +5451,22 @@
   return complexTempStructure(CGF, VAListAddr, Ty, SlotSize, EltSize, CTy);
   }
 
+  // An aggregate may end up coerced to integer type in single register. When
+  // DirectSize is less than SlotSize on big-endian, need to use coerced type so
+  // that the argument will be right-adjusted in its slot.
+  ABIArgInfo AI = classifyArgumentType(Ty);
+  if (AI.isDirect() && AI.getCoerceToType()) {
+    llvm::Type *CoerceTy = AI.getCoerceToType();
+    if (CoerceTy->isIntegerTy() &&
+        llvm::alignTo(CoerceTy->getIntegerBitWidth(), 8) < GPRBits)
+      return emitVoidPtrDirectVAArg(
+          CGF, VAListAddr, CoerceTy,
+          CharUnits::fromQuantity(
+              llvm::alignTo(CoerceTy->getIntegerBitWidth(), 8) / 8),
+          CharUnits::fromQuantity(AI.getDirectAlign()), SlotSize,
+  /*

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-07 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

> It looks like the only change needed for ppc would be to remove the 
> `!DirectTy->isStructTy()` check here?   (I guess to avoid inadvertently 
> change other targets, this might need to be triggered by a flag passed as 
> argument.  On the other hand, maybe there is no other big-endian platform 
> using `emitVoidPtrVAArg` anyway?)

Thank you! I looked into the history a bit: the `!DirectTy->isStructTy()`
check was added specifically in https://reviews.llvm.org/D21611. I will update
the patch to add a flag for PPC64, passed as an argument.
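
The flag idea mentioned above could be modeled roughly as follows. This is only a sketch: `ArgInfo`, `slotOffset`, and `forceRightAdjust` are illustrative names chosen here, and the condition approximates the right-adjust check around `!DirectTy->isStructTy()` rather than reproducing clang's code.

```cpp
#include <cstdint>

// Hypothetical stand-in for the properties the right-adjust check looks at.
struct ArgInfo {
  std::uint64_t directSize; // size of the directly passed value, in bytes
  bool isStructTy;          // whether the value's IR type is a struct
};

// Byte offset of the argument inside its slot. Struct types are left-adjusted
// unless the caller opts in through forceRightAdjust (assumed flag).
std::uint64_t slotOffset(const ArgInfo &arg, std::uint64_t slotSize,
                         bool bigEndian, bool forceRightAdjust = false) {
  bool rightAdjust = arg.directSize < slotSize && bigEndian &&
                     (!arg.isStructTy || forceRightAdjust);
  return rightAdjust ? slotSize - arg.directSize : 0;
}
```

Because the extra parameter defaults to `false`, targets that never pass it would keep their current offsets, while a PPC64 caller could opt in and get offset 7 for a one-byte struct in an 8-byte slot.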

While investigating, I noticed an issue caused by the transition to opaque
pointers in the test case introduced by D21611: its CHECK-NOT patterns contain
typed pointers, which can never match the opaque pointers that clang now
generates. These checks are therefore ineffective, and the risk remains until
the test cases are updated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133488: [clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot

2022-09-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang created this revision.
tingwang added reviewers: uweigand, wschmidt, PowerPC.
tingwang added a project: clang.
Herald added subscribers: shchenz, kbarton, nemanjai.
Herald added a project: All.
tingwang requested review of this revision.
Herald added a subscriber: cfe-commits.

Add base test case for https://reviews.llvm.org/D133338.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133488

Files:
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -9,6 +9,7 @@
 struct test5 { int x[17]; };
 struct test6 { int x[17]; } __attribute__((aligned (16)));
 struct test7 { int x[17]; } __attribute__((aligned (32)));
+struct test8 { char x; };
 
 // CHECK: define{{.*}} void @test1(i32 noundef signext %x, i64 %y.coerce)
 void test1 (int x, struct test1 y)
@@ -128,6 +129,25 @@
   return y;
 }
 
+// Error pattern will be fixed in https://reviews.llvm.org/D133338
+// CHECK: define{{.*}} void @test8va(%struct.test8* noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
+// CHECK: store i8* %[[NEXT]], i8** %ap
+// CHECK: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
+// CHECK: [[DEST:%.*]] = bitcast %struct.test8* %[[AGG_RESULT]] to i8*
+// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+struct test8 test8va (int x, ...)
+{
+  struct test8 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test8);
+  va_end(ap);
+  return y;
+}
+
 // CHECK: define{{.*}} void @testva_longdouble(%struct.test_longdouble* noalias sret(%struct.test_longdouble) align 16 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 16


[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 458721.
tingwang added a comment.

Updated according to the comments:
(1) The argument type is no longer touched.
(2) Added a flag to the APIs to force right-adjusting the parameter for this issue.
(3) One change to the `if` check: `CoerceTy->getIntegerBitWidth()` is now
compared directly with `GPRBits`; previously the check used
`llvm::alignTo(CoerceTy->getIntegerBitWidth(), 8)`. I think `llvm::alignTo()` is
redundant here.
(4) Created an NFC patch to pre-commit the test case.
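
To make point (3) concrete, here is a small sketch that compares the two forms of the check. It assumes that the coerced integer width is always a multiple of 8 bits, which is what makes the rounding redundant; `alignUp`, `checksAgreeForByteWidths`, and `countDisagreements` are names invented for this example, with `alignUp` standing in for the integer rounding that `llvm::alignTo` performs.

```cpp
#include <cstdint>

// Round value up to the next multiple of align (align must be non-zero).
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// True when "alignUp(bits, 8) < gprBits" and "bits < gprBits" agree for every
// width that is a multiple of 8, up to and including gprBits.
bool checksAgreeForByteWidths(std::uint64_t gprBits) {
  for (std::uint64_t bits = 8; bits <= gprBits; bits += 8)
    if ((alignUp(bits, 8) < gprBits) != (bits < gprBits))
      return false;
  return true;
}

// Number of widths in [1, gprBits] for which the two forms disagree.
unsigned countDisagreements(std::uint64_t gprBits) {
  unsigned count = 0;
  for (std::uint64_t bits = 1; bits <= gprBits; ++bits)
    if ((alignUp(bits, 8) < gprBits) != (bits < gprBits))
      ++count;
  return count;
}
```

For `gprBits == 64` the two forms agree on every byte-sized width; they would differ only for widths 57 through 63, which the assumption above rules out.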


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c

Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -129,15 +129,15 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D133338
 // CHECK: define{{.*}} void @test8va(%struct.test8* noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
 // CHECK: store i8* %[[NEXT]], i8** %ap
-// CHECK: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, i8* %[[CUR]], i64 7
+// CHECK: [[T1:%.*]] = bitcast i8* [[T0]] to %struct.test8*
 // CHECK: [[DEST:%.*]] = bitcast %struct.test8* %[[AGG_RESULT]] to i8*
-// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T1]] to i8*
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 1 [[SRC]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,17 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
                                       Address VAListAddr,
                                       llvm::Type *DirectTy,
                                       CharUnits DirectSize,
                                       CharUnits DirectAlign,
                                       CharUnits SlotSize,
-                                      bool AllowHigherAlign) {
+                                      bool AllowHigherAlign,
+                                      bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +358,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-      !DirectTy->isStructTy()) {
+      (!DirectTy->isStructTy() || ForceRightAdjust)) {
     Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +379,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
                                 QualType ValueTy, bool IsIndirect,
                                 TypeInfoChars ValueInfo,
                                 CharUnits SlotSizeAndAlign,
-                                bool AllowHigherAlign) {
+                                bool AllowHigherAlign,
+                                bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +403,9 @@
   if (IsIndirect)
     DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-      emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
-                             SlotSizeAndAlign, AllowHighe

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 458735.
tingwang added a comment.

Updated according to the comments:
Removed all of those checks, since they are redundant.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -129,15 +129,15 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D133338
 // CHECK: define{{.*}} void @test8va(%struct.test8* noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
 // CHECK: store i8* %[[NEXT]], i8** %ap
-// CHECK: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, i8* %[[CUR]], i64 7
+// CHECK: [[T1:%.*]] = bitcast i8* [[T0]] to %struct.test8*
 // CHECK: [[DEST:%.*]] = bitcast %struct.test8* %[[AGG_RESULT]] to i8*
-// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T1]] to i8*
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 1 [[SRC]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,17 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
                                       Address VAListAddr,
                                       llvm::Type *DirectTy,
                                       CharUnits DirectSize,
                                       CharUnits DirectAlign,
                                       CharUnits SlotSize,
-                                      bool AllowHigherAlign) {
+                                      bool AllowHigherAlign,
+                                      bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +358,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-      !DirectTy->isStructTy()) {
+      (!DirectTy->isStructTy() || ForceRightAdjust)) {
     Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +379,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
                                 QualType ValueTy, bool IsIndirect,
                                 TypeInfoChars ValueInfo,
                                 CharUnits SlotSizeAndAlign,
-                                bool AllowHigherAlign) {
+                                bool AllowHigherAlign,
+                                bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +403,9 @@
   if (IsIndirect)
     DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-      emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
-                             SlotSizeAndAlign, AllowHigherAlign);
+  Address Addr = emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize,
+                                        DirectAlign, SlotSizeAndAlign,
+                                        AllowHigherAlign, ForceRightAdjust);
 
   if (IsIndirect) {
     Addr = Address(CGF.Builder.CreateLoad(Addr), ElementTy, ValueIn

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang added inline comments.



Comment at: clang/lib/CodeGen/TargetInfo.cpp:5471
+    if (CoerceTy->isIntegerTy() && CoerceTy->getIntegerBitWidth() < GPRBits)
+      ForceRightAdjust = true;
+  }
+  }

uweigand wrote:
> Are all these checks really necessary here?  This seems to duplicate the 
> checks that are already in `emitVoidPtrVAArg` ...Can't we simply always 
> pass `true` for `ForceRightAdjust` on PowerPC?
Ah, yes. All these checks are redundant; this essentially restores the behavior
from before the commit that broke this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338



[PATCH] D133749: [clang][NFC] Update test case struct-union-BE.c for opaque-pointers

2022-09-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang created this revision.
tingwang added reviewers: dsanders, rjmccall, spetrovic, vkalintiris, 
john.brawn, petarj.
tingwang added a project: clang.
Herald added a project: All.
tingwang requested review of this revision.
Herald added a subscriber: cfe-commits.

Update the patterns in this test case to match opaque pointers; otherwise the
test cannot catch errors.

This was spotted while investigating https://reviews.llvm.org/D133338: after
`!DirectTy->isStructTy()` was removed during testing, this test case still passed.
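
The effect can be pictured with a toy model of a negative check. This sketch is not FileCheck: real patterns are regular expressions, and `checkNotPasses` plus the three constants below are hypothetical names that use plain substring matching on a single made-up IR line.

```cpp
#include <string>

// Toy stand-in for a "-NOT" directive: it passes when the forbidden text does
// not occur anywhere in the compiler output.
bool checkNotPasses(const std::string &output, const std::string &forbidden) {
  return output.find(forbidden) == std::string::npos;
}

// Hypothetical IR line that a wrongly right-adjusting compiler would emit for
// MIPS64 when opaque pointers are enabled.
const std::string kRegressedOutput =
    "%0 = getelementptr inbounds i8, ptr %argp.cur, i64 7";

// Pattern text before and after the update (regex prefix omitted).
const std::string kTypedPattern =
    "getelementptr inbounds i8, i8* %argp.cur, i64 7";
const std::string kOpaquePattern =
    "getelementptr inbounds i8, ptr %argp.cur, i64 7";
```

In this model `checkNotPasses(kRegressedOutput, kTypedPattern)` is true, so the stale pattern lets the regression through, while the same call with `kOpaquePattern` is false and would flag it.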


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133749

Files:
  clang/test/CodeGen/struct-union-BE.c


Index: clang/test/CodeGen/struct-union-BE.c
===
--- clang/test/CodeGen/struct-union-BE.c
+++ clang/test/CodeGen/struct-union-BE.c
@@ -22,9 +22,9 @@
   if (x.c !=  10)
 abort();
   va_end (ap);
-// MIPS-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i32 3
-// MIPS64-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i64 7
-// ARM-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i32 3
+// MIPS-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i32 3
+// MIPS64-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i64 7
+// ARM-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i32 3
 }
 
 void funi(int n, ...) {
@@ -35,9 +35,9 @@
   if (x.c !=  10)
 abort();
   va_end (ap);
-// MIPS-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i32 3
-// MIPS64-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i64 7
-// ARM-NOT: %{{[0-9]+}} = getelementptr inbounds i8, i8* %argp.cur, i32 3
+// MIPS-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i32 3
+// MIPS64-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i64 7
+// ARM-NOT: %{{[0-9]+}} = getelementptr inbounds i8, ptr %argp.cur, i32 3
 }
 
 void foo(void) {




[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

(Inviting the reviewers from D21611 to look into the change here.)

The expected pattern for `test8va()` added in this patch was broken by D21611.
As suggested by Ulrich, we are now planning to add a flag for the original
case. I have verified that the change works with all kinds of aggregates
smaller than SlotSize on PPC64. (If necessary, I can add test cases borrowed
from clang/test/CodeGen/PowerPC/ppc-aggregate-abi.cpp.) Let me know if you have
any comments. Thank you!
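
A runtime check along the lines of the verification mentioned above might look like the following sketch. It is not taken from the patch or from ppc-aggregate-abi.cpp; the `Small1`, `Small3`, and `Small7` types and the helper names are hypothetical, and the code simply passes small aggregates through a variadic call and compares what `va_arg` returns.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstring>

// Hypothetical aggregates that are smaller than an 8-byte parameter slot.
struct Small1 { char c[1]; };
struct Small3 { char c[3]; };
struct Small7 { char c[7]; };

// Fill an aggregate with a recognizable byte pattern: 'a', 'b', 'c', ...
template <typename T> T makePattern() {
  T value;
  for (std::size_t i = 0; i < sizeof(value.c); ++i)
    value.c[i] = static_cast<char>('a' + i);
  return value;
}

// Read one T back through va_arg and compare it with the expected pattern.
template <typename T> bool roundTrips(int count, ...) {
  va_list ap;
  va_start(ap, count);
  T received = va_arg(ap, T);
  va_end(ap);
  T expected = makePattern<T>();
  return std::memcmp(&received, &expected, sizeof(T)) == 0;
}

// True when every sample aggregate survives the variadic call unchanged.
bool allSmallAggregatesRoundTrip() {
  return roundTrips<Small1>(1, makePattern<Small1>()) &&
         roundTrips<Small3>(1, makePattern<Small3>()) &&
         roundTrips<Small7>(1, makePattern<Small7>());
}
```

On a compiler where the caller and callee agree on the slot layout, `allSmallAggregatesRoundTrip()` returns true; a mismatch of the kind discussed in this thread would be expected to show up as a failed comparison on a big-endian target.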


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-06-27 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

Gentle ping.

Verified that the patch works with the latest code base; all tests are green.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-07-05 Thread Ting Wang via Phabricator via cfe-commits
tingwang added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:522
   EmitCXXThreadLocalInitFunc();
+  if (getTriple().isOSAIX()) {
+    genAssocMeta();

shchenz wrote:
> Seems this does not follow other functions call's style. Can we call a 
> function like `EmitAssociatedMetadata()` here and do the clean up 
> (`cleanupAssoc()` may not be needed) in the `EmitAssociatedMetadata()`? 
Thanks! I will update the code.



Comment at: clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp:2
+// RUN: %clang_cc1 -triple powerpc64-ibm-aix-xcoff -emit-llvm -O3 -x c++ \
+// RUN: -debug-info-kind=limited < %s | \
+// RUN:   FileCheck %s

shchenz wrote:
> is `-debug-info-kind=limited` or `-O3` necessary in this test? Same as other 
> new added cases.
Oh, "-debug-info-kind=limited" is not required. I will remove those. The "-O3" 
flag is required to show that associated metadata can be `nullptr`. Without 
"-O3", normal llvm.global_ctors will be generated.



Comment at: clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp:10
+// CHECK: @r = thread_local global ptr null, align [[ALIGN:[0-9]+]], !dbg ![[DBG0:[0-9]+]], !associated ![[ASSOC0:[0-9]+]]
+// CHECK: ![[ASSOC0]] = !{ptr @__tls_init}

shchenz wrote:
> Not sure if this is right or not. XLC on AIX seems refer to `__tls_get_addr` 
> instead  of `__tls_init`...
I saw a case where AIX generated a TLS-related association, and I assumed
that the association should be linked to `__tls_init`. I will revisit my case
and either follow up with more information or correct the association later.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-07-05 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 442419.
tingwang added a comment.

Update according to comments:
(1) Merged cleanupAssoc() into genAssocMeta(), and renamed genAssocMeta() to 
EmitAssociatedMetadata().
(2) Removed "-debug-info-kind=limited" from all test cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7091,6 +7091,10 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{i32* @a}
 
+On XCOFF target, the ``associated`` metadata indicates connection among static
+variables (static global variable, static class member etc.) and static init/
+term functions. This metadata lowers to ``.ref`` assembler pseudo-operation
+which prevents discarding of the functions in linker GC.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I__, i8* null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__D_a, i8* null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(%"struct.test3::Test3"* @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(%"struct.test4::Test4"* @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{void ()* @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{void ()* @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = linkonce_odr global %"struct.test1::Test1"

[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-07-06 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 442773.
tingwang added a comment.

Drop the TLS-related .ref for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7091,6 +7091,10 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{i32* @a}
 
+On XCOFF target, the ``associated`` metadata indicates connection among static
+variables (static global variable, static class member etc.) and static init/
+term functions. This metadata lowers to ``.ref`` assembler pseudo-operation
+which prevents discarding of the functions in linker GC.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I__, i8* null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__D_a, i8* null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(%"struct.test3::Test3"* @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(%"struct.test4::Test4"* @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{void ()* @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{void ()* @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = linkonce_odr global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
 // CHECK: @_ZGVN5test12t2E = linkonce_odr global i64 0, align 8
+// CHECK: @_ZN5test12t1IiEE = linkonce_odr global %"struct.test1::Test1" zeroinit

[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-07-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 443201.
tingwang added a comment.

Add guards against TLS variables.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7091,6 +7091,10 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{i32* @a}
 
+On XCOFF target, the ``associated`` metadata indicates connection among static
+variables (static global variable, static class member etc.) and static init/
+term functions. This metadata lowers to ``.ref`` assembler pseudo-operation
+which prevents discarding of the functions in linker GC.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I__, i8* null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__D_a, i8* null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(%"struct.test3::Test3"* @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(%"struct.test4::Test4"* @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{void ()* @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{void ()* @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = linkonce_odr global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
 // CHECK: @_ZGVN5test12t2E = linkonce_odr global i64 0, align 8
+// CHECK: @_ZN5test12t1IiEE = linkonce_odr global %"struct.test1::Test1" zero

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-25 Thread Ting Wang via Phabricator via cfe-commits
tingwang created this revision.
tingwang added reviewers: PowerPC, jsji, nemanjai, shchenz.
tingwang added a project: LLVM.
Herald added subscribers: kbarton, hiraditya.
Herald added a project: All.
tingwang requested review of this revision.
Herald added a project: clang.
Herald added subscribers: llvm-commits, cfe-commits.

Add support for __builtin_[max|min], which have the following prototype:
A __builtin_max (A1, A2, A3, ...)
All arguments must have the same type, which must be float, double, or long 
double.
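
As a rough illustration of the argument rule above (same type for every
argument, and only float, double, or long double), here is a minimal C++
sketch of a compile-time check. The names is_supported_fp, uniform_fp_args,
and args_are_valid are hypothetical and exist only for this example; they are
not part of the patch, and the real check lives in clang's Sema.

```cpp
#include <cassert>      // only needed for the runtime checks in the usage note
#include <type_traits>

// True only for the three types the description lists (hypothetical helper).
template <typename T>
struct is_supported_fp
    : std::integral_constant<bool, std::is_same<T, float>::value ||
                                       std::is_same<T, double>::value ||
                                       std::is_same<T, long double>::value> {};

// Single-type case: valid when the type is one of the supported ones.
template <typename First, typename... Rest>
struct uniform_fp_args : is_supported_fp<First> {};

// Two or more types: the first two must match, then the tail is checked.
template <typename First, typename Second, typename... Rest>
struct uniform_fp_args<First, Second, Rest...>
    : std::integral_constant<bool,
                             std::is_same<First, Second>::value &&
                                 uniform_fp_args<Second, Rest...>::value> {};

// Deduces the argument types from values, for convenience.
template <typename... Args>
constexpr bool args_are_valid(Args...) {
  return uniform_fp_args<Args...>::value;
}

static_assert(uniform_fp_args<double, double, double>::value,
              "three doubles are accepted");
static_assert(!uniform_fp_args<float, double>::value,
              "mixed float/double is rejected");
```

With this sketch, args_are_valid(1.0, 2.0, 3.0) is true, while
args_are_valid(1.0f, 2.0) and args_are_valid(1, 2, 3) are false.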

Internally this uses SelectCC to compute the result, and it depends on D122462 
to work properly for ppcf128.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll

Index: llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
@@ -0,0 +1,150 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-linux < %s | FileCheck %s
+
+declare ppc_fp128 @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_maxfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB0_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB0_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB0_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .LBB0_6: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 8
+; CHECK-NEXT:bc 12, 20, .LBB0_8
+; CHECK-NEXT:  # %bb.7: # %entry
+; CHECK-NEXT:fmr 5, 1
+; CHECK-NEXT:  .LBB0_8: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 7
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_10
+; CHECK-NEXT:  # %bb.9: # %entry
+; CHECK-NEXT:fmr 5, 7
+; CHECK-NEXT:  .LBB0_10: # %entry
+; CHECK-NEXT:bc 12, 20, .LBB0_12
+; CHECK-NEXT:  # %bb.11: # %entry
+; CHECK-NEXT:fmr 6, 8
+; CHECK-NEXT:  .LBB0_12: # %entry
+; CHECK-NEXT:fmr 1, 5
+; CHECK-NEXT:fmr 2, 6
+; CHECK-NEXT:blr
+entry:
+  %0 = call ppc_fp128 (ppc_fp128, ppc_fp128, ppc_fp128, ...) @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d)
+  ret ppc_fp128 %0
+}
+
+declare double @llvm.ppc.maxfl(double %a, double %b, double %c, ...)
+define double @test_maxfl(double %a, double %b, double %c, double %d) {
+; CHECK-LABEL: test_maxfl:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.maxfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.maxfs(float %a, float %b, float %c, ...)
+define float @test_maxfs(float %a, float %b, float %c, float %d) {
+; CHECK-LABEL: test_maxfs:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call float (float, float, float, ...) @llvm.ppc.maxfs(float %a, float %b, float %c, float %d)
+  ret float %0
+}
+
+declare ppc_fp128 @llvm.ppc.minfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_minfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_minfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 0
+; CHECK-NEXT:cror 20, 4, 20
+; CHECK-NEXT:bc 12, 20, .LBB3_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB3_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB3_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB3_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 0
+; CHECK-NEXT:cror 20, 4, 20
+; CHECK-NEXT:bc 12, 20, .LBB3_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .L

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-25 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

Since this is compatibility support, I'm trying to match the results from XLC in 
scenarios involving all kinds of QNaN, SNaN, +/-Infinity, and +/-zero operands. 
Currently maxfl and maxfs still give different results from XLC in those 
scenarios. This is one thing I'm still looking into.
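
To make concrete why NaN and signed-zero operands are delicate for a
select-based max/min, here is a small C++ sketch. It uses the plain C++ `>` and
`<` operators as a stand-in for one select step; this is an assumption for
illustration only. It does not model what XLC produces, nor the exact NaN
treatment of the DAG condition codes. The names select_max, select_min,
kQuietNaN, and is_negative_zero are hypothetical.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <cmath>
#include <limits>

// One comparison step: keep `acc` only when the comparison is true.
// Any comparison involving NaN is false, so the result depends on which
// side the NaN is on.
inline double select_max(double acc, double x) { return acc > x ? acc : x; }
inline double select_min(double acc, double x) { return acc < x ? acc : x; }

// Convenience constant for the usage note below.
const double kQuietNaN = std::numeric_limits<double>::quiet_NaN();

// +0.0 and -0.0 compare equal, so the sign bit has to be inspected directly.
inline bool is_negative_zero(double v) { return v == 0.0 && std::signbit(v); }
```

With these definitions, select_max(kQuietNaN, 1.0) returns 1.0 while
select_max(1.0, kQuietNaN) returns NaN, and select_max(0.0, -0.0) returns -0.0
while select_max(-0.0, 0.0) returns +0.0. The operand order therefore changes
the result in exactly the corner cases mentioned above.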


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-27 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 418485.
tingwang added a comment.

Option -mlong-double-128 is currently not supported on AIX, and clang fails due 
to a type mismatch in the fe case. Add check logic to print a diagnostic 
message in this case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll

Index: llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
@@ -0,0 +1,150 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-linux < %s | FileCheck %s
+
+declare ppc_fp128 @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_maxfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB0_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB0_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB0_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .LBB0_6: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 8
+; CHECK-NEXT:bc 12, 20, .LBB0_8
+; CHECK-NEXT:  # %bb.7: # %entry
+; CHECK-NEXT:fmr 5, 1
+; CHECK-NEXT:  .LBB0_8: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 7
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_10
+; CHECK-NEXT:  # %bb.9: # %entry
+; CHECK-NEXT:fmr 5, 7
+; CHECK-NEXT:  .LBB0_10: # %entry
+; CHECK-NEXT:bc 12, 20, .LBB0_12
+; CHECK-NEXT:  # %bb.11: # %entry
+; CHECK-NEXT:fmr 6, 8
+; CHECK-NEXT:  .LBB0_12: # %entry
+; CHECK-NEXT:fmr 1, 5
+; CHECK-NEXT:fmr 2, 6
+; CHECK-NEXT:blr
+entry:
+  %0 = call ppc_fp128 (ppc_fp128, ppc_fp128, ppc_fp128, ...) @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d)
+  ret ppc_fp128 %0
+}
+
+declare double @llvm.ppc.maxfl(double %a, double %b, double %c, ...)
+define double @test_maxfl(double %a, double %b, double %c, double %d) {
+; CHECK-LABEL: test_maxfl:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.maxfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.maxfs(float %a, float %b, float %c, ...)
+define float @test_maxfs(float %a, float %b, float %c, float %d) {
+; CHECK-LABEL: test_maxfs:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call float (float, float, float, ...) @llvm.ppc.maxfs(float %a, float %b, float %c, float %d)
+  ret float %0
+}
+
+declare ppc_fp128 @llvm.ppc.minfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_minfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_minfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 0
+; CHECK-NEXT:cror 20, 4, 20
+; CHECK-NEXT:bc 12, 20, .LBB3_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB3_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB3_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB3_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 0
+; CHECK-NEXT:cror 20, 4, 20
+; CHECK-NEXT:bc 12, 20, .LBB3_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .LBB3_6: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 8
+; CHECK-NEXT:bc 12, 20, .LBB3_8
+; CHECK-NEXT:  # %bb.7: # %entry
+; CHECK-NEXT:fmr 5, 1
+; CHECK-NEXT:  .LBB3_8: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 7
+; CHECK-NEXT:crand 20, 6, 0
+; CHECK-NEXT:cror 20, 4, 20
+; CHECK-NEXT:bc 12, 20, .LBB3_10
+; CHECK-NEXT:  # %bb.

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-28 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 418509.
tingwang added a comment.

Update the test case to show that we need D122462 to fix the crash.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
  llvm/test/CodeGen/PowerPC/maxmin-select-cc.ll

Index: llvm/test/CodeGen/PowerPC/maxmin-select-cc.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/maxmin-select-cc.ll
@@ -0,0 +1,12 @@
+; REQUIRES: asserts
+; RUN: not --crash llc -verify-machineinstrs -mcpu=pwr9 -mtriple=powerpc64-unknown-unknown \
+; RUN:   < %s 2>&1 | FileCheck %s
+
+declare ppc_fp128 @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+entry:
+; CHECK: Do not know how to custom type legalize this operation!
+; CHECK: UNREACHABLE executed at {{.*}}
+  %0 = call ppc_fp128 (ppc_fp128, ppc_fp128, ppc_fp128, ...) @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d)
+  ret ppc_fp128 %0
+}
Index: llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
@@ -0,0 +1,54 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -mtriple=powerpc64le-unknown-linux < %s | FileCheck %s
+
+declare double @llvm.ppc.maxfl(double %a, double %b, double %c, ...)
+define double @test_maxfl(double %a, double %b, double %c, double %d) {
+; CHECK-LABEL: test_maxfl:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.maxfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.maxfs(float %a, float %b, float %c, ...)
+define float @test_maxfs(float %a, float %b, float %c, float %d) {
+; CHECK-LABEL: test_maxfs:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call float (float, float, float, ...) @llvm.ppc.maxfs(float %a, float %b, float %c, float %d)
+  ret float %0
+}
+
+declare double @llvm.ppc.minfl(double %a, double %b, double %c, ...)
+define double @test_minfl(double %a, double %b, double %c, double %d) {
+; CHECK-LABEL: test_minfl:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmincdp 0, 3, 2
+; CHECK-NEXT:xsmincdp 0, 0, 1
+; CHECK-NEXT:xsmincdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.minfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.minfs(float %a, float %b, float %c, ...)
+define float @test_minfs(float %a, float %b, float %c, float %d) {
+; CHECK-LABEL: test_minfs:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:xsmincdp 0, 3, 2
+; CHECK-NEXT:xsmincdp 0, 0, 1
+; CHECK-NEXT:xsmincdp 1, 0, 4
+; CHECK-NEXT:blr
+entry:
+  %0 = call float (float, float, float, ...) @llvm.ppc.minfs(float %a, float %b, float %c, float %d)
+  ret float %0
+}
Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
===
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -10574,6 +10574,34 @@
 dl, SDValue());
 return Result.first;
   }
+  case Intrinsic::ppc_maxfe:
+  case Intrinsic::ppc_maxfl:
+  case Intrinsic::ppc_maxfs:
+  case Intrinsic::ppc_minfe:
+  case Intrinsic::ppc_minfl:
+  case Intrinsic::ppc_minfs: {
+for (unsigned i = 4, e = Op.getNumOperands(); i < e; ++i) {
+  if (Op.getOperand(i).getValueType() != Op.getValueType())
+report_fatal_error("Intrinsic::ppc_[max|min]f[e|l|s] must have uniform "
+   "type arguments");
+}
+ISD::CondCode CC = ISD::SETGT;
+if (IntrinsicID == Intrinsic::ppc_minfe ||
+IntrinsicID == Intrinsic::ppc_minfl ||
+IntrinsicID == Intrinsic::ppc_minfs)
+  CC = ISD::SETLT;
+// Below selection order follows XLC behavior: start from the last but one
+// operand, move towards the first operand, end with the last operand.
+unsigned I, Cnt;
+I = Cnt = Op.getNumOperands() -

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-28 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 418544.
tingwang marked 11 inline comments as done.
tingwang added a comment.

Update based on Chaofan's suggestions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll

Index: llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
@@ -0,0 +1,257 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -verify-machineinstrs -mtriple=powerpc64le-unknown-linux \
+; RUN:< %s | FileCheck --check-prefixes=CHECK,CHECK-P9 %s
+; RUN: llc -mcpu=pwr8 -verify-machineinstrs -mtriple=powerpc64le-unknown-linux \
+; RUN:< %s | FileCheck --check-prefixes=CHECK,CHECK-P8 %s
+
+declare ppc_fp128 @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_maxfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB0_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB0_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB0_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .LBB0_6: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 8
+; CHECK-NEXT:bc 12, 20, .LBB0_8
+; CHECK-NEXT:  # %bb.7: # %entry
+; CHECK-NEXT:fmr 5, 1
+; CHECK-NEXT:  .LBB0_8: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 7
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_10
+; CHECK-NEXT:  # %bb.9: # %entry
+; CHECK-NEXT:fmr 5, 7
+; CHECK-NEXT:  .LBB0_10: # %entry
+; CHECK-NEXT:bc 12, 20, .LBB0_12
+; CHECK-NEXT:  # %bb.11: # %entry
+; CHECK-NEXT:fmr 6, 8
+; CHECK-NEXT:  .LBB0_12: # %entry
+; CHECK-NEXT:fmr 1, 5
+; CHECK-NEXT:fmr 2, 6
+; CHECK-NEXT:blr
+entry:
+  %0 = call ppc_fp128 (ppc_fp128, ppc_fp128, ppc_fp128, ...) @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d)
+  ret ppc_fp128 %0
+}
+
+declare double @llvm.ppc.maxfl(double %a, double %b, double %c, ...)
+define double @test_maxfl(double %a, double %b, double %c, double %d) {
+; CHECK-P9-LABEL: test_maxfl:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-P9-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-P9-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-P9-NEXT:blr
+;
+; CHECK-P8-LABEL: test_maxfl:
+; CHECK-P8:   # %bb.0: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 2
+; CHECK-P8-NEXT:ble 0, .LBB1_4
+; CHECK-P8-NEXT:  # %bb.1: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 1
+; CHECK-P8-NEXT:ble 0, .LBB1_5
+; CHECK-P8-NEXT:  .LBB1_2: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 4
+; CHECK-P8-NEXT:ble 0, .LBB1_6
+; CHECK-P8-NEXT:  .LBB1_3: # %entry
+; CHECK-P8-NEXT:fmr 1, 3
+; CHECK-P8-NEXT:blr
+; CHECK-P8-NEXT:  .LBB1_4: # %entry
+; CHECK-P8-NEXT:fmr 3, 2
+; CHECK-P8-NEXT:xscmpudp 0, 3, 1
+; CHECK-P8-NEXT:bgt 0, .LBB1_2
+; CHECK-P8-NEXT:  .LBB1_5: # %entry
+; CHECK-P8-NEXT:fmr 3, 1
+; CHECK-P8-NEXT:xscmpudp 0, 3, 4
+; CHECK-P8-NEXT:bgt 0, .LBB1_3
+; CHECK-P8-NEXT:  .LBB1_6: # %entry
+; CHECK-P8-NEXT:fmr 3, 4
+; CHECK-P8-NEXT:fmr 1, 3
+; CHECK-P8-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.maxfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.maxfs(float %a, float %b, float %c, ...)
+define float @test_maxfs(float %a, float %b, float %c, float %d) {
+; CHECK-P9-LABEL: test_maxfs:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-P9-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-P9-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-P9-NEXT:blr
+;
+; CHECK-P8-LABEL: test_maxfs:
+; CHECK-P8:   # %bb.0: # %entry
+; CHECK-P8-NEXT:fcmpu 0, 3, 2
+; CHECK-P8-NEXT:ble 0, .LBB2_4
+; CHECK-P8-NEXT:  # %bb.1: # %entry
+; CHECK-P8-NEXT:fcmpu 0, 3, 1
+; CHECK-P8-NEXT:ble 0, .LBB2_5
+; CHECK-P8-NEXT:  .LBB2_2: # %entry
+; CHECK-P8-NEXT:fcmpu 

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-03-28 Thread Ting Wang via Phabricator via cfe-commits
tingwang added inline comments.



Comment at: llvm/include/llvm/IR/IntrinsicsPowerPC.td:192
+  [llvm_float_ty, llvm_float_ty, llvm_float_ty, llvm_vararg_ty],
+  [IntrNoMem]>;
 }

qiucf wrote:
> Will we support `llvm_f128_ty`?
I'm afraid not at the moment. The documentation mentions only three types: 
float, double, and long double.



Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:10598-10602
+for (--I; Cnt != 0; --Cnt, I = (--I == 0 ? (Op.getNumOperands() - 1) : I)) {
+  Res = LowerSELECT_CC(
+  DAG.getSelectCC(dl, Res, Op.getOperand(I), Res, Op.getOperand(I), CC),
+  DAG);
+}

qiucf wrote:
> I don't think we need to manually call `LowerSELECT_CC` here. SelectionDAG 
> knows `ppc_fp128` should not be custom lowered.
> 
> This also makes the case pass. Thus D122462 is not needed.
Thank you for pointing this out. I verified that LowerSELECT_CC is not required 
here. It was added due to a misconception.
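
For readers following the loop quoted above, here is a minimal C++ sketch of
the selection order described in the patch comment ("start from the last but
one operand, move towards the first operand, end with the last operand"). It
works on a plain vector of arguments rather than SDValue operands, and it
assumes the starting index implied by that comment, since the initialization
is cut off in the diff as quoted in this thread. The names visit_order and
max_in_xl_order are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the 0-based argument indices in the order they are visited:
// second-to-last first, then down to the first, and the last one at the end.
std::vector<std::size_t> visit_order(std::size_t n) {
  assert(n >= 1 && "need at least one argument");
  std::vector<std::size_t> order;
  // i starts at n - 1 and is decremented before use, so the first index
  // pushed is n - 2; the loop stops after index 0 has been pushed.
  for (std::size_t i = n - 1; i-- > 0;)
    order.push_back(i);
  order.push_back(n - 1);  // the last argument is handled at the very end
  return order;
}

// Folds the arguments in that order with a "greater than" select.
double max_in_xl_order(const std::vector<double> &args) {
  const std::vector<std::size_t> order = visit_order(args.size());
  double res = args[order[0]];
  for (std::size_t k = 1; k < order.size(); ++k) {
    const double x = args[order[k]];
    res = res > x ? res : x;  // keep res only when it compares greater
  }
  return res;
}
```

For four arguments (a, b, c, d) the visiting order is c, b, a, d, which lines
up with the three xsmaxcdp instructions in the pwr9 test output for
test_maxfl (registers 3 and 2 first, then 1, then 4).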



Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:11266
+case Intrinsic::ppc_maxfe:
+case Intrinsic::ppc_minfe:
 case Intrinsic::ppc_fnmsub:

qiucf wrote:
> Why only two `fe`?
This is only for ppcf128 during type legalization: 
DAGTypeLegalizer::ExpandFloatResult -> CustomLowerNode -> 
PPCTargetLowering::ReplaceNodeResults. The other cases do not seem to reach 
this point. I will double-check the code path to verify.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478


[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-04-01 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 419704.
tingwang added a comment.

Update based on comments:
(1) Reuse diag error message.
(2) Update clang test case for those diag messages.
(3) Add TODO comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll

Index: llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
===
--- /dev/null
+++ llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll
@@ -0,0 +1,257 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -verify-machineinstrs -mtriple=powerpc64le-unknown-linux \
+; RUN:   < %s | FileCheck --check-prefixes=CHECK,CHECK-P9 %s
+; RUN: llc -mcpu=pwr8 -verify-machineinstrs -mtriple=powerpc64le-unknown-linux \
+; RUN:   < %s | FileCheck --check-prefixes=CHECK,CHECK-P8 %s
+
+declare ppc_fp128 @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ...)
+define ppc_fp128 @test_maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d) {
+; CHECK-LABEL: test_maxfe:
+; CHECK:   # %bb.0: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 4
+; CHECK-NEXT:fcmpu 1, 5, 3
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_2
+; CHECK-NEXT:  # %bb.1: # %entry
+; CHECK-NEXT:fmr 6, 4
+; CHECK-NEXT:  .LBB0_2: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 2
+; CHECK-NEXT:bc 12, 20, .LBB0_4
+; CHECK-NEXT:  # %bb.3: # %entry
+; CHECK-NEXT:fmr 5, 3
+; CHECK-NEXT:  .LBB0_4: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 1
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_6
+; CHECK-NEXT:  # %bb.5: # %entry
+; CHECK-NEXT:fmr 6, 2
+; CHECK-NEXT:  .LBB0_6: # %entry
+; CHECK-NEXT:fcmpu 0, 6, 8
+; CHECK-NEXT:bc 12, 20, .LBB0_8
+; CHECK-NEXT:  # %bb.7: # %entry
+; CHECK-NEXT:fmr 5, 1
+; CHECK-NEXT:  .LBB0_8: # %entry
+; CHECK-NEXT:fcmpu 1, 5, 7
+; CHECK-NEXT:crand 20, 6, 1
+; CHECK-NEXT:cror 20, 5, 20
+; CHECK-NEXT:bc 12, 20, .LBB0_10
+; CHECK-NEXT:  # %bb.9: # %entry
+; CHECK-NEXT:fmr 5, 7
+; CHECK-NEXT:  .LBB0_10: # %entry
+; CHECK-NEXT:bc 12, 20, .LBB0_12
+; CHECK-NEXT:  # %bb.11: # %entry
+; CHECK-NEXT:fmr 6, 8
+; CHECK-NEXT:  .LBB0_12: # %entry
+; CHECK-NEXT:fmr 1, 5
+; CHECK-NEXT:fmr 2, 6
+; CHECK-NEXT:blr
+entry:
+  %0 = call ppc_fp128 (ppc_fp128, ppc_fp128, ppc_fp128, ...) @llvm.ppc.maxfe(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c, ppc_fp128 %d)
+  ret ppc_fp128 %0
+}
+
+declare double @llvm.ppc.maxfl(double %a, double %b, double %c, ...)
+define double @test_maxfl(double %a, double %b, double %c, double %d) {
+; CHECK-P9-LABEL: test_maxfl:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-P9-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-P9-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-P9-NEXT:blr
+;
+; CHECK-P8-LABEL: test_maxfl:
+; CHECK-P8:   # %bb.0: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 2
+; CHECK-P8-NEXT:ble 0, .LBB1_4
+; CHECK-P8-NEXT:  # %bb.1: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 1
+; CHECK-P8-NEXT:ble 0, .LBB1_5
+; CHECK-P8-NEXT:  .LBB1_2: # %entry
+; CHECK-P8-NEXT:xscmpudp 0, 3, 4
+; CHECK-P8-NEXT:ble 0, .LBB1_6
+; CHECK-P8-NEXT:  .LBB1_3: # %entry
+; CHECK-P8-NEXT:fmr 1, 3
+; CHECK-P8-NEXT:blr
+; CHECK-P8-NEXT:  .LBB1_4: # %entry
+; CHECK-P8-NEXT:fmr 3, 2
+; CHECK-P8-NEXT:xscmpudp 0, 3, 1
+; CHECK-P8-NEXT:bgt 0, .LBB1_2
+; CHECK-P8-NEXT:  .LBB1_5: # %entry
+; CHECK-P8-NEXT:fmr 3, 1
+; CHECK-P8-NEXT:xscmpudp 0, 3, 4
+; CHECK-P8-NEXT:bgt 0, .LBB1_3
+; CHECK-P8-NEXT:  .LBB1_6: # %entry
+; CHECK-P8-NEXT:fmr 3, 4
+; CHECK-P8-NEXT:fmr 1, 3
+; CHECK-P8-NEXT:blr
+entry:
+  %0 = call double (double, double, double, ...) @llvm.ppc.maxfl(double %a, double %b, double %c, double %d)
+  ret double %0
+}
+
+declare float @llvm.ppc.maxfs(float %a, float %b, float %c, ...)
+define float @test_maxfs(float %a, float %b, float %c, float %d) {
+; CHECK-P9-LABEL: test_maxfs:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:xsmaxcdp 0, 3, 2
+; CHECK-P9-NEXT:xsmaxcdp 0, 0, 1
+; CHECK-P9-NEXT:xsmaxcdp 1, 0, 4
+; CHECK-P9-NEXT:blr
+;
+; CHECK-P8-LABEL: test_maxfs:
+; CHECK-P8:   # %bb.0: # %entry
+; CHECK-P8-NEXT:fcmpu 0, 3, 2
+; CHECK-P8-NEXT:ble 0, .LBB2_4
+; CHECK-P8-NEXT:  # %bb.1: # %entry
+; CHECK-P8-NEXT:fcmpu 0, 3, 1
+; CHECK-P8-NEXT:ble 0, .LBB2_5
+; CHECK-P8-NEXT:  .LBB2_2: # %entry
+; CHECK-P8-NEXT:fcmpu 0, 3

[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-04-01 Thread Ting Wang via Phabricator via cfe-commits
tingwang marked an inline comment as done.
tingwang added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:9897
+def err_ppc_unsupported_argument_type : Error<
+  "unsupported argument type %0 for target %1">;
 def err_x86_builtin_invalid_rounding : Error<

qiucf wrote:
> Use `err_target_unsupported_type`?
Thanks again for pointing it out!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478



[PATCH] D122478: [PowerPC] Add max/min intrinsics to Clang and PPC backend

2022-04-05 Thread Ting Wang via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb389354b2857: [Clang][PowerPC] Add max/min intrinsics to 
Clang and PPC backend (authored by tingwang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122478/new/

https://reviews.llvm.org/D122478

Files:
  clang/include/clang/Basic/BuiltinsPPC.def
  clang/lib/Basic/Targets/PPC.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/PowerPC/builtins-ppc.c
  clang/test/Sema/builtins-ppc.c
  llvm/include/llvm/IR/IntrinsicsPowerPC.td
  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-maxmin.ll


[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-05-06 Thread Ting Wang via Phabricator via cfe-commits
tingwang created this revision.
tingwang added reviewers: jsji, nemanjai, shchenz, hubert.reinterpretcast, 
PowerPC.
tingwang added projects: LLVM, clang.
Herald added a project: All.
tingwang requested review of this revision.
Herald added a subscriber: cfe-commits.

This is the frontend part of the .ref enablement. It works together with 
D122198 and implements the following associations:

(1) Variable to init/term functions (required so that the functions are added 
to the _cdtors array when the variable is included in the linker output).
(2) Dtor function to term function (required to correctly handle 
atexit/unatexit registration and unregistration).
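
As a rough illustration of why these associations matter, the sketch below models a garbage-collecting link step as plain reachability over a reference graph: a symbol survives only if something live refers to it. The type RefGraph and the function live_symbols are hypothetical names for this example. This is a conceptual model under those assumptions, not a description of how the AIX linker is implemented.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each symbol maps to the symbols it keeps alive
// (ordinary references plus explicit .ref-style associations).
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Return every symbol reachable from `roots`. Anything not returned would
// be dropped by a garbage-collecting link step in this model.
inline std::set<std::string> live_symbols(
    const RefGraph &graph, const std::vector<std::string> &roots) {
  std::set<std::string> live;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string sym = worklist.back();
    worklist.pop_back();
    if (!live.insert(sym).second)
      continue;  // already visited
    auto it = graph.find(sym);
    if (it == graph.end())
      continue;  // symbol has no outgoing references
    for (const std::string &target : it->second)
      worklist.push_back(target);
  }
  return live;
}
```

For example, with edges main -> t1 and t1 -> {_GLOBAL__sub_I__, _GLOBAL__D_a} (the two function names are borrowed from the test expectations in this patch), rooting the walk at main keeps both functions. A function referenced only from an unreachable variable is dropped.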


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp


[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-05-09 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 428268.
tingwang added a comment.

Updated the three test cases introduced in this patch to use opaque pointers:
clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp

Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I__, i8* null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__D_a, i8* null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(%"struct.test3::Test3"* @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(%"struct.test4::Test4"* @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{void ()* @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{void ()* @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = linkonce_odr global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
 // CHECK: @_ZGVN5test12t2E = linkonce_odr global i64 0, align 8
+// CHECK: @_ZN5test12t1IiEE = linkonce_odr global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC2:[0-9]+]]
+// CHECK: @_ZN5test21AIvE8instanceE = weak_odr global %"struct.test2::A" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC3:[0-9]+]]
 // CHECK: @_ZGVN5test21AIvE8instanceE = weak_odr global i64 0, align 8
+// CHECK: @_ZN5test21AIiE8instanceE = global %"struct.test2::A.0" zeroinitializer, align {{[0-9]+}}, !associated ![[AS

[PATCH] D125203: [PowerPC] Fix PPCISD::STBRX selection issue on A2

2022-05-09 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 428294.
tingwang added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Updated per Nemanja's comment: added A2 to the frontend isa-v206-instructions 
feature list and updated the test case accordingly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125203/new/

https://reviews.llvm.org/D125203

Files:
  clang/lib/Basic/Targets/PPC.cpp
  clang/test/Driver/ppc-isa-features.cpp
  llvm/lib/Target/PowerPC/PPC.td
  llvm/test/CodeGen/PowerPC/bswap-load-store.ll

Index: llvm/test/CodeGen/PowerPC/bswap-load-store.ll
===
--- llvm/test/CodeGen/PowerPC/bswap-load-store.ll
+++ llvm/test/CodeGen/PowerPC/bswap-load-store.ll
@@ -3,6 +3,7 @@
 ; RUN: llc -ppc-asm-full-reg-names -verify-machineinstrs < %s -mtriple=ppc32-- -mcpu=pwr7  | FileCheck %s --check-prefixes=X32,PWR7_32
 ; RUN: llc -ppc-asm-full-reg-names -verify-machineinstrs < %s -mtriple=powerpc64-- -mcpu=ppc64 | FileCheck %s --check-prefixes=X64
 ; RUN: llc -ppc-asm-full-reg-names -verify-machineinstrs < %s -mtriple=powerpc64-- -mcpu=pwr7  | FileCheck %s --check-prefixes=PWR7_64
+; RUN: llc -ppc-asm-full-reg-names -verify-machineinstrs < %s -mtriple=powerpc64-- -mcpu=a2 | FileCheck %s --check-prefixes=A2_64
 
 
 define void @STWBRX(i32 %i, i8* %ptr, i32 %off) {
@@ -22,6 +23,12 @@
 ; PWR7_64-NEXT:extsw r5, r5
 ; PWR7_64-NEXT:stwbrx r3, r4, r5
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: STWBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:extsw r5, r5
+; A2_64-NEXT:stwbrx r3, r4, r5
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i32 %off
   %tmp1.upgrd.1 = bitcast i8* %tmp1 to i32*
   %tmp13 = tail call i32 @llvm.bswap.i32( i32 %i )
@@ -46,6 +53,12 @@
 ; PWR7_64-NEXT:extsw r4, r4
 ; PWR7_64-NEXT:lwbrx r3, r3, r4
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: LWBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:extsw r4, r4
+; A2_64-NEXT:lwbrx r3, r3, r4
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i32 %off
   %tmp1.upgrd.2 = bitcast i8* %tmp1 to i32*
   %tmp = load i32, i32* %tmp1.upgrd.2
@@ -70,6 +83,12 @@
 ; PWR7_64-NEXT:extsw r5, r5
 ; PWR7_64-NEXT:sthbrx r3, r4, r5
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: STHBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:extsw r5, r5
+; A2_64-NEXT:sthbrx r3, r4, r5
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i32 %off
   %tmp1.upgrd.3 = bitcast i8* %tmp1 to i16*
   %tmp5 = call i16 @llvm.bswap.i16( i16 %s )
@@ -94,6 +113,12 @@
 ; PWR7_64-NEXT:extsw r4, r4
 ; PWR7_64-NEXT:lhbrx r3, r3, r4
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: LHBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:extsw r4, r4
+; A2_64-NEXT:lhbrx r3, r3, r4
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i32 %off
   %tmp1.upgrd.4 = bitcast i8* %tmp1 to i16*
   %tmp = load i16, i16* %tmp1.upgrd.4
@@ -133,6 +158,11 @@
 ; PWR7_64:   # %bb.0:
 ; PWR7_64-NEXT:stdbrx r3, r4, r5
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: STDBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:stdbrx r3, r4, r5
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i64 %off
   %tmp1.upgrd.1 = bitcast i8* %tmp1 to i64*
   %tmp13 = tail call i64 @llvm.bswap.i64( i64 %i )
@@ -163,6 +193,11 @@
 ; PWR7_64:   # %bb.0:
 ; PWR7_64-NEXT:ldbrx r3, r3, r4
 ; PWR7_64-NEXT:blr
+;
+; A2_64-LABEL: LDBRX:
+; A2_64:   # %bb.0:
+; A2_64-NEXT:ldbrx r3, r3, r4
+; A2_64-NEXT:blr
   %tmp1 = getelementptr i8, i8* %ptr, i64 %off
   %tmp1.upgrd.2 = bitcast i8* %tmp1 to i64*
   %tmp = load i64, i64* %tmp1.upgrd.2
Index: llvm/lib/Target/PowerPC/PPC.td
===
--- llvm/lib/Target/PowerPC/PPC.td
+++ llvm/lib/Target/PowerPC/PPC.td
@@ -592,7 +592,8 @@
FeatureSTFIWX, FeatureLFIWAX,
FeatureFPRND, FeatureFPCVT, FeatureISEL,
FeatureSlowPOPCNTD, FeatureCMPB, FeatureLDBRX,
-   Feature64Bit /*, Feature64BitRegs */, FeatureMFTB]>;
+   Feature64Bit /*, Feature64BitRegs */, FeatureMFTB,
+   FeatureISA2_06]>;
 def : ProcessorModel<"pwr3", G5Model,
   [DirectivePwr3, FeatureAltivec,
FeatureFRES, FeatureFRSQRTE, FeatureMFOCRF,
Index: clang/test/Driver/ppc-isa-features.cpp
===
--- clang/test/Driver/ppc-isa-features.cpp
+++ clang/test/Driver/ppc-isa-features.cpp
@@ -1,4 +1,5 @@
 // RUN: %clang -target powerpc64-unknown-unknown -mcpu=pwr6 -S -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-PWR6
+// RUN: %clang -target powerpc64-unknown-unknown -mcpu=a2 -S -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-A2
 // RUN: %clang -target powerpc64-unknown-unknown -mcpu=pwr7 -S -emit-llvm %s -o - | FileCheck %s -check-prefix=CHECK-PWR7
 // RUN: %clang -ta

[PATCH] D125203: [PowerPC] Fix PPCISD::STBRX selection issue on A2

2022-05-09 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

In D125203#3502433, @nemanjai wrote:

> Why not also fix this in the front end so that we allow the builtin on the A2 
> CPU as well (since it's supported)?

Oh, I missed that. Thank you for pointing it out!

I have just updated the patch. However, I did not update the SemaFeatureCheck 
message to mention support on a2: anyone who sees this error message cannot be 
targeting a2, and a2 does not fit easily into the wording of 
err_ppc_builtin_only_on_arch. I hope this does not cause a problem.
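
For context, the STDBRX and LDBRX functions in the bswap-load-store.ll test of this patch have the shape "byte-swap, then store" and "load, then byte-swap". A portable sketch of those two shapes is given below. The function names are hypothetical, and the code only models the source-level pattern, not the instruction selection itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reverse the byte order of a 64-bit value (what llvm.bswap.i64 computes).
inline std::uint64_t byte_swap64(std::uint64_t v) {
  std::uint64_t r = 0;
  for (int i = 0; i < 8; ++i) {
    r = (r << 8) | (v & 0xFF);  // move the lowest byte of v to the end of r
    v >>= 8;
  }
  return r;
}

// Byte-swap, then store at ptr + off: the shape of the STDBRX test function.
inline void store_byte_reversed(std::uint64_t value, unsigned char *ptr,
                                std::size_t off) {
  std::uint64_t swapped = byte_swap64(value);
  std::memcpy(ptr + off, &swapped, sizeof swapped);
}

// Load from ptr + off, then byte-swap: the shape of the LDBRX test function.
inline std::uint64_t load_byte_reversed(const unsigned char *ptr,
                                        std::size_t off) {
  std::uint64_t raw;
  std::memcpy(&raw, ptr + off, sizeof raw);
  return byte_swap64(raw);
}
```

On a target with byte-reversed load/store instructions, a compiler may fold each pair into a single instruction, which is what the A2_64 CHECK lines in the test expect.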


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125203/new/

https://reviews.llvm.org/D125203



[PATCH] D125203: [PowerPC] Fix PPCISD::STBRX selection issue on A2

2022-05-10 Thread Ting Wang via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG289236d597a2: [PowerPC] Fix PPCISD::STBRX selection issue on 
A2 (authored by tingwang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125203/new/

https://reviews.llvm.org/D125203

Files:
  clang/lib/Basic/Targets/PPC.cpp
  clang/test/Driver/ppc-isa-features.cpp
  llvm/lib/Target/PowerPC/PPC.td
  llvm/test/CodeGen/PowerPC/bswap-load-store.ll


[PATCH] D131953: [PowerPC][Coroutines] Add tail-call check with context information for coroutines

2022-08-16 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 453188.
tingwang added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Update according to comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131953/new/

https://reviews.llvm.org/D131953

Files:
  clang/test/CodeGenCoroutines/pr56329.cpp
  llvm/include/llvm/Analysis/TargetTransformInfo.h
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/lib/Analysis/TargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h
  llvm/lib/Transforms/Coroutines/CoroSplit.cpp
  llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll

Index: llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
===
--- /dev/null
+++ llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
@@ -0,0 +1,74 @@
+; Tests that some target (e.g. ppc) can support tail call under condition.
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 | FileCheck %s
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr10 --code-model=medium \
+; RUN: | FileCheck %s --check-prefix=CHECK-PCREL
+
+define void @f() #0 {
+entry:
+  %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+  %alloc = call i8* @malloc(i64 16) #3
+  %vFrame = call noalias nonnull i8* @llvm.coro.begin(token %id, i8* %alloc)
+
+  %save = call token @llvm.coro.save(i8* null)
+  %addr1 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv1 = bitcast i8* %addr1 to void (i8*)*
+  call fastcc void %pv1(i8* null)
+
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+i8 0, label %await.ready
+i8 1, label %exit
+  ]
+await.ready:
+  %save2 = call token @llvm.coro.save(i8* null)
+  %addr2 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv2 = bitcast i8* %addr2 to void (i8*)*
+  call fastcc void %pv2(i8* null)
+
+  %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
+  switch i8 %suspend2, label %exit [
+i8 0, label %exit
+i8 1, label %exit
+  ]
+exit:
+  call i1 @llvm.coro.end(i8* null, i1 false)
+  ret void
+}
+
+; Verify that in the initial function resume is not marked with musttail.
+; CHECK-LABEL: @f(
+; CHECK: %[[addr1:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv1:.+]] = bitcast i8* %[[addr1]] to void (i8*)*
+; CHECK-NOT: musttail call fastcc void %[[pv1]](i8* null)
+
+; Verify that ppc target not using PC-Relative addressing in the resume part resume call is not marked with musttail.
+; CHECK-LABEL: @f.resume(
+; CHECK: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-NEXT: call fastcc void %[[pv2]](i8* null)
+
+; Verify that ppc target using PC-Relative addressing in the resume part resume call is marked with musttail.
+; CHECK-PCREL-LABEL: @f.resume(
+; CHECK-PCREL: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-PCREL-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-PCREL-NEXT: musttail call fastcc void %[[pv2]](i8* null)
+; CHECK-PCREL-NEXT: ret void
+
+declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*) #1
+declare i1 @llvm.coro.alloc(token) #2
+declare i64 @llvm.coro.size.i64() #3
+declare i8* @llvm.coro.begin(token, i8* writeonly) #2
+declare token @llvm.coro.save(i8*) #2
+declare i8* @llvm.coro.frame() #3
+declare i8 @llvm.coro.suspend(token, i1) #2
+declare i8* @llvm.coro.free(token, i8* nocapture readonly) #1
+declare i1 @llvm.coro.end(i8*, i1) #2
+declare i8* @llvm.coro.subfn.addr(i8* nocapture readonly, i8) #1
+declare i8* @malloc(i64)
+
+attributes #0 = { presplitcoroutine }
+attributes #1 = { argmemonly nounwind readonly }
+attributes #2 = { nounwind }
+attributes #3 = { nounwind readnone }
Index: llvm/lib/Transforms/Coroutines/CoroSplit.cpp
===
--- llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -1362,7 +1362,7 @@
 // for symmetrical coroutine control transfer (C++ Coroutines TS extension).
 // This transformation is done only in the resume part of the coroutine that has
 // identical signature and calling convention as the coro.resume call.
-static void addMustTailToCoroResumes(Function &F) {
+static void addMustTailToCoroResumes(Function &F, TargetTransformInfo &TTI) {
   bool changed = false;
 
   // Collect potential resume instructions.
@@ -1374,7 +1374,9 @@
 
   // Set musttail on those that are followed by a ret instruction.
   for (CallInst *Call : Resumes)
-if (simplifyTerminatorLeadingToRet(Call->getNextNode())) {
+// Skip t

[PATCH] D131953: [PowerPC][Coroutines] Add tail-call check with context information for coroutines

2022-08-16 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 453193.
tingwang added a comment.

The default implementation of `supportsTailCallFor` now returns
`supportsTailCalls()`, as suggested in the review comments.
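
As a rough illustration of the fallback described above, the sketch below uses
simplified stand-in types rather than the real LLVM classes. `CallSite`,
`BaseTargetInfo` and `PickyTargetInfo` are hypothetical names, and the rule in
the override (indirect calls are tail-callable only with PC-relative
addressing) is an assumption chosen to echo the test expectations, not the
actual PowerPC logic.

```cpp
#include <cassert>

// Hypothetical stand-in for "a call with context"; not an LLVM type.
struct CallSite {
  bool IsIndirect;
};

// Hypothetical stand-in for a target-info interface.
struct BaseTargetInfo {
  virtual ~BaseTargetInfo() = default;

  // Context-free answer: can this target do tail calls at all?
  virtual bool supportsTailCalls() const { return true; }

  // Context-aware answer. By default it falls back to the context-free
  // one, so targets that do not override it behave as before.
  virtual bool supportsTailCallFor(const CallSite &) const {
    return supportsTailCalls();
  }
};

// A target that needs to inspect the call before answering.
struct PickyTargetInfo : BaseTargetInfo {
  bool UsesPCRel;
  explicit PickyTargetInfo(bool UsesPCRel) : UsesPCRel(UsesPCRel) {}

  bool supportsTailCallFor(const CallSite &CS) const override {
    if (!supportsTailCalls())
      return false;
    // Assumed rule, for illustration only.
    return !CS.IsIndirect || UsesPCRel;
  }
};
```

A caller would query `supportsTailCallFor` for each candidate call and skip
the ones the target rejects, while targets that never override the hook keep
their existing behaviour.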


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131953/new/

https://reviews.llvm.org/D131953

Files:
  clang/test/CodeGenCoroutines/pr56329.cpp
  llvm/include/llvm/Analysis/TargetTransformInfo.h
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/lib/Analysis/TargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h
  llvm/lib/Transforms/Coroutines/CoroSplit.cpp
  llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll

Index: llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
===
--- /dev/null
+++ llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
@@ -0,0 +1,74 @@
+; Tests that some target (e.g. ppc) can support tail call under condition.
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 | FileCheck %s
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr10 --code-model=medium \
+; RUN: | FileCheck %s --check-prefix=CHECK-PCREL
+
+define void @f() #0 {
+entry:
+  %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+  %alloc = call i8* @malloc(i64 16) #3
+  %vFrame = call noalias nonnull i8* @llvm.coro.begin(token %id, i8* %alloc)
+
+  %save = call token @llvm.coro.save(i8* null)
+  %addr1 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv1 = bitcast i8* %addr1 to void (i8*)*
+  call fastcc void %pv1(i8* null)
+
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+i8 0, label %await.ready
+i8 1, label %exit
+  ]
+await.ready:
+  %save2 = call token @llvm.coro.save(i8* null)
+  %addr2 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv2 = bitcast i8* %addr2 to void (i8*)*
+  call fastcc void %pv2(i8* null)
+
+  %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
+  switch i8 %suspend2, label %exit [
+i8 0, label %exit
+i8 1, label %exit
+  ]
+exit:
+  call i1 @llvm.coro.end(i8* null, i1 false)
+  ret void
+}
+
+; Verify that in the initial function resume is not marked with musttail.
+; CHECK-LABEL: @f(
+; CHECK: %[[addr1:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv1:.+]] = bitcast i8* %[[addr1]] to void (i8*)*
+; CHECK-NOT: musttail call fastcc void %[[pv1]](i8* null)
+
+; Verify that ppc target not using PC-Relative addressing in the resume part resume call is not marked with musttail.
+; CHECK-LABEL: @f.resume(
+; CHECK: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-NEXT: call fastcc void %[[pv2]](i8* null)
+
+; Verify that ppc target using PC-Relative addressing in the resume part resume call is marked with musttail.
+; CHECK-PCREL-LABEL: @f.resume(
+; CHECK-PCREL: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-PCREL-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-PCREL-NEXT: musttail call fastcc void %[[pv2]](i8* null)
+; CHECK-PCREL-NEXT: ret void
+
+declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*) #1
+declare i1 @llvm.coro.alloc(token) #2
+declare i64 @llvm.coro.size.i64() #3
+declare i8* @llvm.coro.begin(token, i8* writeonly) #2
+declare token @llvm.coro.save(i8*) #2
+declare i8* @llvm.coro.frame() #3
+declare i8 @llvm.coro.suspend(token, i1) #2
+declare i8* @llvm.coro.free(token, i8* nocapture readonly) #1
+declare i1 @llvm.coro.end(i8*, i1) #2
+declare i8* @llvm.coro.subfn.addr(i8* nocapture readonly, i8) #1
+declare i8* @malloc(i64)
+
+attributes #0 = { presplitcoroutine }
+attributes #1 = { argmemonly nounwind readonly }
+attributes #2 = { nounwind }
+attributes #3 = { nounwind readnone }
Index: llvm/lib/Transforms/Coroutines/CoroSplit.cpp
===
--- llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -1362,7 +1362,7 @@
 // for symmetrical coroutine control transfer (C++ Coroutines TS extension).
 // This transformation is done only in the resume part of the coroutine that has
 // identical signature and calling convention as the coro.resume call.
-static void addMustTailToCoroResumes(Function &F) {
+static void addMustTailToCoroResumes(Function &F, TargetTransformInfo &TTI) {
   bool changed = false;
 
   // Collect potential resume instructions.
@@ -1374,7 +1374,9 @@
 
   // Set musttail on those that are followed by a ret instruction.
   for (CallInst *Call : Resumes)
-if (simplifyTerminatorLeadingToRet(Call->getNextNode())) {
+/

[PATCH] D131953: [PowerPC][Coroutines] Add tail-call check with context information for coroutines

2022-08-16 Thread Ting Wang via Phabricator via cfe-commits
tingwang marked an inline comment as done.
tingwang added a comment.

In D131953#3727900, @ChuanqiXu wrote:

> LGTM with comment addressed. Thanks!

Thank you! I'm looking forward to comments from ppc.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131953/new/

https://reviews.llvm.org/D131953

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D131953: [PowerPC][Coroutines] Add tail-call check with context information for coroutines

2022-08-21 Thread Ting Wang via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd2d77e050b32: [PowerPC][Coroutines] Add tail-call check with 
call information for coroutines (authored by tingwang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131953/new/

https://reviews.llvm.org/D131953

Files:
  clang/test/CodeGenCoroutines/pr56329.cpp
  llvm/include/llvm/Analysis/TargetTransformInfo.h
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/lib/Analysis/TargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
  llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h
  llvm/lib/Transforms/Coroutines/CoroSplit.cpp
  llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll

Index: llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
===
--- /dev/null
+++ llvm/test/Transforms/Coroutines/coro-split-musttail-ppc64le.ll
@@ -0,0 +1,74 @@
+; Tests that some target (e.g. ppc) can support tail call under condition.
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 | FileCheck %s
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S \
+; RUN: -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr10 --code-model=medium \
+; RUN: | FileCheck %s --check-prefix=CHECK-PCREL
+
+define void @f() #0 {
+entry:
+  %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+  %alloc = call i8* @malloc(i64 16) #3
+  %vFrame = call noalias nonnull i8* @llvm.coro.begin(token %id, i8* %alloc)
+
+  %save = call token @llvm.coro.save(i8* null)
+  %addr1 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv1 = bitcast i8* %addr1 to void (i8*)*
+  call fastcc void %pv1(i8* null)
+
+  %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+  switch i8 %suspend, label %exit [
+i8 0, label %await.ready
+i8 1, label %exit
+  ]
+await.ready:
+  %save2 = call token @llvm.coro.save(i8* null)
+  %addr2 = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+  %pv2 = bitcast i8* %addr2 to void (i8*)*
+  call fastcc void %pv2(i8* null)
+
+  %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
+  switch i8 %suspend2, label %exit [
+i8 0, label %exit
+i8 1, label %exit
+  ]
+exit:
+  call i1 @llvm.coro.end(i8* null, i1 false)
+  ret void
+}
+
+; Verify that in the initial function resume is not marked with musttail.
+; CHECK-LABEL: @f(
+; CHECK: %[[addr1:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv1:.+]] = bitcast i8* %[[addr1]] to void (i8*)*
+; CHECK-NOT: musttail call fastcc void %[[pv1]](i8* null)
+
+; Verify that ppc target not using PC-Relative addressing in the resume part resume call is not marked with musttail.
+; CHECK-LABEL: @f.resume(
+; CHECK: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-NEXT: call fastcc void %[[pv2]](i8* null)
+
+; Verify that ppc target using PC-Relative addressing in the resume part resume call is marked with musttail.
+; CHECK-PCREL-LABEL: @f.resume(
+; CHECK-PCREL: %[[addr2:.+]] = call i8* @llvm.coro.subfn.addr(i8* null, i8 0)
+; CHECK-PCREL-NEXT: %[[pv2:.+]] = bitcast i8* %[[addr2]] to void (i8*)*
+; CHECK-PCREL-NEXT: musttail call fastcc void %[[pv2]](i8* null)
+; CHECK-PCREL-NEXT: ret void
+
+declare token @llvm.coro.id(i32, i8* readnone, i8* nocapture readonly, i8*) #1
+declare i1 @llvm.coro.alloc(token) #2
+declare i64 @llvm.coro.size.i64() #3
+declare i8* @llvm.coro.begin(token, i8* writeonly) #2
+declare token @llvm.coro.save(i8*) #2
+declare i8* @llvm.coro.frame() #3
+declare i8 @llvm.coro.suspend(token, i1) #2
+declare i8* @llvm.coro.free(token, i8* nocapture readonly) #1
+declare i1 @llvm.coro.end(i8*, i1) #2
+declare i8* @llvm.coro.subfn.addr(i8* nocapture readonly, i8) #1
+declare i8* @malloc(i64)
+
+attributes #0 = { presplitcoroutine }
+attributes #1 = { argmemonly nounwind readonly }
+attributes #2 = { nounwind }
+attributes #3 = { nounwind readnone }
Index: llvm/lib/Transforms/Coroutines/CoroSplit.cpp
===
--- llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -1362,7 +1362,7 @@
 // for symmetrical coroutine control transfer (C++ Coroutines TS extension).
 // This transformation is done only in the resume part of the coroutine that has
 // identical signature and calling convention as the coro.resume call.
-static void addMustTailToCoroResumes(Function &F) {
+static void addMustTailToCoroResumes(Function &F, TargetTransformInfo &TTI) {
   bool changed = false;
 
   // Collect potential resume instructions.
@@ -1374,7 +1374,9 @@
 
   // Set musttail on those that are followed by a ret instruction.
   for (CallInst *Call : Resumes)
-if (simplifyTerminatorLeadingToRet(Cal

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-22 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 462113.
tingwang added a comment.

Add TODO comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -129,15 +129,15 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D133338
 // CHECK: define{{.*}} void @test8va(%struct.test8* noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load i8*, i8** %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, i8* %[[CUR]], i64 8
 // CHECK: store i8* %[[NEXT]], i8** %ap
-// CHECK: [[T0:%.*]] = bitcast i8* %[[CUR]] to %struct.test8*
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, i8* %[[CUR]], i64 7
+// CHECK: [[T1:%.*]] = bitcast i8* [[T0]] to %struct.test8*
 // CHECK: [[DEST:%.*]] = bitcast %struct.test8* %[[AGG_RESULT]] to i8*
-// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T0]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 8 [[SRC]], i64 1, i1 false)
+// CHECK: [[SRC:%.*]] = bitcast %struct.test8* [[T1]] to i8*
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 [[DEST]], i8* align 1 [[SRC]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,19 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
+///   TODO: this is workaround. Should use same logic for caller and callee
+///   to deduce the adjustment, and get rid of this flag.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
   Address VAListAddr,
   llvm::Type *DirectTy,
   CharUnits DirectSize,
   CharUnits DirectAlign,
   CharUnits SlotSize,
-  bool AllowHigherAlign) {
+  bool AllowHigherAlign,
+  bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +360,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-  !DirectTy->isStructTy()) {
+  (!DirectTy->isStructTy() || ForceRightAdjust)) {
 Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +381,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType ValueTy, bool IsIndirect,
 TypeInfoChars ValueInfo,
 CharUnits SlotSizeAndAlign,
-bool AllowHigherAlign) {
+bool AllowHigherAlign,
+bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +405,9 @@
   if (IsIndirect)
 DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-  emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
- SlotSizeAndAlign, AllowHigherAlign);
+  Address Addr = emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize,
+DirectAlign, SlotSizeAndAlign,
+AllowHigherAlign, ForceRightAdjust);
 
   if (Is

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-09-22 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

In D133338#3786406, @nemanjai wrote:

> I am not crazy about adding the Boolean parameter here or about the name. 
> Seems somewhat unclear when a caller wants to pass `true` there.
>
> What I think would be a more robust solution would be to use the same logic 
> that decides whether to coerce the struct argument to an integer type. It 
> seems that any big endian ABI that does this would want to ensure the access 
> is on the right side.
>
> Ultimately what I am getting at here is that we consider how the caller 
> passes the value and how the callee accesses it separately - which is what 
> leads to problems like this. Can we decide using the same function for the 
> caller and the callee?

Unifying the caller and callee logic is nontrivial for me right now, so I have
added TODO comments instead. Can we accept this patch as a workaround for
issue #55900?
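
For reference, the address adjustment being discussed can be sketched as a
standalone function. This is a minimal model of the condition in the
`emitVoidPtrDirectVAArg` hunk, assuming all sizes are in bytes;
`slotByteOffset` and its parameters are hypothetical names, not part of clang.

```cpp
#include <cassert>
#include <cstdint>

// Returns the byte offset of an argument inside its va_arg slot.
// Models the patched condition: on a big-endian target, an argument smaller
// than its slot is right-adjusted unless it is a struct, and the flag forces
// the adjustment for structs as well.
constexpr std::uint64_t slotByteOffset(std::uint64_t DirectSize,
                                       std::uint64_t SlotSize,
                                       bool IsBigEndian, bool IsStruct,
                                       bool ForceRightAdjust) {
  if (DirectSize < SlotSize && IsBigEndian &&
      (!IsStruct || ForceRightAdjust))
    return SlotSize - DirectSize;
  return 0;
}
```

With a 1-byte struct in an 8-byte slot on a big-endian target, the sketch
yields 7 only when the flag is set, which corresponds to the `i64 7` GEP in
the updated CHECK lines.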


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338



[PATCH] D133488: [clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot

2022-10-10 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 466708.
tingwang added a comment.

Rebase on the opaque-pointers test case changes. Gentle ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133488/new/

https://reviews.llvm.org/D133488

Files:
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -9,6 +9,7 @@
 struct test5 { int x[17]; };
 struct test6 { int x[17]; } __attribute__((aligned (16)));
 struct test7 { int x[17]; } __attribute__((aligned (32)));
+struct test8 { char x; };
 
 // CHECK: define{{.*}} void @test1(i32 noundef signext %x, i64 %y.coerce)
 void test1 (int x, struct test1 y)
@@ -116,6 +117,22 @@
   return y;
 }
 
+// Error pattern will be fixed in https://reviews.llvm.org/D133338
+// CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
+// CHECK: store ptr %[[NEXT]], ptr %ap
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+struct test8 test8va (int x, ...)
+{
+  struct test8 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test8);
+  va_end(ap);
+  return y;
+}
+
 // CHECK: define{{.*}} void @testva_longdouble(ptr noalias sret(%struct.test_longdouble) align 16 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 16




[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-10 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 466710.
tingwang added a comment.

Rebase && Gentle ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -117,12 +117,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D133338
 // CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 7
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 1 [[T0]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,19 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
+///   TODO: this is workaround. Should use same logic for caller and callee
+///   to deduce the adjustment, and get rid of this flag.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
   Address VAListAddr,
   llvm::Type *DirectTy,
   CharUnits DirectSize,
   CharUnits DirectAlign,
   CharUnits SlotSize,
-  bool AllowHigherAlign) {
+  bool AllowHigherAlign,
+  bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +360,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-  !DirectTy->isStructTy()) {
+  (!DirectTy->isStructTy() || ForceRightAdjust)) {
 Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +381,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType ValueTy, bool IsIndirect,
 TypeInfoChars ValueInfo,
 CharUnits SlotSizeAndAlign,
-bool AllowHigherAlign) {
+bool AllowHigherAlign,
+bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +405,9 @@
   if (IsIndirect)
 DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-  emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
- SlotSizeAndAlign, AllowHigherAlign);
+  Address Addr = emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize,
+DirectAlign, SlotSizeAndAlign,
+AllowHigherAlign, ForceRightAdjust);
 
   if (IsIndirect) {
 Addr = Address(CGF.Builder.CreateLoad(Addr), ElementTy, ValueInfo.Align);
@@ -5451,8 +5461,9 @@
   }
 
   // Otherwise, just use the general rule.
-  return emitVoidPtrVAArg(CGF, VAListAddr, Ty, /*Indirect*/ false,
-  TypeInfo, SlotSize, /*AllowHigher*/ true);
+  return emitVoidPtr

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-11 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

In D133338#3850091, @rjmccall wrote:

> This seems to be a genuine target difference, right?  PPC64 has this ABI rule:
>
>> An aggregate or union smaller than one doubleword in size is padded so that 
>> it appears in the least significant bits of the doubleword.
>
> which overrides the standard rule for passing aggregates:
>
>> Fixed size aggregates and unions passed by value are mapped to as many 
>> doublewords of the parameter save area as the value uses in memory. 
>> Aggregates and unions are aligned according to their alignment 
>> requirements. This may result in doublewords being skipped for alignment.
>
> Other big-endian targets don't do this to non-fundamental types as far as I 
> can tell.  So I don't really get the TODO in the comment; this does seem to 
> be something that ABIs need to pass in.

Thank you. I now realize the TODO applies to the PPC64-specific case, not to
the API. I will move it there.
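
To make the quoted ABI rule concrete, here is a small model of what happens
when a doubleword register image is stored to its 8-byte parameter slot. It is
only a sketch: `storeBigEndian` and `storeLittleEndian` are hypothetical
helpers that emulate byte order in portable C++, not code from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Slot = std::array<std::uint8_t, 8>;

// Store a 64-bit register image with the most significant byte first.
Slot storeBigEndian(std::uint64_t Reg) {
  Slot S{};
  for (int I = 0; I < 8; ++I)
    S[I] = static_cast<std::uint8_t>(Reg >> (8 * (7 - I)));
  return S;
}

// Store a 64-bit register image with the least significant byte first.
Slot storeLittleEndian(std::uint64_t Reg) {
  Slot S{};
  for (int I = 0; I < 8; ++I)
    S[I] = static_cast<std::uint8_t>(Reg >> (8 * I));
  return S;
}
```

A value held in the least significant bits, such as a one-byte struct, lands
at byte 7 of the slot in big-endian order and at byte 0 in little-endian
order, so a big-endian callee has to read from the end of the slot.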


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338



[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-11 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 466992.
tingwang added a comment.

Address comment: move the TODO next to the PPC64-specific case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133338/new/

https://reviews.llvm.org/D133338

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -117,12 +117,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D133338
 // CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 7
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 1 [[T0]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,17 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
   Address VAListAddr,
   llvm::Type *DirectTy,
   CharUnits DirectSize,
   CharUnits DirectAlign,
   CharUnits SlotSize,
-  bool AllowHigherAlign) {
+  bool AllowHigherAlign,
+  bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +358,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-  !DirectTy->isStructTy()) {
+  (!DirectTy->isStructTy() || ForceRightAdjust)) {
 Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +379,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType ValueTy, bool IsIndirect,
 TypeInfoChars ValueInfo,
 CharUnits SlotSizeAndAlign,
-bool AllowHigherAlign) {
+bool AllowHigherAlign,
+bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +403,9 @@
   if (IsIndirect)
 DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-  emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
- SlotSizeAndAlign, AllowHigherAlign);
+  Address Addr = emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize,
+DirectAlign, SlotSizeAndAlign,
+AllowHigherAlign, ForceRightAdjust);
 
   if (IsIndirect) {
 Addr = Address(CGF.Builder.CreateLoad(Addr), ElementTy, ValueInfo.Align);
@@ -5451,8 +5459,11 @@
   }
 
   // Otherwise, just use the general rule.
-  return emitVoidPtrVAArg(CGF, VAListAddr, Ty, /*Indirect*/ false,
-  TypeInfo, SlotSize, /*AllowHigher*/ true);
+  // TODO: a better approach may refer to SystemZABI use same logic for caller
+  // and callee to deduce the adjustment,

[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang added inline comments.



Comment at: clang/lib/CodeGen/TargetInfo.cpp:327
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,

rjmccall wrote:
> Argh, Phabricator dropped one of my comments, and it's the one that explains 
> why I CC'ed Tim Northover.
> 
> I'm a little worried about the existing uses of this function because this 
> function is sensitive to the type produced by `ConvertTypeForMem`.  
> `ConvertTypeForMem` *mostly* only generates IR struct types for C structs and 
> unions, but there are a few places where it generates an IR struct for some 
> fundamental type that stores multiple values.  Most of those types are at 
> least as large as an argument slot (e.g. they contain pointers), unless 
> there's some weird target with huge slots.  However, some of them are not; I 
> think the most important example is `_Complex T`, which of course gets 
> translated into a struct containing two `T`s.  So if `T` is smaller than half 
> an argument slot, we're not going to right-align `_Complex T` on big-endian 
> targets other than PPC64, and I don't know if that's right.
> 
> That would affect `_Complex _Float16` on 64-bit targets; on 32-bit targets, I 
> think you'd need something obscure like `_Complex char` to exercise it.
> 
> Now, if Clang generates arguments for one of these types using a single value 
> that's also of IR struct type, and the backend considers that when deciding 
> whether to right-align arguments, then maybe those two decisions cancel out 
> and we've at least got call/va_arg compatibility, even if it's not 
> necessarily what's formally specified by the appropriate psABI.  But 
> `DirectTy` is definitely not necessarily the type that call-argument lowering 
> will use, so I'm a little worried.
Thank you!

I checked the `_Complex char` case on PPC64. When the complex element size is 
smaller than the argument slot, the argument is handled by 
`complexTempStructure()` 
(https://github.com/llvm/llvm-project/blob/51d33afcbe0a81bb8508d5685f38dc9fdb2b60c9/clang/lib/CodeGen/TargetInfo.cpp#L5451),
and that logic takes care of the right-adjustment. Both AIX and PPC64 use 
`complexTempStructure()` to read such arguments in a variadic callee.

When `_Complex char` is wrapped inside a structure, the whole is treated as an 
aggregate, and that case is addressed by this fix. I will add a test case to 
illustrate.

I hope this addresses your concern.
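
For readers following the thread, the condition changed by this patch can be 
modelled in isolation. This is only a sketch: `VAArgInfo` and `slotOffset` are 
hypothetical names that do not exist in clang, and plain integers stand in for 
`CharUnits`. It mirrors the check 
`DirectSize < SlotSize && isBigEndian() && (!isStructTy() || ForceRightAdjust)` 
from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the values emitVoidPtrDirectVAArg inspects.
struct VAArgInfo {
  std::uint64_t directSize; // size in bytes of the value read by va_arg
  std::uint64_t slotSize;   // size in bytes of one argument slot
  bool isBigEndian;         // target byte order
  bool isStructTy;          // whether the IR type is a struct
  bool forceRightAdjust;    // the flag added by this patch
};

// Byte offset added to the slot address before the argument is read.
// A non-zero result means the argument is right-adjusted in its slot.
std::uint64_t slotOffset(const VAArgInfo &a) {
  if (a.directSize < a.slotSize && a.isBigEndian &&
      (!a.isStructTy || a.forceRightAdjust))
    return a.slotSize - a.directSize;
  return 0;
}
```

With an 8-byte slot on a big-endian target, a 1-byte struct yields offset 7 and 
a 2-byte struct yields offset 6 once the flag is set, which matches the 
`getelementptr` offsets in the updated `test8va` and `test9va` CHECK lines. 
Without the flag, struct types keep offset 0.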




Comment at: clang/lib/CodeGen/TargetInfo.cpp:5461
 
   // Otherwise, just use the general rule.
+  // TODO: a better approach may refer to SystemZABI use same logic for caller

rjmccall wrote:
> Please add this comment explaining the use of ForceRightAdjust:
> 
> > The PPC64 ABI passes some arguments in integer registers, even to variadic 
> > functions.  To allow `va_list` to use the simple "`void*`" representation, 
> > variadic calls allocate space in the argument area for the integer argument 
> > registers, and variadic functions spill their integer argument registers to 
> > this area in their prologues.  When aggregates smaller than a register are 
> > passed this way, they are passed in the least significant bits of the 
> > register, which means that after spilling on big-endian targets they will 
> > be right-aligned in their argument slot.  This is uncommon; for a variety 
> > of reasons, other big-endian targets don't end up right-aligning aggregate 
> > types this way, and so right-alignment only applies to fundamental types.  
> > So on PPC64, we must force the use of right-alignment even for aggregates.
> 
> I'm not sure what your TODO is hoping for.  You'd like to re-use logic 
> between the frontend's va_arg emission and the backend's variadic argument 
> emission?  That would be very tricky.
Sure, I will add the comment. Thank you.

Maybe I misunderstood some previous comment. I will drop the TODO.
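
The spill behaviour described in the requested comment can be illustrated with 
a small host-independent simulation. This is a sketch under assumptions, not 
code from the patch: `kSlotSize`, `spillBigEndian` and `payloadOffset` are 
made-up names, and the 8-byte slot is the PPC64 doubleword slot size used in 
the tests.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSlotSize = 8; // assumed: one 8-byte slot per register

// Write a 64-bit register image to memory most-significant byte first,
// as a big-endian store would, regardless of the host's byte order.
std::array<std::uint8_t, kSlotSize> spillBigEndian(std::uint64_t reg) {
  std::array<std::uint8_t, kSlotSize> slot{};
  for (std::size_t i = 0; i < kSlotSize; ++i)
    slot[i] = static_cast<std::uint8_t>(reg >> (8 * (kSlotSize - 1 - i)));
  return slot;
}

// Index of the first byte of an aggregate of `size` bytes that was passed
// in the least significant bits of the register.
constexpr std::size_t payloadOffset(std::size_t size) {
  return kSlotSize - size;
}
```

Spilling a register that holds a 1-byte aggregate in its low bits puts that 
byte at index 7 of the slot, and a 2-byte aggregate occupies indices 6 and 7. 
Reading from the start of the slot, as `va_arg` did before this change, would 
pick up the zero padding instead.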


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D18/new/

https://reviews.llvm.org/D18

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133488: [clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot

2022-10-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 467040.
tingwang added a reviewer: rjmccall.
tingwang added a comment.

Add test case according to comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133488/new/

https://reviews.llvm.org/D133488

Files:
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -9,6 +9,8 @@
 struct test5 { int x[17]; };
 struct test6 { int x[17]; } __attribute__((aligned (16)));
 struct test7 { int x[17]; } __attribute__((aligned (32)));
+struct test8 { char x; };
+struct test9 { _Complex char x; };
 
 // CHECK: define{{.*}} void @test1(i32 noundef signext %x, i64 %y.coerce)
 void test1 (int x, struct test1 y)
@@ -48,6 +50,16 @@
 {
 }
 
+// CHECK: define{{.*}} void @test8(i32 noundef signext %x, i8 %y.coerce)
+void test8 (int x, struct test8 y)
+{
+}
+
+// CHECK: define{{.*}} void @test9(i32 noundef signext %x, i16 %y.coerce)
+void test9 (int x, struct test9 y)
+{
+}
+
 // CHECK: define{{.*}} void @test1va(ptr noalias sret(%struct.test1) align 4 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
@@ -116,6 +128,38 @@
   return y;
 }
 
+// Error pattern will be fixed in https://reviews.llvm.org/D18
+// CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
+// CHECK: store ptr %[[NEXT]], ptr %ap
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+struct test8 test8va (int x, ...)
+{
+  struct test8 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test8);
+  va_end(ap);
+  return y;
+}
+
+// Error pattern will be fixed in https://reviews.llvm.org/D18
+// CHECK: define{{.*}} void @test9va(ptr noalias sret(%struct.test9) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
+// CHECK: store ptr %[[NEXT]], ptr %ap
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 2, i1 false)
+struct test9 test9va (int x, ...)
+{
+  struct test9 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test9);
+  va_end(ap);
+  return y;
+}
+
 // CHECK: define{{.*}} void @testva_longdouble(ptr noalias sret(%struct.test_longdouble) align 16 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 16



[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 467048.
tingwang added a comment.

Update according to comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D18/new/

https://reviews.llvm.org/D18

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c

Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -128,12 +128,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D18
 // CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 7
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 1 [[T0]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
@@ -144,12 +144,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D18
 // CHECK: define{{.*}} void @test9va(ptr noalias sret(%struct.test9) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 2, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 6
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 2 [[T0]], i64 2, i1 false)
 struct test9 test9va (int x, ...)
 {
   struct test9 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,17 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
   Address VAListAddr,
   llvm::Type *DirectTy,
   CharUnits DirectSize,
   CharUnits DirectAlign,
   CharUnits SlotSize,
-  bool AllowHigherAlign) {
+  bool AllowHigherAlign,
+  bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +358,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-  !DirectTy->isStructTy()) {
+  (!DirectTy->isStructTy() || ForceRightAdjust)) {
 Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +379,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType ValueTy, bool IsIndirect,
 TypeInfoChars ValueInfo,
 CharUnits SlotSizeAndAlign,
-bool AllowHigherAlign) {
+bool AllowHigherAlign,
+bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +403,9 @@
   if (IsIndirect)
 DirectTy = DirectTy->getPointerTo(0);
 
-  Address Addr =
-  emitVoidPtrDirectVAArg(CGF, VAListAddr, DirectTy, DirectSize, DirectAlign,
-  

[PATCH] D133488: [clang][PowerPC][NFC] Add base test case for PPC64 VAArg aggregate smaller than a slot

2022-10-13 Thread Ting Wang via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG00b9bed1f05a: [clang][PowerPC][NFC] Add base test case for 
PPC64 VAArg aggregate smaller than… (authored by tingwang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133488/new/

https://reviews.llvm.org/D133488

Files:
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c


Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -9,6 +9,8 @@
 struct test5 { int x[17]; };
 struct test6 { int x[17]; } __attribute__((aligned (16)));
 struct test7 { int x[17]; } __attribute__((aligned (32)));
+struct test8 { char x; };
+struct test9 { _Complex char x; };
 
 // CHECK: define{{.*}} void @test1(i32 noundef signext %x, i64 %y.coerce)
 void test1 (int x, struct test1 y)
@@ -48,6 +50,16 @@
 {
 }
 
+// CHECK: define{{.*}} void @test8(i32 noundef signext %x, i8 %y.coerce)
+void test8 (int x, struct test8 y)
+{
+}
+
+// CHECK: define{{.*}} void @test9(i32 noundef signext %x, i16 %y.coerce)
+void test9 (int x, struct test9 y)
+{
+}
+
 // CHECK: define{{.*}} void @test1va(ptr noalias sret(%struct.test1) align 4 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
@@ -116,6 +128,38 @@
   return y;
 }
 
+// Error pattern will be fixed in https://reviews.llvm.org/D18
+// CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
+// CHECK: store ptr %[[NEXT]], ptr %ap
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+struct test8 test8va (int x, ...)
+{
+  struct test8 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test8);
+  va_end(ap);
+  return y;
+}
+
+// Error pattern will be fixed in https://reviews.llvm.org/D18
+// CHECK: define{{.*}} void @test9va(ptr noalias sret(%struct.test9) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
+// CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
+// CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
+// CHECK: store ptr %[[NEXT]], ptr %ap
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 2, i1 false)
+struct test9 test9va (int x, ...)
+{
+  struct test9 y;
+  va_list ap;
+  va_start(ap, x);
+  y = va_arg (ap, struct test9);
+  va_end(ap);
+  return y;
+}
+
 // CHECK: define{{.*}} void @testva_longdouble(ptr noalias sret(%struct.test_longdouble) align 16 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 16



[PATCH] D133338: [clang][PowerPC] PPC64 VAArg use coerced integer type for direct aggregate fits in register

2022-10-16 Thread Ting Wang via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGee703b5cb134: [clang][PowerPC] PPC64 VAArg fix 
right-alignment for aggregates fit in register (authored by tingwang).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D18/new/

https://reviews.llvm.org/D18

Files:
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/PowerPC/ppc64-align-struct.c

Index: clang/test/CodeGen/PowerPC/ppc64-align-struct.c
===
--- clang/test/CodeGen/PowerPC/ppc64-align-struct.c
+++ clang/test/CodeGen/PowerPC/ppc64-align-struct.c
@@ -128,12 +128,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D18
 // CHECK: define{{.*}} void @test8va(ptr noalias sret(%struct.test8) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 1, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 7
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 1 [[T0]], i64 1, i1 false)
 struct test8 test8va (int x, ...)
 {
   struct test8 y;
@@ -144,12 +144,12 @@
   return y;
 }
 
-// Error pattern will be fixed in https://reviews.llvm.org/D18
 // CHECK: define{{.*}} void @test9va(ptr noalias sret(%struct.test9) align 1 %[[AGG_RESULT:.*]], i32 noundef signext %x, ...)
 // CHECK: %[[CUR:[^ ]+]] = load ptr, ptr %ap
 // CHECK: %[[NEXT:[^ ]+]] = getelementptr inbounds i8, ptr %[[CUR]], i64 8
 // CHECK: store ptr %[[NEXT]], ptr %ap
-// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 8 %[[CUR]], i64 2, i1 false)
+// CHECK: [[T0:%.*]] = getelementptr inbounds i8, ptr %[[CUR]], i64 6
+// CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %[[AGG_RESULT]], ptr align 2 [[T0]], i64 2, i1 false)
 struct test9 test9va (int x, ...)
 {
   struct test9 y;
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -322,13 +322,17 @@
 ///   leaving one or more empty slots behind as padding.  If this
 ///   is false, the returned address might be less-aligned than
 ///   DirectAlign.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrDirectVAArg(CodeGenFunction &CGF,
   Address VAListAddr,
   llvm::Type *DirectTy,
   CharUnits DirectSize,
   CharUnits DirectAlign,
   CharUnits SlotSize,
-  bool AllowHigherAlign) {
+  bool AllowHigherAlign,
+  bool ForceRightAdjust = false) {
   // Cast the element type to i8* if necessary.  Some platforms define
   // va_list as a struct containing an i8* instead of just an i8*.
   if (VAListAddr.getElementType() != CGF.Int8PtrTy)
@@ -354,7 +358,7 @@
   // If the argument is smaller than a slot, and this is a big-endian
   // target, the argument will be right-adjusted in its slot.
   if (DirectSize < SlotSize && CGF.CGM.getDataLayout().isBigEndian() &&
-  !DirectTy->isStructTy()) {
+  (!DirectTy->isStructTy() || ForceRightAdjust)) {
 Addr = CGF.Builder.CreateConstInBoundsByteGEP(Addr, SlotSize - DirectSize);
   }
 
@@ -375,11 +379,15 @@
 ///   an argument type with an alignment greater than the slot size
 ///   will be emitted on a higher-alignment address, potentially
 ///   leaving one or more empty slots behind as padding.
+/// \param ForceRightAdjust - Default is false. On big-endian platform and
+///   if the argument is smaller than a slot, set this flag will force
+///   right-adjust the argument in its slot irrespective of the type.
 static Address emitVoidPtrVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType ValueTy, bool IsIndirect,
 TypeInfoChars ValueInfo,
 CharUnits SlotSizeAndAlign,
-bool AllowHigherAlign) {
+bool AllowHigherAlign,
+bool ForceRightAdjust = false) {
   // The size and alignment of the value that was passed directly.
   CharUnits DirectSize, DirectAlign;
   if (IsIndirect) {
@@ -395,9 +403,9 @@
   if (IsIndirect)

[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-05-31 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

Gentle ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-06-05 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

In D125095#3552451, @shchenz wrote:

> Thanks for doing this. I am not familiar with the frontend, so I may be 
> wrong/stupid in the following comments : )
> I hope other experts like @hubert.reinterpretcast can give more meaningful 
> comments.
>
> I tested on AIX. It seems that for a static variable `static int x = foo();` 
> in global scope, the init function is not eliminated even when compiled with 
> `-bcdtors:csect`. Could you please give an example showing why we need the 
> new associated metadata for this case? Thanks.

Here is one example to show:

TEST_FOLDER=/tmp/test
mkdir -p $TEST_FOLDER
cd $TEST_FOLDER
cat > libbar.cc < libbaz.cc <
struct A {

  ~A() { puts("struct A ~A() 2"); }
  static A instance;

};

template  A A::instance;
void *zap() { return &A<>::instance; }

EOF

cat > uselib.cc <

int main(void) {

  void *handle = dlopen("./libbaz.so", RTLD_NOW | RTLD_LOCAL);
  dlclose(handle);

}
EOF

g++
===

XLC=g++
$XLC -fno-exceptions -c libbar.cc
rm -f libbar.a
ar qs libbar.a libbar.o
ranlib libbar.a
$XLC -fno-exceptions -c libbaz.cc
$XLC -shared -o libbaz.so libbaz.o libbar.a
$XLC -fno-exceptions -ldl uselib.cc -o uselib -ldl
./uselib 
foo
struct A ~A() 2

XLC 16.1.0
==

XLC=/gsa/rtpgsa/projects/x/xlcmpbld/run/vacpp/16.1.0/aix/daily/191109/bin/xlclang++
$XLC -g -qnoeh -qfuncsect -c libbar.cc
rm -f libbar.a
ar qs libbar.a libbar.o
ranlib libbar.a
$XLC -g -qnoeh -qfuncsect -c libbaz.cc
$XLC -g -qtwolink -G -o libbaz.so libbaz.o libbar.a
$XLC -g -qnoeh uselib.cc -o uselib
./uselib 
foo
struct A ~A() 2

clang++ baseline


XLC=/home/tingwa/repo/llvm-project-base/dev/build/bin/clang++
$XLC -g -fignore-exceptions -ffunction-sections -c libbar.cc
rm -f libbar.a
ar qs libbar.a libbar.o
ranlib libbar.a
$XLC -g -fignore-exceptions -ffunction-sections -c libbaz.cc
$XLC -g -bcdtors:csect -shared -Wl,-G -o libbaz.so libbaz.o libbar.a
$XLC -g -fignore-exceptions uselib.cc -o uselib
./uselib
struct A ~A() 2

clang++ .ref


XLC=/home/tingwa/repo/llvm-project-12514-BE/dev/build/bin/clang++
$XLC -g -fignore-exceptions -ffunction-sections -c libbar.cc
rm -f libbar.a
ar qs libbar.a libbar.o
ranlib libbar.a
$XLC -g -fignore-exceptions -ffunction-sections -c libbaz.cc
$XLC -g -bcdtors:csect -shared -Wl,-G -o libbaz.so libbaz.o libbar.a
$XLC -g -fignore-exceptions uselib.cc -o uselib
./uselib 
foo
struct A ~A() 2

As this example shows, the clang++ baseline result is wrong without the `.ref` 
association (the `foo` line is missing from its output), since globalVar is 
updated in the init function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-06-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang marked an inline comment as done.
tingwang added inline comments.



Comment at: clang/lib/CodeGen/CGDeclCXX.cpp:688
+updateAssociatedFunc(VFInitTermAssoc, LocalCXXGlobalInits, GetElem, Fn);
+updateAssociatedFunc(FFDtorTermAssoc, LocalCXXGlobalInits, GetElem, Fn);
+  }

shchenz wrote:
> `FFDtorTermAssoc` should store the mapping between dtor and term functions? 
> So why we need to update this container when we generate wrapper function for 
> init function? I think in the init function there should only be ctors 
> related functions?
> 
> And why don't we need to update `VarsWithInitTerm`? That container should 
> hold some static variables that rely on the wrapper init function.
> `FFDtorTermAssoc` should store the mapping between dtor and term functions? 
> So why we need to update this container when we generate wrapper function for 
> init function? I think in the init function there should only be ctors 
> related functions?

Thank you for pointing this out! The `FFDtorTermAssoc` update here is redundant.

> And why don't we need to update `VarsWithInitTerm`? That container should 
> hold some static variables that rely on the wrapper init function.

`VarsWithInitTerm` tracks the mapping between variables in clang (`Decl *`) and 
the corresponding data structure in LLVM (`llvm::Constant *`). To me that 
mapping is stable: unlike functions, variables do not get wrapped in new 
functions.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4799
+if (getTriple().isOSAIX())
+  addVarWithInitTerm(D, GV);
+  }

shchenz wrote:
> Why do we need to add mapping between a variable and its address? We already 
> map the global and its init function in above `EmitCXXGlobalVarDeclInitFunc`?
It seems to me that clang operates on `Decl *` most of the time. However, to 
generate the metadata we need the `llvm::Constant *`. I did not find a way to 
get the `llvm::Constant *` from a `Decl *` in clang, so I'm tracking that 
information here. I will check again whether there is an official way to do 
this that I'm not aware of.



Comment at: clang/lib/CodeGen/CodeGenModule.h:465
+  /// between dtor and term functions.
+  llvm::SmallVector, 8>
+  VFInitTermAssoc;

shchenz wrote:
> Is there any reason why we need `vector` here instead of `map`? Can you give 
> an example that shows one global variable will be connected with more than 
> one init functions?
One variable can have two associated functions, one init and one term, so I 
used a vector for `VFInitTermAssoc`. It is also better to use the variable as 
the key, for the benefit of the inner for loop inside `AddMeta`.

Dtor-to-Term (`FFDtorTermAssoc`) could use a map. However, it shares similar 
update logic with `VFInitTermAssoc` (for example, the `AddMeta` code snippet in 
`CodeGenModule::genAssocMeta()` is used on both data structures), so I prefer 
to use a vector for both of them.
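
To make the vector-of-pairs idea concrete, here is a minimal sketch. It is not 
the patch's code: `Assoc` and `functionsFor` are hypothetical names, strings 
stand in for the `llvm::Constant *` values, and the standard library replaces 
`llvm::SmallVector`. It only shows how a vector sorted by variable lets one 
key carry more than one associated function.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: (variable, associated function), both as names.
using Assoc = std::pair<std::string, std::string>;

// Collect every function associated with `var`. The vector must already be
// sorted by variable so that all entries for one variable are adjacent.
std::vector<std::string> functionsFor(const std::vector<Assoc> &sorted,
                                      const std::string &var) {
  std::vector<std::string> out;
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), var,
      [](const Assoc &a, const std::string &key) { return a.first < key; });
  for (; it != sorted.end() && it->first == var; ++it)
    out.push_back(it->second);
  return out;
}
```

A variable with both an init and a term function simply appears twice in the 
vector, and the inner loop walks the adjacent entries.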


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2022-06-08 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 435107.
tingwang added a comment.
Herald added subscribers: llvm-commits, jdoerfert.

Update according to comments:
(1) Update docs
(2) Use `addVarTermAssoc`
(3) Remove redundant call to `updateAssociatedFunc(FFDtorTermAssoc...`
(4) Remove unnecessary `llvm::array_pod_sort` on `FFDtorTermAssoc`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGen/PowerPC/aix-ref-tls_init.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7040,6 +7040,10 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{i32* @a}
 
+On XCOFF target, the ``associated`` metadata indicates connection among static
+variables (static global variable, static class member etc.) and static init/
+term functions. This metadata lowers to ``.ref`` assembler pseudo-operation
+which prevents discarding of the functions in linker GC.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I__, i8* null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__D_a, i8* null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(%"struct.test1::Test1"* @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(%"struct.test3::Test3"* @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(%"struct.test4::Test4"* @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, void ()* @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{void ()* @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{void ()* @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+//

[PATCH] D145767: [Verifier][NFC] Refactor check for associated metadata to allow multiple operands on AIX

2023-03-14 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 505372.
tingwang added a comment.

Address comments:
(1) Duplicated associated test cases to make copies for AIX XCOFF, and added 
`multiple operands` and `null operand` as legal cases.
(2) Updated verifier logic to check for AIX target.
(3) Moved XFAIL check to specific lines.
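
The per-operand rules exercised by the AIX XCOFF test copies can be modelled 
roughly as below. This is a sketch under stated assumptions, not the 
Verifier.cpp logic: `Operand` and `checkOperandOnAIX` are hypothetical names, 
the operand is reduced to a few booleans, and accepting a null operand is 
taken from the "legal cases" mentioned in (1). The diagnostic strings are the 
ones checked in the Verifier test.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified description of one `!associated` operand.
struct Operand {
  bool isNull = false;       // operand written as `null`
  bool isValue = true;       // false for e.g. a metadata string
  bool isPointer = true;     // false for e.g. `i32 1` or a float
  bool refersToSelf = false; // operand names the annotated global itself
};

// Returns an empty string when the operand is accepted, otherwise the
// diagnostic text expected by the test file.
std::string checkOperandOnAIX(const Operand &op) {
  if (op.isNull)
    return ""; // assumed legal on AIX, per the "null operand" case
  if (!op.isValue)
    return "associated metadata must be ValueAsMetadata";
  if (!op.isPointer)
    return "associated value must be pointer typed";
  if (op.refersToSelf)
    return "global values should not associate to themselves";
  return "";
}
```

With multiple operands allowed, such a check would simply run once per 
operand, and the global is rejected if any operand produces a diagnostic.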


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145767/new/

https://reviews.llvm.org/D145767

Files:
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst
  llvm/lib/IR/Verifier.cpp
  llvm/test/Assembler/associated-metadata-aix-xcoff.ll
  llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
  llvm/test/Linker/associated-global-aix-xcoff.ll
  llvm/test/Verifier/associated-metadata-aix-xcoff.ll

Index: llvm/test/Verifier/associated-metadata-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Verifier/associated-metadata-aix-xcoff.ll
@@ -0,0 +1,31 @@
+; RUN: not llvm-as -disable-output < %s -o /dev/null 2>&1 | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+; CHECK: associated value must be pointer typed
+; CHECK-NEXT: ptr addrspace(1) @associated.int
+; CHECK-NEXT: !0 = !{i32 1}
+@associated.int = external addrspace(1) constant [8 x i8], !associated !0
+
+; CHECK: associated value must be pointer typed
+; CHECK-NEXT: ptr addrspace(1) @associated.float
+; CHECK-NEXT: !1 = !{float 1.00e+00}
+@associated.float = external addrspace(1) constant [8 x i8], !associated !1
+
+; CHECK: global values should not associate to themselves
+; CHECK-NEXT: ptr @associated.self
+; CHECK-NEXT: !2 = !{ptr @associated.self}
+@associated.self = external constant [8 x i8], !associated !2
+
+; CHECK: associated metadata must be ValueAsMetadata
+; CHECK-NEXT: ptr @associated.string
+; CHECK-NEXT: !3 = !{!"string"}
+@associated.string = external constant [8 x i8], !associated !3
+
+@gv.decl0 = external constant [8 x i8]
+@gv.decl1 = external constant [8 x i8]
+
+!0 = !{i32 1}
+!1 = !{float 1.00e+00}
+!2 = !{ptr @associated.self}
+!3 = !{!"string"}
Index: llvm/test/Linker/associated-global-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Linker/associated-global-aix-xcoff.ll
@@ -0,0 +1,33 @@
+; RUN: llvm-link -S %s %S/Inputs/associated-global-aix-xcoff.ll | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+; CHECK: @c = internal global i32 1, !associated !0
+; CHECK: @d = global i32 0, !associated !1
+; CHECK: @f = global i32 0, !associated !2
+; CHECK: @a = global i32 0, !associated !3
+; CHECK: @b = global i32 0, !associated !4
+; CHECK: @c.3 = internal global i32 1, !associated !5
+; CHECK: @e = global i32 0, !associated !6
+; CHECK: @g = global i32 0, !associated !7
+
+; CHECK: !0 = !{ptr @d}
+; CHECK: !1 = !{ptr @c}
+; CHECK: !2 = !{ptr @a, ptr @b}
+; CHECK: !3 = !{ptr @b}
+; CHECK: !4 = !{ptr @a}
+; CHECK: !5 = !{ptr @e}
+; CHECK: !6 = !{ptr @c.3}
+; CHECK: !7 = distinct !{null}
+
+@a = external global i32, !associated !0
+@b = global i32 0, !associated !1
+@c = internal global i32 1, !associated !2
+@d = global i32 0, !associated !3
+@f = global i32 0, !associated !4
+
+!0 = !{ptr @b}
+!1 = !{ptr @a}
+!2 = !{ptr @d}
+!3 = !{ptr @c}
+!4 = !{ptr @a, ptr @b}
Index: llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
@@ -0,0 +1,13 @@
+target triple = "unknown-unknown-aix-xcoff"
+
+@a = global i32 0, !associated !0
+@b = external global i32, !associated !1
+@c = internal global i32 1, !associated !2
+@e = global i32 0, !associated !3
+@g = global i32 0, !associated !4
+
+!0 = !{ptr @b}
+!1 = !{ptr @a}
+!2 = !{ptr @e}
+!3 = !{ptr @c}
+!4 = distinct !{null}
Index: llvm/test/Assembler/associated-metadata-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Assembler/associated-metadata-aix-xcoff.ll
@@ -0,0 +1,103 @@
+; RUN: llvm-as < %s | llvm-dis | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+@gv.decl = external constant [8 x i8]
+@gv.def = constant [8 x i8] zeroinitializer
+
+@gv.associated.func.decl = external addrspace(1) constant [8 x i8], !associated !0
+@gv.associated.func.def = external addrspace(1) constant [8 x i8], !associated !1
+
+@gv.associated.gv.decl = external addrspace(1) constant [8 x i8], !associated !2
+@gv.associated.gv.def = external addrspace(1) constant [8 x i8], !associated !3
+
+@alias = alias i32, ptr @gv.def
+
+@gv.associated.alias.gv.def = external addrspace(1) constant [8 x i8], !associated !4
+
+@gv.associated.alias.addrspacecast = external add

[PATCH] D145767: [Verifier][NFC] Refactor check for associated metadata to allow multiple operands on AIX

2023-03-14 Thread Ting Wang via Phabricator via cfe-commits
tingwang added inline comments.



Comment at: clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp:2
+// RUN: %clang_cc1 -triple powerpc64-ibm-aix-xcoff -emit-llvm -O3 -x c++ < %s | FileCheck %s
+// XFAIL: *
+// This function should fail until .ref is support on AIX.

arsenm wrote:
> A test run with not and a specific error message check is more reliable 
Thank you Matt! Currently there is no error message for this case, so I moved 
the XFAIL to CHECK lines. Hope that is fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145767/new/

https://reviews.llvm.org/D145767



[PATCH] D145767: [Verifier][NFC] Refactor check for associated metadata to allow multiple operands on AIX

2023-03-16 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 505709.
tingwang added a comment.

Add empty case in Verifier/associated-metadata-aix-xcoff.ll


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145767/new/

https://reviews.llvm.org/D145767

Files:
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst
  llvm/lib/IR/Verifier.cpp
  llvm/test/Assembler/associated-metadata-aix-xcoff.ll
  llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
  llvm/test/Linker/associated-global-aix-xcoff.ll
  llvm/test/Verifier/associated-metadata-aix-xcoff.ll

Index: llvm/test/Verifier/associated-metadata-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Verifier/associated-metadata-aix-xcoff.ll
@@ -0,0 +1,37 @@
+; RUN: not llvm-as -disable-output < %s -o /dev/null 2>&1 | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+; CHECK: associated value must be pointer typed
+; CHECK-NEXT: ptr addrspace(1) @associated.int
+; CHECK-NEXT: !0 = !{i32 1}
+@associated.int = external addrspace(1) constant [8 x i8], !associated !0
+
+; CHECK: associated value must be pointer typed
+; CHECK-NEXT: ptr addrspace(1) @associated.float
+; CHECK-NEXT: !1 = !{float 1.00e+00}
+@associated.float = external addrspace(1) constant [8 x i8], !associated !1
+
+; CHECK: associated metadata must have one operand
+; CHECK-NEXT: ptr addrspace(1) @associated.empty
+; CHECK-NEXT: !2 = !{}
+@associated.empty = external addrspace(1) constant [8 x i8], !associated !2
+
+; CHECK: global values should not associate to themselves
+; CHECK-NEXT: ptr @associated.self
+; CHECK-NEXT: !3 = !{ptr @associated.self}
+@associated.self = external constant [8 x i8], !associated !3
+
+; CHECK: associated metadata must be ValueAsMetadata
+; CHECK-NEXT: ptr @associated.string
+; CHECK-NEXT: !4 = !{!"string"}
+@associated.string = external constant [8 x i8], !associated !4
+
+@gv.decl0 = external constant [8 x i8]
+@gv.decl1 = external constant [8 x i8]
+
+!0 = !{i32 1}
+!1 = !{float 1.00e+00}
+!2 = !{}
+!3 = !{ptr @associated.self}
+!4 = !{!"string"}
Index: llvm/test/Linker/associated-global-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Linker/associated-global-aix-xcoff.ll
@@ -0,0 +1,33 @@
+; RUN: llvm-link -S %s %S/Inputs/associated-global-aix-xcoff.ll | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+; CHECK: @c = internal global i32 1, !associated !0
+; CHECK: @d = global i32 0, !associated !1
+; CHECK: @f = global i32 0, !associated !2
+; CHECK: @a = global i32 0, !associated !3
+; CHECK: @b = global i32 0, !associated !4
+; CHECK: @c.3 = internal global i32 1, !associated !5
+; CHECK: @e = global i32 0, !associated !6
+; CHECK: @g = global i32 0, !associated !7
+
+; CHECK: !0 = !{ptr @d}
+; CHECK: !1 = !{ptr @c}
+; CHECK: !2 = !{ptr @a, ptr @b}
+; CHECK: !3 = !{ptr @b}
+; CHECK: !4 = !{ptr @a}
+; CHECK: !5 = !{ptr @e}
+; CHECK: !6 = !{ptr @c.3}
+; CHECK: !7 = distinct !{null}
+
+@a = external global i32, !associated !0
+@b = global i32 0, !associated !1
+@c = internal global i32 1, !associated !2
+@d = global i32 0, !associated !3
+@f = global i32 0, !associated !4
+
+!0 = !{ptr @b}
+!1 = !{ptr @a}
+!2 = !{ptr @d}
+!3 = !{ptr @c}
+!4 = !{ptr @a, ptr @b}
Index: llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Linker/Inputs/associated-global-aix-xcoff.ll
@@ -0,0 +1,13 @@
+target triple = "unknown-unknown-aix-xcoff"
+
+@a = global i32 0, !associated !0
+@b = external global i32, !associated !1
+@c = internal global i32 1, !associated !2
+@e = global i32 0, !associated !3
+@g = global i32 0, !associated !4
+
+!0 = !{ptr @b}
+!1 = !{ptr @a}
+!2 = !{ptr @e}
+!3 = !{ptr @c}
+!4 = distinct !{null}
Index: llvm/test/Assembler/associated-metadata-aix-xcoff.ll
===
--- /dev/null
+++ llvm/test/Assembler/associated-metadata-aix-xcoff.ll
@@ -0,0 +1,103 @@
+; RUN: llvm-as < %s | llvm-dis | FileCheck %s
+
+target triple = "unknown-unknown-aix-xcoff"
+
+@gv.decl = external constant [8 x i8]
+@gv.def = constant [8 x i8] zeroinitializer
+
+@gv.associated.func.decl = external addrspace(1) constant [8 x i8], !associated !0
+@gv.associated.func.def = external addrspace(1) constant [8 x i8], !associated !1
+
+@gv.associated.gv.decl = external addrspace(1) constant [8 x i8], !associated !2
+@gv.associated.gv.def = external addrspace(1) constant [8 x i8], !associated !3
+
+@alias = alias i32, ptr @gv.def
+
+@gv.associated.alias.gv.def = external addrspace(1) constant [8 x i8], !associated !4
+
+@gv.associated.ali

[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2023-03-16 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 505720.
tingwang added a comment.

The verifier change and baseline test cases have been moved into 
https://reviews.llvm.org/D145767, so this patch is updated accordingly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7330,11 +7330,12 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{ptr @a}
 
-It does not have any effect on non-ELF targets. Non-ELF target may use
-``associated`` metadata for its own purpose. For example, on AIX XCOFF target,
-the ``associated`` metadata may indicate connection among static variables
-(static global variable, static class member etc.) and static init/term
-functions. This kind of association can be one-to-many.
+On AIX XCOFF target, the ``associated`` metadata indicates connection among
+static variables (static global variable, static class member etc.) and static
+init/term functions. This metadata lowers to ``.ref`` assembler pseudo-
+operation which prevents discarding of the functions in linker GC.
+
+It does not have any effect on other non-ELF targets.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,10 +38,10 @@
   }
 } // namespace test4
 
-// XFAIL-CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
-// XFAIL-CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
-// XFAIL-CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
-// XFAIL-CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I__, ptr null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__D_a, ptr null }]
@@ -53,8 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
-// XFAIL-CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(ptr @_ZN5test12t1E)
 // CHECK:   ret void
@@ -85,8 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
-// XFAIL-CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(ptr @_ZN5test12t2E)
 // CHECK:   ret void
@@ -120,8 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
-// XFAIL-CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(ptr @_ZN5test31tE)
 // CHECK:   ret void
@@ -162,8 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
-// XFAIL-CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+

[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2023-03-09 Thread Ting Wang via Phabricator via cfe-commits
tingwang added a comment.

Currently this patch does not work because the verifier limits associated 
metadata to a single operand. The limit comes from this commit:

  commit 87f2e9448e82bbed4ac59bb61bea03256aa5f4de
  Author: Matt Arsenault 
  Date:   Mon Jan 9 12:17:38 2023 -0500
  
  Verifier: Add checks for associated metadata
  
  Also add missing assembler test for the valid cases.

I'm working on a patch to relax this limit for TT.isOSAIX(), and will add 
that to the stack later.
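
The intended relaxation might look roughly like the sketch below. It is a 
standalone model, not the Verifier.cpp change: `operandCountIsValid` is a 
hypothetical helper, the `isAIX` flag stands in for a query such as 
TT.isOSAIX(), and rejecting an empty operand list on AIX is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the operand-count rule for `!associated` metadata.
// isAIX stands in for a target query such as TT.isOSAIX().
bool operandCountIsValid(std::size_t numOperands, bool isAIX) {
  if (isAIX)
    return numOperands >= 1; // assumed: one-to-many allowed, empty rejected
  return numOperands == 1;   // current rule: exactly one operand
}
```

Under this model a two-operand node such as `!{ptr @a, ptr @b}` passes only 
when the target is AIX, while a single-operand node passes everywhere.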


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095



[PATCH] D125095: [Clang][AIX] Add .ref in frontend for AIX XCOFF to support `-bcdtors:csect` linker option

2023-03-09 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 504031.
tingwang added a comment.
Herald added a subscriber: hiraditya.

Rebase and update patch:
(1) Update verifier check on associated metadata to allow multiple operands for 
AIX.
(2) Update test cases to use opaque pointers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125095/new/

https://reviews.llvm.org/D125095

Files:
  clang/lib/CodeGen/CGDecl.cpp
  clang/lib/CodeGen/CGDeclCXX.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst
  llvm/lib/IR/Verifier.cpp

Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7329,6 +7329,10 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{ptr @a}
 
+On XCOFF target, the ``associated`` metadata indicates connection among static
+variables (static global variable, static class member etc.) and static init/
+term functions. This metadata lowers to ``.ref`` assembler pseudo-operation
+which prevents discarding of the functions in linker GC.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I__, ptr null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__D_a, ptr null }]
@@ -49,7 +53,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t1E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(ptr @_ZN5test12t1E)
 // CHECK:   ret void
@@ -80,7 +84,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test12t2E() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test15Test1D1Ev(ptr @_ZN5test12t2E)
 // CHECK:   ret void
@@ -114,7 +118,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZN5test31tE() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test35Test3D1Ev(ptr @_ZN5test31tE)
 // CHECK:   ret void
@@ -155,7 +159,7 @@
 // CHECK:   ret void
 // CHECK: }
 
-// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] {
+// CHECK: define internal void @__dtor__ZZN5test41fEvE11staticLocal() [[ATTR:#[0-9]+]] !associated ![[ASSOC2:[0-9]+]] {
 // CHECK: entry:
 // CHECK:   call void @_ZN5test45Test4D1Ev(ptr @_ZZN5test41fEvE11staticLocal)
 // CHECK:   ret void
@@ -192,3 +196,7 @@
 // CHECK:   call void @__finalize__ZN5test12t1E()
 // CHECK:   ret void
 // CHECK: }
+
+// CHECK: ![[ASSOC0]] = !{ptr @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}, ptr @{{_GLOBAL__sub_I__|_GLOBAL__D_a}}}
+// CHECK: ![[ASSOC1]] = !{ptr @_GLOBAL__sub_I__}
+// CHECK: ![[ASSOC2]] = !{ptr @_GLOBAL__D_a}
Index: clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
===
--- clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
+++ clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
@@ -44,8 +44,13 @@
 A A::instance = bar();
 } // namespace test2
 
+// CHECK: @_ZN5test12t0E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// CHECK: @_ZN5test12t2E = linkonce_odr global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
 // CHECK: @_ZGVN5test12t2E = linkonce_odr global i64 0, align 8
+// CHECK: @_ZN5test12t

[PATCH] D145767: [Verifier][NFC] Refactor check for associated metadata to allow multiple operands on AIX

2023-03-13 Thread Ting Wang via Phabricator via cfe-commits
tingwang updated this revision to Diff 504952.
tingwang added a reviewer: PowerPC.
tingwang added a comment.
Herald added subscribers: cfe-commits, jdoerfert.
Herald added a project: clang.

Address comments:
(1) Add test case to show how associated metadata will be used on AIX 
(currently not supported, marked with XFAIL).
(2) Add langref changes to explain.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145767/new/

https://reviews.llvm.org/D145767

Files:
  clang/test/CodeGen/PowerPC/aix-init-ref-null.cpp
  clang/test/CodeGen/PowerPC/aix-ref-static-var.cpp
  clang/test/CodeGenCXX/aix-static-init-debug-info.cpp
  clang/test/CodeGenCXX/aix-static-init-temp-spec-and-inline-var.cpp
  clang/test/CodeGenCXX/aix-static-init.cpp
  llvm/docs/LangRef.rst
  llvm/lib/IR/Verifier.cpp

Index: llvm/lib/IR/Verifier.cpp
===
--- llvm/lib/IR/Verifier.cpp
+++ llvm/lib/IR/Verifier.cpp
@@ -663,22 +663,29 @@
 GO->getMetadata(LLVMContext::MD_associated)) {
   Check(Associated->getNumOperands() == 1,
 "associated metadata must have one operand", &GV, Associated);
-  const Metadata *Op = Associated->getOperand(0).get();
-  Check(Op, "associated metadata must have a global value", GO, Associated);
-
-  const auto *VM = dyn_cast_or_null<ValueAsMetadata>(Op);
-  Check(VM, "associated metadata must be ValueAsMetadata", GO, Associated);
-  if (VM) {
-Check(isa<PointerType>(VM->getValue()->getType()),
-  "associated value must be pointer typed", GV, Associated);
-
-const Value *Stripped = VM->getValue()->stripPointerCastsAndAliases();
-Check(isa(Stripped) || isa(Stripped),
-  "associated metadata must point to a GlobalObject", GO, Stripped);
-Check(Stripped != GO,
-  "global values should not associate to themselves", GO,
+  auto CheckAssocOperand = [this, &GV, GO, Associated](const Metadata *Op) {
+Check(Op, "associated metadata must have a global value", GO,
   Associated);
-  }
+
+const auto *VM = dyn_cast_or_null<ValueAsMetadata>(Op);
+Check(VM, "associated metadata must be ValueAsMetadata", GO,
+  Associated);
+if (VM) {
+  Check(isa<PointerType>(VM->getValue()->getType()),
+"associated value must be pointer typed", GV, Associated);
+
+  const Value *Stripped = VM->getValue()->stripPointerCastsAndAliases();
+  Check(isa(Stripped) || isa(Stripped),
+"associated metadata must point to a GlobalObject", GO,
+Stripped);
+  Check(Stripped != GO,
+"global values should not associate to themselves", GO,
+Associated);
+}
+  };
+
+  for (unsigned i = 0, e = Associated->getNumOperands(); i != e; ++i)
+CheckAssocOperand(Associated->getOperand(i).get());
 }
   }
   Check(!GV.hasAppendingLinkage() || isa(GV),
Index: llvm/docs/LangRef.rst
===
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -7318,8 +7318,6 @@
 linker-defined encapsulation symbols ``__start_`` and
 ``__stop_``.
 
-It does not have any effect on non-ELF targets.
-
 Example:
 
 .. code-block:: text
@@ -7329,6 +7327,11 @@
 @b = internal global i32 2, comdat $a, section "abc", !associated !0
 !0 = !{ptr @a}
 
+It does not have any effect on non-ELF targets. Non-ELF target may use
+``associated`` metadata for its own purpose. For example, on AIX XCOFF target,
+the ``associated`` metadata may indicate connection among static variables
+(static global variable, static class member etc.) and static init/term
+functions. This kind of association can be one-to-many.
 
 '``prof``' Metadata
 ^^^
Index: clang/test/CodeGenCXX/aix-static-init.cpp
===
--- clang/test/CodeGenCXX/aix-static-init.cpp
+++ clang/test/CodeGenCXX/aix-static-init.cpp
@@ -38,6 +38,10 @@
   }
 } // namespace test4
 
+// XFAIL-CHECK: @_ZN5test12t1E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// XFAIL-CHECK: @_ZN5test12t2E = global %"struct.test1::Test1" zeroinitializer, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
+// XFAIL-CHECK: @_ZN5test21xE = global i32 0, align {{[0-9]+}}, !associated ![[ASSOC1:[0-9]+]]
+// XFAIL-CHECK: @_ZN5test31tE = global %"struct.test3::Test3" undef, align {{[0-9]+}}, !associated ![[ASSOC0:[0-9]+]]
 // CHECK: @_ZGVZN5test41fEvE11staticLocal = internal global i64 0, align 8
 // CHECK: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I__, ptr null }]
 // CHECK: @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__D_a, ptr null }]
@@ -50,6 +54,7 @@
 // CHECK: }
 
 // CHECK: define internal void @__dtor__ZN

[PATCH] D145767: [Verifier][NFC] Refactor check for associated metadata to allow multiple operands on AIX

2023-06-12 Thread Ting Wang via Phabricator via cfe-commits
tingwang planned changes to this revision.
tingwang added a comment.

Hi Matt @arsenm, thank you for your comments. I will incorporate your 
suggestions in the next version.

Since the review of the dependent patch https://reviews.llvm.org/D125095 
has not made any progress for a long time, I'm worried that discussing the 
verification logic may be a little early. I plan to suspend this patch until 
there is progress on D125095. Thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145767/new/

https://reviews.llvm.org/D145767
