date:20240118

[clang] [clang][ASTImporter] Improve structural equivalence of overloadable operators. (PR #72242)

2024-01-18 Thread Balázs Kéri via cfe-commits


https://github.com/balazske closed https://github.com/llvm/llvm-project/pull/72242
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] 9ca1a08 - [clang][ASTImporter] Improve structural equivalence of overloadable operators. (#72242)

2024-01-18 Thread via cfe-commits


Author: Balázs Kéri
Date: 2024-01-18T09:20:05+01:00
New Revision: 9ca1a08144a3caea8fd2f45fd4930ca796cf4166

URL: 
https://github.com/llvm/llvm-project/commit/9ca1a08144a3caea8fd2f45fd4930ca796cf4166
DIFF: 
https://github.com/llvm/llvm-project/commit/9ca1a08144a3caea8fd2f45fd4930ca796cf4166.diff

LOG: [clang][ASTImporter] Improve structural equivalence of overloadable operators. (#72242)

Operators that are overloadable may be parsed as `CXXOperatorCallExpr`
or as `UnaryOperator` (or `BinaryOperator`). This depends on the context
and can be different if a similar construct is imported into an existing
AST. The two "forms" of the operator call AST nodes should be detected
as equivalent to allow AST import of these cases.

This fix probably has other consequences: if a structure containing a
`CXXOperatorCallExpr` is imported into an AST that already has a
similar structure containing a `UnaryOperator` (or `BinaryOperator`),
the additional data in the `CXXOperatorCallExpr` node is lost during
the import (because the existing node will be used). I am not sure
whether this can cause problems.
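
To make the two "forms" concrete, here is a minimal source-level sketch. The names `Wrapper`, `lessUnresolved` and `lessResolved` are illustrative and do not come from the patch. Per the description above, the dependent expression `a < b` may be stored differently in the two templates depending on whether an overloaded `operator<` was visible when the template was defined, even though both behave the same once instantiated.

```cpp
struct Wrapper {
  int value;
};

// No overloaded operator< has been declared yet, so lookup at definition
// time finds nothing for the dependent expression `a < b`.
template <typename T>
bool lessUnresolved(const T &a, const T &b) {
  return a < b;
}

bool operator<(const Wrapper &l, const Wrapper &r) { return l.value < r.value; }

// Same body, but now an overloaded operator< is visible at definition time.
template <typename T>
bool lessResolved(const T &a, const T &b) {
  return a < b;
}
```

Both templates resolve to the same `operator<` when instantiated with `Wrapper`, which is why treating the two stored forms as structurally equivalent is reasonable.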

Added: 


Modified: 
clang/lib/AST/ASTStructuralEquivalence.cpp
clang/unittests/AST/StructuralEquivalenceTest.cpp

Removed: 




diff --git a/clang/lib/AST/ASTStructuralEquivalence.cpp b/clang/lib/AST/ASTStructuralEquivalence.cpp
index a9e0d1698a9178d..5ec4a66879c0208 100644
--- a/clang/lib/AST/ASTStructuralEquivalence.cpp
+++ b/clang/lib/AST/ASTStructuralEquivalence.cpp
@@ -98,6 +98,8 @@ static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
  QualType T1, QualType T2);
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
  Decl *D1, Decl *D2);
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const Stmt *S1, const Stmt *S2);
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
  const TemplateArgument &Arg1,
  const TemplateArgument &Arg2);
@@ -437,12 +439,67 @@ class StmtComparer {
 };
 } // namespace
 
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const UnaryOperator *E1,
+ const CXXOperatorCallExpr *E2) {
+  return UnaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+ E2->getOperator() &&
+ IsStructurallyEquivalent(Context, E1->getSubExpr(), E2->getArg(0));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const CXXOperatorCallExpr *E1,
+ const UnaryOperator *E2) {
+  return E1->getOperator() ==
+ UnaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(0), E2->getSubExpr());
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const BinaryOperator *E1,
+ const CXXOperatorCallExpr *E2) {
+  return BinaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+ E2->getOperator() &&
+ IsStructurallyEquivalent(Context, E1->getLHS(), E2->getArg(0)) &&
+ IsStructurallyEquivalent(Context, E1->getRHS(), E2->getArg(1));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const CXXOperatorCallExpr *E1,
+ const BinaryOperator *E2) {
+  return E1->getOperator() ==
+ BinaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(0), E2->getLHS()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(1), E2->getRHS());
+}
+
 /// Determine structural equivalence of two statements.
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
  const Stmt *S1, const Stmt *S2) {
   if (!S1 || !S2)
 return S1 == S2;
 
+  // Check for statements with similar syntax but different AST.
+  // A UnaryOperator node is more lightweight than a CXXOperatorCallExpr node.
+  // The more heavyweight node is only created if the definition-time name
+  // lookup had any results. The lookup results are stored CXXOperatorCallExpr
+  // only. The lookup results can be different in a "From" and "To" AST even if
+  // the compared structure is otherwise equivalent. For this reason we must
+  // treat a similar unary/binary operator node and CXXOperatorCall node as
+  // equivalent.
+  if (const auto *E2CXXOperatorCall = dyn_cast<CXXOperatorCallExpr>(S2)) {
+if (const auto *E1Unary = dyn_cast<UnaryOperator>(S1))
+  return IsStructurallyEquivalent(Context, E1Unary, E2CXXOperatorCall);
+
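
The comparison strategy in this hunk can be sketched on a toy tree. Everything below (`ToyNode`, `makeNode`, `toyEquivalent`) is a hypothetical stand-in and not Clang's API. The sketch only shows the idea of comparing operator spelling and operands while ignoring which of the two node kinds stores the operator.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for AST nodes; these are NOT Clang's classes.
struct ToyNode;
using ToyNodePtr = std::shared_ptr<ToyNode>;

struct ToyNode {
  enum Kind { Leaf, BuiltinOp, OperatorCall };
  Kind kind;
  std::string spelling;             // operator spelling or leaf name
  std::vector<ToyNodePtr> operands; // empty for leaves
};

ToyNodePtr makeNode(ToyNode::Kind kind, std::string spelling,
                    std::vector<ToyNodePtr> operands = {}) {
  return std::make_shared<ToyNode>(
      ToyNode{kind, std::move(spelling), std::move(operands)});
}

bool toyEquivalent(const ToyNodePtr &a, const ToyNodePtr &b) {
  // Mirrors the null handling in the diff: equal only if both are null.
  if (!a || !b)
    return a == b;
  // A leaf never matches an operator node.
  if ((a->kind == ToyNode::Leaf) != (b->kind == ToyNode::Leaf))
    return false;
  // For operators, the BuiltinOp/OperatorCall distinction is deliberately
  // ignored; only the spelling and the operands are compared.
  if (a->spelling != b->spelling || a->operands.size() != b->operands.size())
    return false;
  for (std::size_t i = 0; i < a->operands.size(); ++i)
    if (!toyEquivalent(a->operands[i], b->operands[i]))
      return false;
  return true;
}
```

With this sketch, a `BuiltinOp` node and an `OperatorCall` node that share the spelling `-` and the same operand compare equal, while nodes with different spellings or operands do not.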

[clang] [llvm] [clang-tools-extra] [CGP] Avoid replacing a free ext with multiple other exts. (PR #77094)

2024-01-18 Thread Florian Hahn via cfe-commits


https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/77094

>From 46fbecfce6c48795ea85fc9420067479f6d0b17a Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 5 Jan 2024 11:24:59 +
Subject: [PATCH] [CGP] Avoid replacing a free ext with multiple other exts.

Replacing a free extension with 2 or more extensions unnecessarily
increases the number of IR instructions without providing any benefits.
It also unnecessarily causes operations to be performed on wider types
than necessary.

In some cases, the extra extensions also pessimize codegen (see
bfis-in-loop.ll).

The changes in arm64-codegen-prepare-extload.ll also show that we avoid
promotions that should only be performed in stress mode.
---
 llvm/lib/CodeGen/CodeGenPrepare.cpp   |  7 ++-
 .../AArch64/arm64-codegen-prepare-extload.ll  | 22 ---
 .../AArch64/avoid-free-ext-promotion.ll   | 12 ++--
 llvm/test/CodeGen/AArch64/bfis-in-loop.ll | 50 +++
 ...iller-impdef-on-implicit-def-regression.ll | 63 +--
 5 files changed, 81 insertions(+), 73 deletions(-)

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 5bd4c6b067d796..606946ceffd4f3 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -5965,7 +5965,9 @@ bool CodeGenPrepare::tryToPromoteExts(
 // cut this search path, because it means we degrade the code quality.
 // With exactly 2, the transformation is neutral, because we will merge
 // one extension but leave one. However, we optimistically keep going,
-// because the new extension may be removed too.
+// because the new extension may be removed too. Also avoid replacing a
+// single free extension with multiple extensions, as this increases the
+// number of IR instructions without providing any savings.
 long long TotalCreatedInstsCost = CreatedInstsCost + NewCreatedInstsCost;
 // FIXME: It would be possible to propagate a negative value instead of
 // conservatively ceiling it to 0.
@@ -5973,7 +5975,8 @@ bool CodeGenPrepare::tryToPromoteExts(
 std::max((long long)0, (TotalCreatedInstsCost - ExtCost));
 if (!StressExtLdPromotion &&
 (TotalCreatedInstsCost > 1 ||
- !isPromotedInstructionLegal(*TLI, *DL, PromotedVal))) {
+ !isPromotedInstructionLegal(*TLI, *DL, PromotedVal) ||
+ (ExtCost == 0 && NewExts.size() > 1))) {
   // This promotion is not profitable, rollback to the previous state, and
   // save the current extension in ProfitablyMovedExts as the latest
   // speculative promotion turned out to be unprofitable.
diff --git a/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll b/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
index 23cbad0d15b4c1..646f988f574813 100644
--- a/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
@@ -528,10 +528,14 @@ entry:
 ; OPTALL: [[LD:%[a-zA-Z_0-9-]+]] = load i8, ptr %p
 ;
 ; This transformation should really happen only for stress mode.
-; OPT-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
-; OPT-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
-; OPT-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
-; OPT-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = trunc i64 [[IDX64]] to i32
+; STRESS-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
+; STRESS-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
+; STRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+; STRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = trunc i64 [[IDX64]] to i32
+;
+; NONSTRESS-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
+; NONSTRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
+; NONSTRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = zext i32 [[RES32]] to i64
 ;
 ; DISABLE-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
 ; DISABLE-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
@@ -583,9 +587,13 @@ entry:
 ; OPTALL: [[LD:%[a-zA-Z_0-9-]+]] = load i8, ptr %p
 ;
 ; This transformation should really happen only for stress mode.
-; OPT-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
-; OPT-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
-; OPT-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+; STRESS-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
+; STRESS-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
+; STRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+;
+; NONSTRESS-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
+; NONSTRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
+; NONSTRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = zext i32 [[RES32]] to i64
 ;
 ; DISABLE-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
 ; DISABLE-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
diff --git a/llvm/test/CodeGen/AArch64/a
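
The rollback condition changed in `tryToPromoteExts` can be read as a small predicate. The sketch below restates it in isolation. The function name, parameter names and parameter types are assumptions made for illustration; only the shape of the boolean condition is taken from the hunk above.

```cpp
#include <cstddef>

// Hypothetical stand-alone restatement of the condition in the diff.
// Returns true when the speculative promotion should be rolled back.
bool shouldRollBackPromotion(bool stressMode, long long totalCreatedInstsCost,
                             bool promotedInstIsLegal, unsigned extCost,
                             std::size_t numNewExts) {
  if (stressMode)
    return false; // stress mode keeps promoting regardless of cost
  return totalCreatedInstsCost > 1 || !promotedInstIsLegal ||
         // New clause: a free extension (cost 0) must not be replaced by
         // two or more new extensions.
         (extCost == 0 && numNewExts > 1);
}
```

For example, a free extension that would be replaced by two new extensions is now rejected outside stress mode, while replacing it with a single extension is still allowed.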

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread via cfe-commits



@@ -11802,6 +11802,27 @@ static bool CheckMultiVersionFunction(Sema &S, FunctionDecl *NewFD,
  OldDecl, Previous);
 }
 
+static void CheckFunctionDeclarationAttributesUsage(Sema &S,
+FunctionDecl *NewFD) {
+  bool IsPure = NewFD->hasAttr<PureAttr>();
+  bool IsConst = NewFD->hasAttr<ConstAttr>();
+
+  if (IsPure && IsConst) {
+S.Diag(NewFD->getLocation(), diag::warn_const_attr_with_pure_attr);
+NewFD->dropAttr<PureAttr>();
+  }
+  if (IsPure || IsConst) {

kelbon wrote:

An early return is good, but I think this may become a customization point 
where checks for other attributes/contracts are added in the future, so an 
early return would cause more changes later.
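
For readers unfamiliar with the two attributes under review, a minimal sketch follows. The function names and the global `g_scale` are illustrative and not from the PR. The declaration carrying both attributes is left commented out because it is the case the proposed warning targets, and existing compilers may already diagnose it.

```cpp
int g_scale = 3;

// 'pure': no side effects, but the result may depend on readable global state.
[[gnu::pure]] int scaled(int x) { return x * g_scale; }

// 'const': stricter; the result depends only on the argument values.
[[gnu::const]] int square(int x) { return x * x; }

// Both attributes on one declaration: 'const' already implies the
// restrictions of 'pure', so under the proposed change 'pure' is ignored.
// [[gnu::pure]] [[gnu::const]] int both(int x);
```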


https://github.com/llvm/llvm-project/pull/78200

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread via cfe-commits



@@ -692,6 +692,13 @@ def warn_maybe_falloff_nonvoid_function : Warning<
 def warn_falloff_nonvoid_function : Warning<
   "non-void function does not return a value">,
   InGroup;
+def warn_const_attr_with_pure_attr : Warning<
+  "'const' attribute imposes more restrictions, 'pure' attribute ignored">,
+  InGroup;

kelbon wrote:

But it is the same case: incorrect 'pure' usage.

https://github.com/llvm/llvm-project/pull/78200

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Piotr Sobczak via cfe-commits


piotrAMD wrote:

Discussed it some more internally and the agreement was to keep the "global" 
and have one intrinsic for both instructions. Just updated the PR to reflect 
that - this effectively reverts the previous update.

https://github.com/llvm/llvm-project/pull/2

[clang-tools-extra] [clangd] Don't collect templated decls for builtin templates (PR #78466)

2024-01-18 Thread Haojian Wu via cfe-commits


https://github.com/hokein approved this pull request.


https://github.com/llvm/llvm-project/pull/78466

[clang-tools-extra] [clangd] Don't collect templated decls for builtin templates (PR #78466)

2024-01-18 Thread Haojian Wu via cfe-commits


https://github.com/hokein edited https://github.com/llvm/llvm-project/pull/78466

[clang-tools-extra] [clangd] Don't collect templated decls for builtin templates (PR #78466)

2024-01-18 Thread Haojian Wu via cfe-commits



@@ -443,9 +443,15 @@ struct TargetFinder {
   Outer.add(TST->getAliasedType(), Flags | Rel::Underlying);
   // Don't *traverse* the alias, which would result in traversing the
   // template of the underlying type.
-  Outer.report(
-  TST->getTemplateName().getAsTemplateDecl()->getTemplatedDecl(),
-  Flags | Rel::Alias | Rel::TemplatePattern);
+
+  // Builtin templates e.g. __make_integer_seq, __type_pack_element
+  // are such that they don't have alias *decls*. Even then, we still
+  // traverse their desugared *types* so that instantiated decls are
+  // collected.
+  if (NamedDecl *D = TST->getTemplateName()

hokein wrote:

nit: maybe write the code like below, being explicit about the builtin template 
cases.
```
const TemplateDecl *TD = TST->getTemplateName().getAsTemplateDecl();
if (isa<BuiltinTemplateDecl>(TD))
return;
...
```
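
As background on why builtin templates matter here: some standard libraries implement `std::make_integer_sequence` on top of `__make_integer_seq` (an assumption about library internals, not something this thread states). The sketch below uses only the standard interface and shows that such an alias still desugars to an ordinary instantiated type. `packSize` is an illustrative helper.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Counts the indices carried by an index_sequence.
template <std::size_t... Is>
constexpr std::size_t packSize(std::index_sequence<Is...>) {
  return sizeof...(Is);
}

// Whatever mechanism produces it, the alias names a concrete instantiation.
static_assert(std::is_same<std::make_index_sequence<3>,
                           std::index_sequence<0, 1, 2>>::value,
              "alias resolves to the instantiated sequence type");
```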

https://github.com/llvm/llvm-project/pull/78466

[llvm] [clang] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Nikita Popov via cfe-commits


https://github.com/nikic requested changes to this pull request.

Using anything but the size and alignment of the alloca type in a way that 
affects program semantics is illegal.

https://github.com/llvm/llvm-project/pull/78526

[clang] [llvm] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Changpeng Fang via cfe-commits


https://github.com/changpeng approved this pull request.


https://github.com/llvm/llvm-project/pull/2

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread Chuanqi Xu via cfe-commits



@@ -692,6 +692,13 @@ def warn_maybe_falloff_nonvoid_function : Warning<
 def warn_falloff_nonvoid_function : Warning<
   "non-void function does not return a value">,
   InGroup;
+def warn_const_attr_with_pure_attr : Warning<
+  "'const' attribute imposes more restrictions, 'pure' attribute ignored">,
+  InGroup;

ChuanqiXu9 wrote:

While this is not required, **incorrect** is too broad. E.g., a lot of Clang's 
changes can be summarized as fixing an incorrect implementation of C++.

https://github.com/llvm/llvm-project/pull/78200

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread Chuanqi Xu via cfe-commits



@@ -11802,6 +11802,27 @@ static bool CheckMultiVersionFunction(Sema &S, FunctionDecl *NewFD,
  OldDecl, Previous);
 }
 
+static void CheckFunctionDeclarationAttributesUsage(Sema &S,
+FunctionDecl *NewFD) {
+  bool IsPure = NewFD->hasAttr<PureAttr>();
+  bool IsConst = NewFD->hasAttr<ConstAttr>();
+
+  if (IsPure && IsConst) {
+S.Diag(NewFD->getLocation(), diag::warn_const_attr_with_pure_attr);
+NewFD->dropAttr<PureAttr>();
+  }
+  if (IsPure || IsConst) {

ChuanqiXu9 wrote:

Then we should handle different attributes in different places.

https://github.com/llvm/llvm-project/pull/78200

[clang] 085eae6 - [C++20] [Modules] Allow to merge enums with the same underlying interger types

2024-01-18 Thread Chuanqi Xu via cfe-commits


Author: Chuanqi Xu
Date: 2024-01-18T17:09:35+08:00
New Revision: 085eae6b863881fb9fda323e5b672b04a00ed19e

URL: 
https://github.com/llvm/llvm-project/commit/085eae6b863881fb9fda323e5b672b04a00ed19e
DIFF: 
https://github.com/llvm/llvm-project/commit/085eae6b863881fb9fda323e5b672b04a00ed19e.diff

LOG: [C++20] [Modules] Allow to merge enums with the same underlying interger 
types

Close https://github.com/llvm/llvm-project/issues/76638. See the issue
for the context of the change.
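
The effect of hashing the canonical type can be illustrated without modules. In the sketch below, the alias and enum names are made up for illustration. Two enums spell their underlying type through different aliases but share one canonical type, while a third has a genuinely different underlying type.

```cpp
#include <type_traits>

using size_alias_a = unsigned int;
namespace ns {
using size_alias_b = ::size_alias_a; // different spelling, same type
}

enum class AlignA : size_alias_a {};
enum class AlignB : ns::size_alias_b {};
enum class AlignSigned : int {};

// True when both enums have the same underlying type after aliases are
// resolved, which is what comparing canonical types amounts to.
template <typename E1, typename E2>
constexpr bool sameUnderlyingType() {
  return std::is_same<typename std::underlying_type<E1>::type,
                      typename std::underlying_type<E2>::type>::value;
}
```

`AlignA` and `AlignB` correspond to the mergeable case in this change. `AlignA` versus `AlignSigned` corresponds to the case that should still be reported as a mismatch.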

Added: 
clang/test/Modules/pr76638.cppm

Modified: 
clang/lib/AST/ODRHash.cpp

Removed: 




diff  --git a/clang/lib/AST/ODRHash.cpp b/clang/lib/AST/ODRHash.cpp
index aea1a93ae1fa828..07677d655c5afd6 100644
--- a/clang/lib/AST/ODRHash.cpp
+++ b/clang/lib/AST/ODRHash.cpp
@@ -741,8 +741,55 @@ void ODRHash::AddEnumDecl(const EnumDecl *Enum) {
   if (Enum->isScoped())
 AddBoolean(Enum->isScopedUsingClassTag());
 
-  if (Enum->getIntegerTypeSourceInfo())
-AddQualType(Enum->getIntegerType());
+  if (Enum->getIntegerTypeSourceInfo()) {
+// FIMXE: This allows two enums with different spellings to have the same
+// hash.
+//
+//  // mod1.cppm
+//  module;
+//  extern "C" {
+//  typedef unsigned __int64 size_t;
+//  }
+//  namespace std {
+//  using :: size_t;
+//  }
+//
+//  extern "C++" {
+//  namespace std {
+//  enum class align_val_t : std::size_t {};
+//  }
+//  }
+//
+//  export module mod1;
+//  export using std::align_val_t;
+//
+//  // mod2.cppm
+//  module;
+//  extern "C" {
+//  typedef unsigned __int64 size_t;
+//  }
+//
+//  extern "C++" {
+//  namespace std {
+//  enum class align_val_t : size_t {};
+//  }
+//  }
+//
+//  export module mod2;
+//  import mod1;
+//  export using std::align_val_t;
+//
+// The above example should be disallowed since it violates
+// [basic.def.odr]p14:
+//
+//Each such definition shall consist of the same sequence of tokens
+//
+// The definitions of `std::align_val_t` in two module units have different
+// spellings but we failed to give an error here.
+//
+// See https://github.com/llvm/llvm-project/issues/76638 for details.
+AddQualType(Enum->getIntegerType().getCanonicalType());
+  }
 
   // Filter out sub-Decls which will not be processed in order to get an
   // accurate count of Decl's.

diff  --git a/clang/test/Modules/pr76638.cppm b/clang/test/Modules/pr76638.cppm
new file mode 100644
index 000..8cc807961421b7c
--- /dev/null
+++ b/clang/test/Modules/pr76638.cppm
@@ -0,0 +1,69 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/mod1.cppm -emit-module-interface -o %t/mod1.pcm
+// RUN: %clang_cc1 -std=c++20 %t/mod2.cppm -fmodule-file=mod1=%t/mod1.pcm \
+// RUN: -fsyntax-only -verify
+//
+// RUN: %clang_cc1 -std=c++20 %t/mod3.cppm -emit-module-interface -o %t/mod3.pcm
+// RUN: %clang_cc1 -std=c++20 %t/mod4.cppm -fmodule-file=mod3=%t/mod3.pcm \
+// RUN: -fsyntax-only -verify
+
+//--- size_t.h
+
+extern "C" {
+typedef unsigned int size_t;
+}
+
+//--- csize_t
+namespace std {
+using :: size_t;
+}
+
+//--- align.h
+namespace std {
+enum class align_val_t : size_t {};
+}
+
+//--- mod1.cppm
+module;
+#include "size_t.h"
+#include "align.h"
+export module mod1;
+export using std::align_val_t;
+
+//--- mod2.cppm
+// expected-no-diagnostics
+module;
+#include "size_t.h"
+#include "csize_t"
+#include "align.h"
+export module mod2;
+import mod1;
+export using std::align_val_t;
+
+//--- signed_size_t.h
+// Test that we can still find the case if the underlying type is different
+extern "C" {
+typedef signed int size_t;
+}
+
+//--- mod3.cppm
+module;
+#include "size_t.h"
+#include "align.h"
+export module mod3;
+export using std::align_val_t;
+
+//--- mod4.cppm
+module;
+#include "signed_size_t.h"
+#include "csize_t"
+#include "align.h"
+export module mod4;
+import mod3;
+export using std::align_val_t;
+
+// expected-error@align.h:* {{'std::align_val_t' has different definitions in different modules; defined here first difference is enum with specified type 'size_t' (aka 'int')}}
+// expected-note@align.h:* {{but in 'mod3.' found enum with specified type 'size_t' (aka 'unsigned int')}}




[clang] Multilib support for libraries with exceptions (PR #75031)

2024-01-18 Thread via cfe-commits


https://github.com/pwprzybyla updated https://github.com/llvm/llvm-project/pull/75031

>From 45db788f730d37cc12b54104dd91175553c367b2 Mon Sep 17 00:00:00 2001
From: Piotr Przybyla 
Date: Wed, 29 Nov 2023 14:05:00 +
Subject: [PATCH] Multilib support for libraries with exceptions

---
 clang/include/clang/Driver/ToolChain.h | 10 ++
 clang/lib/Driver/ToolChain.cpp | 20 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/ToolChain.h b/clang/include/clang/Driver/ToolChain.h
index 2d0c1f826c1728a..fbe2e8fe8e88d85 100644
--- a/clang/include/clang/Driver/ToolChain.h
+++ b/clang/include/clang/Driver/ToolChain.h
@@ -120,6 +120,11 @@ class ToolChain {
 RM_Disabled,
   };
 
+  enum ExceptionsMode {
+EM_Enabled,
+EM_Disabled,
+  };
+
   struct BitCodeLibraryInfo {
 std::string Path;
 bool ShouldInternalize;
@@ -141,6 +146,8 @@ class ToolChain {
 
   const RTTIMode CachedRTTIMode;
 
+  const ExceptionsMode CachedExceptionsMode;
+
   /// The list of toolchain specific path prefixes to search for libraries.
   path_list LibraryPaths;
 
@@ -318,6 +325,9 @@ class ToolChain {
   // Returns the RTTIMode for the toolchain with the current arguments.
   RTTIMode getRTTIMode() const { return CachedRTTIMode; }
 
+  // Returns the ExceptionsMode for the toolchain with the current arguments.
+  ExceptionsMode getExceptionsMode() const { return CachedExceptionsMode; }
+
   /// Return any implicit target and/or mode flag for an invocation of
   /// the compiler driver as `ProgName`.
   ///
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index ab19166f18c2dcf..f80a9e78a16b065 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -77,10 +77,19 @@ static ToolChain::RTTIMode CalculateRTTIMode(const ArgList &Args,
   return NoRTTI ? ToolChain::RM_Disabled : ToolChain::RM_Enabled;
 }
 
+static ToolChain::ExceptionsMode CalculateExceptionsMode(const ArgList &Args) {
+  if (Args.hasFlag(options::OPT_fexceptions, options::OPT_fno_exceptions,
+   true)) {
+return ToolChain::EM_Enabled;
+  }
+  return ToolChain::EM_Disabled;
+}
+
 ToolChain::ToolChain(const Driver &D, const llvm::Triple &T,
  const ArgList &Args)
 : D(D), Triple(T), Args(Args), CachedRTTIArg(GetRTTIArgument(Args)),
-  CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)) {
+  CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)),
+  CachedExceptionsMode(CalculateExceptionsMode(Args)) {
   auto addIfExists = [this](path_list &List, const std::string &Path) {
 if (getVFS().exists(Path))
   List.push_back(Path);
@@ -264,6 +273,15 @@ ToolChain::getMultilibFlags(const llvm::opt::ArgList &Args) const {
 break;
   }
 
+  // Include fno-exceptions and fno-rtti
+  // to improve multilib selection
+  if (getRTTIMode() == ToolChain::RTTIMode::RM_Disabled) {
+Result.push_back("-fno-rtti");
+  }
+  if (getExceptionsMode() == ToolChain::ExceptionsMode::EM_Disabled) {
+Result.push_back("-fno-exceptions");
+  }
+
   // Sort and remove duplicates.
   std::sort(Result.begin(), Result.end());
   Result.erase(std::unique(Result.begin(), Result.end()), Result.end());
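
The flag handling added here can be sketched without the driver classes. The helpers `lastFlagWins` and `multilibFlags` below are hypothetical. The RTTI handling is a deliberate simplification, since the real `CalculateRTTIMode` also takes the triple and other arguments into account. What the sketch does preserve is the "last of the pair wins, otherwise use the default" behaviour of `hasFlag` and the final sort-and-deduplicate step.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Last occurrence of either spelling decides; `def` applies if neither appears.
bool lastFlagWins(const std::vector<std::string> &args, const std::string &pos,
                  const std::string &neg, bool def) {
  bool result = def;
  for (const std::string &arg : args) {
    if (arg == pos)
      result = true;
    else if (arg == neg)
      result = false;
  }
  return result;
}

// Simplified: collects only the two negative flags discussed in the patch.
std::vector<std::string> multilibFlags(const std::vector<std::string> &args) {
  std::vector<std::string> result;
  if (!lastFlagWins(args, "-frtti", "-fno-rtti", true))
    result.push_back("-fno-rtti");
  if (!lastFlagWins(args, "-fexceptions", "-fno-exceptions", true))
    result.push_back("-fno-exceptions");
  // Sort and remove duplicates, as in the surrounding code.
  std::sort(result.begin(), result.end());
  result.erase(std::unique(result.begin(), result.end()), result.end());
  return result;
}
```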


[clang] [Clang][Sema] fix crash of attribute transform (PR #78088)

2024-01-18 Thread Qizhi Hu via cfe-commits


https://github.com/jcsxky updated https://github.com/llvm/llvm-project/pull/78088

>From 7632f631ec949f91d2ecec2b6f16951f9288920f Mon Sep 17 00:00:00 2001
From: huqizhi 
Date: Sun, 14 Jan 2024 15:07:26 +0800
Subject: [PATCH] [Clang][Sema] fix crash of attribute transform

---
 clang/include/clang/AST/TypeLoc.h   |  4 
 clang/lib/Sema/TreeTransform.h  | 15 ++-
 clang/test/Sema/attr-lifetimebound-no-crash.cpp | 17 +
 3 files changed, 31 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/Sema/attr-lifetimebound-no-crash.cpp

diff --git a/clang/include/clang/AST/TypeLoc.h b/clang/include/clang/AST/TypeLoc.h
index 471deb14aba51f..04780fdeae3bc1 100644
--- a/clang/include/clang/AST/TypeLoc.h
+++ b/clang/include/clang/AST/TypeLoc.h
@@ -884,6 +884,10 @@ class AttributedTypeLoc : public 
ConcreteTypeLocgetEquivalentType(), getNonLocalData());
+  }
+
   /// The type attribute.
   const Attr *getAttr() const {
 return getLocalData()->TypeAttr;
diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index 1a1bc87d2b3203..1c3ce88311fb81 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -6124,7 +6124,11 @@ QualType TreeTransform<Derived>::TransformFunctionProtoType(
   //   "pointer to cv-qualifier-seq X" between the optional cv-qualifer-seq
   //   and the end of the function-definition, member-declarator, or
   //   declarator.
-  Sema::CXXThisScopeRAII ThisScope(SemaRef, ThisContext, ThisTypeQuals);
+  auto *RD =
+  dyn_cast_or_null<CXXRecordDecl>(SemaRef.getCurLexicalContext());
+  Sema::CXXThisScopeRAII ThisScope(
+  SemaRef, ThisContext == nullptr && nullptr != RD ? RD : ThisContext,
+  ThisTypeQuals);
 
   ResultType = getDerived().TransformType(TLB, TL.getReturnLoc());
   if (ResultType.isNull())
@@ -7081,10 +7085,10 @@ QualType TreeTransform<Derived>::TransformAttributedType(
   // FIXME: dependent operand expressions?
   if (getDerived().AlwaysRebuild() ||
   modifiedType != oldType->getModifiedType()) {
-// TODO: this is really lame; we should really be rebuilding the
-// equivalent type from first principles.
-QualType equivalentType
-  = getDerived().TransformType(oldType->getEquivalentType());
+TypeLocBuilder AuxiliaryTLB;
+AuxiliaryTLB.reserve(TL.getFullDataSize());
+QualType equivalentType =
+getDerived().TransformType(AuxiliaryTLB, TL.getEquivalentTypeLoc());
 if (equivalentType.isNull())
   return QualType();
 
@@ -7103,6 +7107,7 @@ QualType TreeTransform<Derived>::TransformAttributedType(
 result = SemaRef.Context.getAttributedType(TL.getAttrKind(),
modifiedType,
equivalentType);
+TLB.TypeWasModifiedSafely(result);
   }
 
   AttributedTypeLoc newTL = TLB.push<AttributedTypeLoc>(result);
diff --git a/clang/test/Sema/attr-lifetimebound-no-crash.cpp b/clang/test/Sema/attr-lifetimebound-no-crash.cpp
new file mode 100644
index 00..e668a78790defd
--- /dev/null
+++ b/clang/test/Sema/attr-lifetimebound-no-crash.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 %s -verify -fsyntax-only
+
+// expected-no-diagnostics
+
+template
+struct Bar {
+int* data;
+
+auto operator[](const int index) const [[clang::lifetimebound]] -> decltype(data[index]) {
+return data[index];
+}
+};
+
+int main() {
+Bar b;
+(void)b[2];
+}


[llvm] [flang] [compiler-rt] [clang] [clang-tools-extra] [flang] use setsid to assign the child to prevent zombie as it will be clean up by init process (PR #77944)

2024-01-18 Thread Yi Wu via cfe-commits


https://github.com/yi-wu-arm updated https://github.com/llvm/llvm-project/pull/77944

>From b51f293d57a1ae96fab5d3b2a529186a78643c8c Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 16:44:21 +
Subject: [PATCH 1/3] use setsid to assign the child to prevent zombie as it
 will be clean up by init process

---
 flang/runtime/execute.cpp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index 48773ae8114b0b7..1bd5bb81ec84618 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -180,8 +180,6 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, bool wait,
 }
 FreeMemory((void *)wcmd);
 #else
-// terminated children do not become zombies
-signal(SIGCHLD, SIG_IGN);
 pid_t pid{fork()};
 if (pid < 0) {
   if (!cmdstat) {
@@ -191,6 +189,18 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, bool wait,
 CheckAndCopyCharsToDescriptor(cmdmsg, "Fork failed");
   }
 } else if (pid == 0) {
+  if (setsid() == -1) {
+if (!cmdstat) {
+  terminator.Crash(
+  "setsid() failed with errno: %d, asynchronous process initiation failed.",
+  errno);
+} else {
+  StoreIntToDescriptor(cmdstat, ASYNC_NO_SUPPORT_ERR, terminator);
+  CheckAndCopyCharsToDescriptor(
+  cmdmsg, "setsid() failed, asynchronous process initiation failed.");
+}
+exit(EXIT_FAILURE);
+  }
   int status{std::system(newCmd)};
   TerminationCheck(status, cmdstat, cmdmsg, terminator);
   exit(status);

>From 9682eb49bcd77e70439accd2eaa4524fea5cdfe5 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 16:58:29 +
Subject: [PATCH 2/3] clang-format

---
 flang/runtime/execute.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index 1bd5bb81ec84618..d149b5d47ef7542 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -191,13 +191,13 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, bool wait,
 } else if (pid == 0) {
   if (setsid() == -1) {
 if (!cmdstat) {
-  terminator.Crash(
-  "setsid() failed with errno: %d, asynchronous process initiation failed.",
+  terminator.Crash("setsid() failed with errno: %d, asynchronous "
+   "process initiation failed.",
   errno);
 } else {
   StoreIntToDescriptor(cmdstat, ASYNC_NO_SUPPORT_ERR, terminator);
-  CheckAndCopyCharsToDescriptor(
-  cmdmsg, "setsid() failed, asynchronous process initiation failed.");
+  CheckAndCopyCharsToDescriptor(cmdmsg,
+  "setsid() failed, asynchronous process initiation failed.");
 }
 exit(EXIT_FAILURE);
   }

>From b8f4db41db6ceb10897f113243d4a0954d727dc7 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 17:01:48 +
Subject: [PATCH 3/3] add comment

---
 flang/runtime/execute.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index d149b5d47ef7542..f455cf8b0e88cac 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -189,6 +189,7 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, bool wait,
 CheckAndCopyCharsToDescriptor(cmdmsg, "Fork failed");
   }
 } else if (pid == 0) {
+  // Create a new session, let init process take care of zombie child
   if (setsid() == -1) {
 if (!cmdstat) {
   terminator.Crash("setsid() failed with errno: %d, asynchronous "
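
The call sequence these patches introduce (fork, then `setsid()` in the child, then `std::system`) can be sketched as a stand-alone POSIX helper. `spawnInNewSession` is a hypothetical name and this is not the Flang runtime code. The sketch shows only the ordering of the calls and makes no claim about how or when the child is reaped.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that detaches into its own session and runs `cmd`.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t spawnInNewSession(const char *cmd) {
  pid_t pid = fork();
  if (pid != 0)
    return pid; // parent, or fork failure (-1)

  // Child: become the leader of a new session before running the command.
  if (setsid() == -1)
    _exit(EXIT_FAILURE);
  int status = std::system(cmd);
  if (status == -1 || !WIFEXITED(status))
    _exit(EXIT_FAILURE);
  _exit(WEXITSTATUS(status));
}
```

A caller that wants the result can still `waitpid` on the returned pid. The asynchronous path in the patch does not wait.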


[llvm] [flang] [compiler-rt] [clang] [clang-tools-extra] [flang] use setsid to assign the child to prevent zombie as it will be clean up by init process (PR #77944)

2024-01-18 Thread Yi Wu via cfe-commits


yi-wu-arm wrote:

Sorry, let me rebase on main; patches have been uploaded that solve this 
problem.

The problem is described here: https://github.com/llvm/llvm-project/issues/77803 
In short, once an asynchronous `EXECUTE_COMMAND_LINE` has been called, all 
future `EXECUTE_COMMAND_LINE` calls will have a `cmdstat` of 2 (execution 
error) because `std::system` returns -1. This fails in the gfortran LLVM test 
suite.

A simple reproducer would be 
```fortran
program test()
call execute_command_line("echo hi", .false.)
call execute_command_line("echo hi")
end program test
```
console output
```
hi

fatal Fortran runtime error(/home/yiwu02/gitrepo/llvm-project/test_fortran_code/test.f90:3): Execution error with system status code: -1
hi

fatal Fortran runtime error(/home/yiwu02/gitrepo/llvm-project/test_fortran_code/test.f90:2): Execution error with system status code: -1
Aborted (core dumped)
```
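
For readers who want the shape of the fix in isolation, here is a minimal POSIX-only sketch; it is not the flang runtime code. It assumes the asynchronous path forks, calls `setsid()` in the child, and leaves the parent's `SIGCHLD` disposition untouched, as the patch in this PR does. `launchDetached`, `runAndWait` and `asyncThenSync` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: start `cmd` without waiting for it.
// Returns false only if fork() itself fails.
bool launchDetached(const char *cmd) {
  pid_t pid = fork();
  if (pid < 0)
    return false;
  if (pid == 0) {
    // Child: move into a new session before running the command.
    if (setsid() == -1)
      _exit(EXIT_FAILURE);
    int status = std::system(cmd);
    _exit(status == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
  }
  // Parent: returns immediately and never changes the SIGCHLD disposition.
  return true;
}

// Hypothetical helper: run `cmd` synchronously and return the raw status
// reported by std::system.
int runAndWait(const char *cmd) { return std::system(cmd); }

// Mirrors the reproducer: one asynchronous call, then one synchronous call.
void asyncThenSync() {
  bool launched = launchDetached("exit 0");
  assert(launched);
  int status = runAndWait("exit 0");
  assert(status == 0); // the synchronous call can still collect its status
}
```

Calling `asyncThenSync()` from `main` exercises both paths. With `signal(SIGCHLD, SIG_IGN)` in effect instead, the second call is where the -1 from `std::system` would be expected to surface on Linux, since the shell's exit status can no longer be collected.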


https://github.com/llvm/llvm-project/pull/77944

[libc] [llvm] [flang] [compiler-rt] [clang] [clang-tools-extra] Apply kind code check on exitstat and cmdstat (PR #78286)

2024-01-18 Thread Yi Wu via cfe-commits


https://github.com/yi-wu-arm updated 
https://github.com/llvm/llvm-project/pull/78286

>From d56eca56c8e4c64e649febc43e2c48b6e5146680 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Tue, 16 Jan 2024 14:08:00 +
Subject: [PATCH 1/6] change exitstat and cmsstat from AnyInt to DefaultInt

---
 flang/lib/Evaluate/intrinsics.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/flang/lib/Evaluate/intrinsics.cpp 
b/flang/lib/Evaluate/intrinsics.cpp
index da6d5970089884..0b9bdac88a78dc 100644
--- a/flang/lib/Evaluate/intrinsics.cpp
+++ b/flang/lib/Evaluate/intrinsics.cpp
@@ -1314,9 +1314,9 @@ static const IntrinsicInterface intrinsicSubroutine[]{
 {"execute_command_line",
 {{"command", DefaultChar, Rank::scalar},
 {"wait", AnyLogical, Rank::scalar, Optionality::optional},
-{"exitstat", AnyInt, Rank::scalar, Optionality::optional,
+{"exitstat", DefaultInt, Rank::scalar, Optionality::optional,
 common::Intent::InOut},
-{"cmdstat", AnyInt, Rank::scalar, Optionality::optional,
+{"cmdstat", DefaultInt, Rank::scalar, Optionality::optional,
 common::Intent::Out},
 {"cmdmsg", DefaultChar, Rank::scalar, Optionality::optional,
 common::Intent::InOut}},

>From 2741652cae00ca1a94ae7a3310af1f25308e8105 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Tue, 16 Jan 2024 16:54:42 +
Subject: [PATCH 2/6] add KindCode::greaterEqualToKind

Now execute_command_line will accept exitstat kind>=4, cmdstat kind >=2
---
 flang/lib/Evaluate/intrinsics.cpp | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/flang/lib/Evaluate/intrinsics.cpp 
b/flang/lib/Evaluate/intrinsics.cpp
index 0b9bdac88a78dc..947e31967bdf45 100644
--- a/flang/lib/Evaluate/intrinsics.cpp
+++ b/flang/lib/Evaluate/intrinsics.cpp
@@ -77,7 +77,7 @@ static constexpr CategorySet AnyType{IntrinsicType | 
DerivedType};
 
 ENUM_CLASS(KindCode, none, defaultIntegerKind,
 defaultRealKind, // is also the default COMPLEX kind
-doublePrecision, defaultCharKind, defaultLogicalKind,
+doublePrecision, defaultCharKind, defaultLogicalKind, 
greaterAndEqualToKind,
 any, // matches any kind value; each instance is independent
 // match any kind, but all "same" kinds must be equal. For characters, also
 // implies that lengths must be equal.
@@ -104,7 +104,8 @@ ENUM_CLASS(KindCode, none, defaultIntegerKind,
 struct TypePattern {
   CategorySet categorySet;
   KindCode kindCode{KindCode::none};
-  int exactKindValue{0}; // for KindCode::exactKind
+  int kindValue{
+  0}; // for KindCode::exactKind and KindCode::greaterAndEqualToKind
   llvm::raw_ostream &Dump(llvm::raw_ostream &) const;
 };
 
@@ -1314,10 +1315,12 @@ static const IntrinsicInterface intrinsicSubroutine[]{
 {"execute_command_line",
 {{"command", DefaultChar, Rank::scalar},
 {"wait", AnyLogical, Rank::scalar, Optionality::optional},
-{"exitstat", DefaultInt, Rank::scalar, Optionality::optional,
-common::Intent::InOut},
-{"cmdstat", DefaultInt, Rank::scalar, Optionality::optional,
-common::Intent::Out},
+{"exitstat",
+TypePattern{IntType, KindCode::greaterAndEqualToKind, 4},
+Rank::scalar, Optionality::optional, common::Intent::InOut},
+{"cmdstat",
+TypePattern{IntType, KindCode::greaterAndEqualToKind, 2},
+Rank::scalar, Optionality::optional, common::Intent::Out},
 {"cmdmsg", DefaultChar, Rank::scalar, Optionality::optional,
 common::Intent::InOut}},
 {}, Rank::elemental, IntrinsicClass::impureSubroutine},
@@ -1834,7 +1837,10 @@ std::optional IntrinsicInterface::Match(
   argOk = true;
   break;
 case KindCode::exactKind:
-  argOk = type->kind() == d.typePattern.exactKindValue;
+  argOk = type->kind() == d.typePattern.kindValue;
+  break;
+case KindCode::greaterAndEqualToKind:
+  argOk = type->kind() >= d.typePattern.kindValue;
   break;
 case KindCode::sameAtom:
   if (!sameArg) {
@@ -2177,8 +2183,9 @@ std::optional IntrinsicInterface::Match(
   resultType = DynamicType{
   GetBuiltinDerivedType(builtinsScope, "__builtin_team_type")};
   break;
+case KindCode::greaterAndEqualToKind:
 case KindCode::exactKind:
-  resultType = DynamicType{*category, result.exactKindValue};
+  resultType = DynamicType{*category, result.kindValue};
   break;
 case KindCode::typeless:
 case KindCode::any:
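
The matching rule added in PATCH 2/6 can be sketched outside of flang as follows. This is a simplified model, not the code in `intrinsics.cpp`: only the two kind codes touched by the patch are modeled, and `kindMatches`, `exitstatPattern` and `cmdstatPattern` are hypothetical names.

```cpp
#include <cassert>

// Reduced model of the two kind codes the patch deals with.
enum class KindCode { exactKind, greaterAndEqualToKind };

// Reduced model of TypePattern: the kind code plus the value it refers to.
struct TypePattern {
  KindCode kindCode;
  int kindValue; // used by both exactKind and greaterAndEqualToKind
};

// Returns true when an actual argument of kind `actualKind` satisfies the
// pattern, following the two switch cases shown in the diff.
bool kindMatches(const TypePattern &pattern, int actualKind) {
  switch (pattern.kindCode) {
  case KindCode::exactKind:
    return actualKind == pattern.kindValue;
  case KindCode::greaterAndEqualToKind:
    return actualKind >= pattern.kindValue;
  }
  return false;
}

// Patterns corresponding to the execute_command_line entries in the diff.
const TypePattern exitstatPattern{KindCode::greaterAndEqualToKind, 4};
const TypePattern cmdstatPattern{KindCode::greaterAndEqualToKind, 2};

void checkExamples() {
  assert(kindMatches(exitstatPattern, 4));  // INTEGER(4) exitstat accepted
  assert(!kindMatches(exitstatPattern, 2)); // INTEGER(2) exitstat rejected
  assert(kindMatches(cmdstatPattern, 2));   // INTEGER(2) cmdstat accepted
}
```

Under this model an `INTEGER(8)` argument satisfies both patterns, while an `INTEGER(1)` argument satisfies neither.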

>From cea484080cba83ad32abb5622048b1864e4a49dc Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Tue, 16 Jan 2024 17:25:12 +
Subject: [PATCH 3/6] doc fixes

---
 flang/docs/Intrinsics.md | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/flang/docs/Intrinsics.md b/flang/docs/Intrinsics.md
index 5ade2574032977..981

[clang] [clang-tools-extra] [clang][NFC] Refactor `CXXNewExpr::InitializationStyle` (re-land) (PR #71417)

2024-01-18 Thread Vlad Serebrennikov via cfe-commits

Endilll wrote:

> I'd qualify this as a regression, by looking at that the commit was supposed 
> to be an NFC.
> Could you please confirm @Endilll?

I'll leave it to @AaronBallman to decide whether this is a functional change, but 
I can confirm that the patch is working as intended: there is an implicit 
initialization in `const auto *p = new Derived;` because `Derived` is a 
class type.
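
A small standalone illustration of that point, independent of Clang's AST: a `new` expression with no written initializer still runs the default constructor when the allocated type is a class with one. The `Base` and `Derived` definitions below are assumptions made for the example; the thread does not show the real ones.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
};

// Assumed definition, for illustration only.
struct Derived : Base {
  int value;
  Derived() : value(42) {} // runs even when `new` has no initializer
};

// Allocates with no parentheses or braces after the type, as in the thread.
int valueAfterNewWithoutInitializer() {
  const auto *p = new Derived; // default-initialization calls Derived()
  int v = p->value;
  delete p;
  return v;
}

void checkImplicitInitialization() {
  assert(valueAfterNewWithoutInitializer() == 42);
}
```

By contrast, `new int` with no initializer leaves the value indeterminate, which is why the class-type case is the one that involves an implicit initialization.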

https://github.com/llvm/llvm-project/pull/71417

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread via cfe-commits


https://github.com/kelbon updated 
https://github.com/llvm/llvm-project/pull/78200

>From fb05243d0c0c3702b1615239a9337df337ad0c7c Mon Sep 17 00:00:00 2001
From: Kelbon Nik 
Date: Mon, 15 Jan 2024 22:24:34 +0400
Subject: [PATCH 1/7] add warning and test

---
 clang/include/clang/Basic/DiagnosticGroups.td| 1 +
 clang/include/clang/Basic/DiagnosticSemaKinds.td | 7 +++
 clang/lib/Sema/SemaDecl.cpp  | 7 +++
 clang/test/Sema/incorrect_pure.cpp   | 7 +++
 4 files changed, 22 insertions(+)
 create mode 100644 clang/test/Sema/incorrect_pure.cpp

diff --git a/clang/include/clang/Basic/DiagnosticGroups.td 
b/clang/include/clang/Basic/DiagnosticGroups.td
index 6765721ae7002c..9fcf2be2e45458 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -414,6 +414,7 @@ def : DiagGroup<"c++2a-compat", [CXX20Compat]>;
 def : DiagGroup<"c++2a-compat-pedantic", [CXX20CompatPedantic]>;
 
 def ExitTimeDestructors : DiagGroup<"exit-time-destructors">;
+def IncorrectAttributeUsage : DiagGroup<"incorrect-attribute-usage">;
 def FlexibleArrayExtensions : DiagGroup<"flexible-array-extensions">;
 def FourByteMultiChar : DiagGroup<"four-char-constants">;
 def GlobalConstructors : DiagGroup<"global-constructors"> {
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td 
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 03b0122d1c08f7..1df075119a482f 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -692,6 +692,13 @@ def warn_maybe_falloff_nonvoid_function : Warning<
 def warn_falloff_nonvoid_function : Warning<
   "non-void function does not return a value">,
   InGroup;
+def warn_pure_attr_on_cxx_constructor : Warning<
+  "constructor cannot be 'pure' (undefined behavior)">,
+  InGroup;
+def warn_pure_function_returns_void : Warning<
+  "'pure' attribute on function returning 'void'">,
+  InGroup;
+
 def err_maybe_falloff_nonvoid_block : Error<
   "non-void block does not return a value in all control paths">;
 def err_falloff_nonvoid_block : Error<
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 5472b43aafd4f3..69ce7c50764fac 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -11898,6 +11898,13 @@ bool Sema::CheckFunctionDeclaration(Scope *S, 
FunctionDecl *NewFD,
 NewFD->setInvalidDecl();
   }
 
+  if (NewFD->hasAttr() || NewFD->hasAttr()) {
+if (isa(NewFD))
+  Diag(NewFD->getLocation(), diag::warn_pure_attr_on_cxx_constructor);
+else if (NewFD->getReturnType()->isVoidType())
+  Diag(NewFD->getLocation(), diag::warn_pure_function_returns_void);
+  }
+
   // C++11 [dcl.constexpr]p8:
   //   A constexpr specifier for a non-static member function that is not
   //   a constructor declares that member function to be const.
diff --git a/clang/test/Sema/incorrect_pure.cpp 
b/clang/test/Sema/incorrect_pure.cpp
new file mode 100644
index 00..ce02309f086386
--- /dev/null
+++ b/clang/test/Sema/incorrect_pure.cpp
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+[[gnu::pure]] void foo(); // expected-warning{{'pure' attribute on function 
returning 'void'}}
+
+struct A {
+[[gnu::pure]] A(); // expected-warning{{constructor cannot be 'pure' 
(undefined behavior)}}
+};
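
The decision made by the new block in `Sema::CheckFunctionDeclaration` can be modeled in a few lines. This is a sketch under assumptions, not the Sema code: the template arguments of `hasAttr` and `isa` are not visible in this archive copy, so the model assumes the check fires for functions carrying `pure` or `const` and that the first branch tests for a constructor. `FunctionFacts`, `PureDiag` and `classifyPureUsage` are hypothetical names.

```cpp
#include <cassert>

// Which of the two new warnings (if any) would be reported.
enum class PureDiag { None, OnConstructor, ReturnsVoid };

// Facts about a declaration that the check looks at (assumed, see lead-in).
struct FunctionFacts {
  bool hasPureOrConstAttr;
  bool isConstructor;
  bool returnsVoid;
};

// Same branch order as the diff: constructor first, then void return type.
PureDiag classifyPureUsage(const FunctionFacts &f) {
  if (!f.hasPureOrConstAttr)
    return PureDiag::None;
  if (f.isConstructor)
    return PureDiag::OnConstructor;
  if (f.returnsVoid)
    return PureDiag::ReturnsVoid;
  return PureDiag::None;
}

void checkTestFileCases() {
  // [[gnu::pure]] void foo();
  assert(classifyPureUsage({true, false, true}) == PureDiag::ReturnsVoid);
  // [[gnu::pure]] A();
  assert(classifyPureUsage({true, true, false}) == PureDiag::OnConstructor);
}
```

Because the constructor branch comes first, a declaration is reported at most once, even if the model is handed a constructor that is also flagged as returning `void`.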

>From e89f96c76d730a3cd9e1647837366a38f88118a3 Mon Sep 17 00:00:00 2001
From: Kelbon Nik 
Date: Tue, 16 Jan 2024 00:03:47 +0400
Subject: [PATCH 2/7] fix old incorrect test

---
 clang/test/Analysis/call-invalidation.cpp   | 8 
 clang/test/CodeGen/pragma-weak.c| 6 +++---
 clang/test/Interpreter/disambiguate-decl-stmt.cpp   | 4 ++--
 clang/test/SemaCXX/cxx0x-cursory-default-delete.cpp | 2 +-
 clang/test/SemaCXX/warn-unused-value-cxx11.cpp  | 2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/clang/test/Analysis/call-invalidation.cpp 
b/clang/test/Analysis/call-invalidation.cpp
index ef6505e19cf803..727217f228b054 100644
--- a/clang/test/Analysis/call-invalidation.cpp
+++ b/clang/test/Analysis/call-invalidation.cpp
@@ -90,8 +90,8 @@ void testConstReferenceStruct() {
 }
 
 
-void usePointerPure(int * const *) __attribute__((pure));
-void usePointerConst(int * const *) __attribute__((const));
+int usePointerPure(int * const *) __attribute__((pure));
+int usePointerConst(int * const *) __attribute__((const));
 
 void testPureConst() {
   extern int global;
@@ -104,11 +104,11 @@ void testPureConst() {
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eval(global == -5); // expected-warning{{TRUE}}
 
-  usePointerPure(&p);
+  (void)usePointerPure(&p);
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eval(global == -5); // expected-warning{{TRUE}}
 
-  usePointerConst(&p);
+  (void)usePointerConst(&p);
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eva

[flang] [clang-tools-extra] [compiler-rt] [llvm] [libc] [clang] [flang] use setsid to assign the child to prevent zombie as it will be clean up by init process (PR #77944)

2024-01-18 Thread Yi Wu via cfe-commits


https://github.com/yi-wu-arm updated 
https://github.com/llvm/llvm-project/pull/77944

>From b51f293d57a1ae96fab5d3b2a529186a78643c8c Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 16:44:21 +
Subject: [PATCH 1/3] use setsid to assign the child to prevent zombie as it
 will be clean up by init process

---
 flang/runtime/execute.cpp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index 48773ae8114b0b..1bd5bb81ec8461 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -180,8 +180,6 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, 
bool wait,
 }
 FreeMemory((void *)wcmd);
 #else
-// terminated children do not become zombies
-signal(SIGCHLD, SIG_IGN);
 pid_t pid{fork()};
 if (pid < 0) {
   if (!cmdstat) {
@@ -191,6 +189,18 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, 
bool wait,
 CheckAndCopyCharsToDescriptor(cmdmsg, "Fork failed");
   }
 } else if (pid == 0) {
+  if (setsid() == -1) {
+if (!cmdstat) {
+  terminator.Crash(
+  "setsid() failed with errno: %d, asynchronous process initiation 
failed.",
+  errno);
+} else {
+  StoreIntToDescriptor(cmdstat, ASYNC_NO_SUPPORT_ERR, terminator);
+  CheckAndCopyCharsToDescriptor(
+  cmdmsg, "setsid() failed, asynchronous process initiation 
failed.");
+}
+exit(EXIT_FAILURE);
+  }
   int status{std::system(newCmd)};
   TerminationCheck(status, cmdstat, cmdmsg, terminator);
   exit(status);

>From 9682eb49bcd77e70439accd2eaa4524fea5cdfe5 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 16:58:29 +
Subject: [PATCH 2/3] clang-format

---
 flang/runtime/execute.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index 1bd5bb81ec8461..d149b5d47ef754 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -191,13 +191,13 @@ void RTNAME(ExecuteCommandLine)(const Descriptor 
&command, bool wait,
 } else if (pid == 0) {
   if (setsid() == -1) {
 if (!cmdstat) {
-  terminator.Crash(
-  "setsid() failed with errno: %d, asynchronous process initiation 
failed.",
+  terminator.Crash("setsid() failed with errno: %d, asynchronous "
+   "process initiation failed.",
   errno);
 } else {
   StoreIntToDescriptor(cmdstat, ASYNC_NO_SUPPORT_ERR, terminator);
-  CheckAndCopyCharsToDescriptor(
-  cmdmsg, "setsid() failed, asynchronous process initiation 
failed.");
+  CheckAndCopyCharsToDescriptor(cmdmsg,
+  "setsid() failed, asynchronous process initiation failed.");
 }
 exit(EXIT_FAILURE);
   }

>From b8f4db41db6ceb10897f113243d4a0954d727dc7 Mon Sep 17 00:00:00 2001
From: Yi Wu 
Date: Fri, 12 Jan 2024 17:01:48 +
Subject: [PATCH 3/3] add comment

---
 flang/runtime/execute.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/runtime/execute.cpp b/flang/runtime/execute.cpp
index d149b5d47ef754..f455cf8b0e88ca 100644
--- a/flang/runtime/execute.cpp
+++ b/flang/runtime/execute.cpp
@@ -189,6 +189,7 @@ void RTNAME(ExecuteCommandLine)(const Descriptor &command, 
bool wait,
 CheckAndCopyCharsToDescriptor(cmdmsg, "Fork failed");
   }
 } else if (pid == 0) {
+  // Create a new session, let init process take care of zombie child
   if (setsid() == -1) {
 if (!cmdstat) {
   terminator.Crash("setsid() failed with errno: %d, asynchronous "


[clang] [clang-tools-extra] [clang][NFC] Refactor `CXXNewExpr::InitializationStyle` (re-land) (PR #71417)

2024-01-18 Thread via cfe-commits


cor3ntin wrote:

@steakhal Thanks for raising this. 

I agree this stretches the definition of an NFC commit.
But it was duly reviewed and approved: 
https://github.com/llvm/llvm-project/pull/71322 

We usually do not make any guarantees as to the stability of the C++ interfaces, 
so, as this test did not exist when this change was committed, I am not sure 
anything was actually broken.

Please let us know if this change is disruptive to you though, thanks!





https://github.com/llvm/llvm-project/pull/71417

[clang] Warning for incorrect useof 'pure' attribute (PR #78200)

2024-01-18 Thread via cfe-commits


https://github.com/kelbon updated 
https://github.com/llvm/llvm-project/pull/78200

>From fb05243d0c0c3702b1615239a9337df337ad0c7c Mon Sep 17 00:00:00 2001
From: Kelbon Nik 
Date: Mon, 15 Jan 2024 22:24:34 +0400
Subject: [PATCH 1/8] add warning and test

---
 clang/include/clang/Basic/DiagnosticGroups.td| 1 +
 clang/include/clang/Basic/DiagnosticSemaKinds.td | 7 +++
 clang/lib/Sema/SemaDecl.cpp  | 7 +++
 clang/test/Sema/incorrect_pure.cpp   | 7 +++
 4 files changed, 22 insertions(+)
 create mode 100644 clang/test/Sema/incorrect_pure.cpp

diff --git a/clang/include/clang/Basic/DiagnosticGroups.td 
b/clang/include/clang/Basic/DiagnosticGroups.td
index 6765721ae7002c..9fcf2be2e45458 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -414,6 +414,7 @@ def : DiagGroup<"c++2a-compat", [CXX20Compat]>;
 def : DiagGroup<"c++2a-compat-pedantic", [CXX20CompatPedantic]>;
 
 def ExitTimeDestructors : DiagGroup<"exit-time-destructors">;
+def IncorrectAttributeUsage : DiagGroup<"incorrect-attribute-usage">;
 def FlexibleArrayExtensions : DiagGroup<"flexible-array-extensions">;
 def FourByteMultiChar : DiagGroup<"four-char-constants">;
 def GlobalConstructors : DiagGroup<"global-constructors"> {
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td 
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 03b0122d1c08f7..1df075119a482f 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -692,6 +692,13 @@ def warn_maybe_falloff_nonvoid_function : Warning<
 def warn_falloff_nonvoid_function : Warning<
   "non-void function does not return a value">,
   InGroup;
+def warn_pure_attr_on_cxx_constructor : Warning<
+  "constructor cannot be 'pure' (undefined behavior)">,
+  InGroup;
+def warn_pure_function_returns_void : Warning<
+  "'pure' attribute on function returning 'void'">,
+  InGroup;
+
 def err_maybe_falloff_nonvoid_block : Error<
   "non-void block does not return a value in all control paths">;
 def err_falloff_nonvoid_block : Error<
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 5472b43aafd4f3..69ce7c50764fac 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -11898,6 +11898,13 @@ bool Sema::CheckFunctionDeclaration(Scope *S, 
FunctionDecl *NewFD,
 NewFD->setInvalidDecl();
   }
 
+  if (NewFD->hasAttr() || NewFD->hasAttr()) {
+if (isa(NewFD))
+  Diag(NewFD->getLocation(), diag::warn_pure_attr_on_cxx_constructor);
+else if (NewFD->getReturnType()->isVoidType())
+  Diag(NewFD->getLocation(), diag::warn_pure_function_returns_void);
+  }
+
   // C++11 [dcl.constexpr]p8:
   //   A constexpr specifier for a non-static member function that is not
   //   a constructor declares that member function to be const.
diff --git a/clang/test/Sema/incorrect_pure.cpp 
b/clang/test/Sema/incorrect_pure.cpp
new file mode 100644
index 00..ce02309f086386
--- /dev/null
+++ b/clang/test/Sema/incorrect_pure.cpp
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s
+
+[[gnu::pure]] void foo(); // expected-warning{{'pure' attribute on function 
returning 'void'}}
+
+struct A {
+[[gnu::pure]] A(); // expected-warning{{constructor cannot be 'pure' 
(undefined behavior)}}
+};

>From e89f96c76d730a3cd9e1647837366a38f88118a3 Mon Sep 17 00:00:00 2001
From: Kelbon Nik 
Date: Tue, 16 Jan 2024 00:03:47 +0400
Subject: [PATCH 2/8] fix old incorrect test

---
 clang/test/Analysis/call-invalidation.cpp   | 8 
 clang/test/CodeGen/pragma-weak.c| 6 +++---
 clang/test/Interpreter/disambiguate-decl-stmt.cpp   | 4 ++--
 clang/test/SemaCXX/cxx0x-cursory-default-delete.cpp | 2 +-
 clang/test/SemaCXX/warn-unused-value-cxx11.cpp  | 2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/clang/test/Analysis/call-invalidation.cpp 
b/clang/test/Analysis/call-invalidation.cpp
index ef6505e19cf803..727217f228b054 100644
--- a/clang/test/Analysis/call-invalidation.cpp
+++ b/clang/test/Analysis/call-invalidation.cpp
@@ -90,8 +90,8 @@ void testConstReferenceStruct() {
 }
 
 
-void usePointerPure(int * const *) __attribute__((pure));
-void usePointerConst(int * const *) __attribute__((const));
+int usePointerPure(int * const *) __attribute__((pure));
+int usePointerConst(int * const *) __attribute__((const));
 
 void testPureConst() {
   extern int global;
@@ -104,11 +104,11 @@ void testPureConst() {
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eval(global == -5); // expected-warning{{TRUE}}
 
-  usePointerPure(&p);
+  (void)usePointerPure(&p);
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eval(global == -5); // expected-warning{{TRUE}}
 
-  usePointerConst(&p);
+  (void)usePointerConst(&p);
   clang_analyzer_eval(x == 42); // expected-warning{{TRUE}}
   clang_analyzer_eva

[clang] [Clang][Sema] fix crash of attribute transform (PR #78088)

2024-01-18 Thread Qizhi Hu via cfe-commits



@@ -7081,10 +7085,10 @@ QualType 
TreeTransform::TransformAttributedType(
   // FIXME: dependent operand expressions?
   if (getDerived().AlwaysRebuild() ||
   modifiedType != oldType->getModifiedType()) {
-// TODO: this is really lame; we should really be rebuilding the
-// equivalent type from first principles.
-QualType equivalentType
-  = getDerived().TransformType(oldType->getEquivalentType());
+TypeLocBuilder AuxiliaryTLB;
+AuxiliaryTLB.reserve(TL.getFullDataSize());

jcsxky wrote:

Thanks for the reminder! Fixed.

https://github.com/llvm/llvm-project/pull/78088

[clang] [clang-tools-extra] [clang][NFC] Refactor `CXXNewExpr::InitializationStyle` (re-land) (PR #71417)

2024-01-18 Thread Vlad Serebrennikov via cfe-commits


Endilll wrote:

> I agree this stretches the definition of NFC commit.
> But it was duly reviewed and approved 
> https://github.com/llvm/llvm-project/pull/71322

I agree with this assessment. I think it really started as a regular NFC change, but 
then Aaron and I realized that we could get rid of some ugly code if we properly 
modeled implicit initialization.

https://github.com/llvm/llvm-project/pull/71417

[flang] [clang] [llvm] [lld] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -840,6 +845,12 @@ enum : unsigned {
   EF_AMDGPU_FEATURE_SRAMECC_OFF_V4 = 0x800,
   // SRAMECC is on.
   EF_AMDGPU_FEATURE_SRAMECC_ON_V4 = 0xc00,
+
+  // Generic target versioning. This is contained in the list byte of EFLAGS.

Pierre-vh wrote:

It's already part of #76954; I just haven't figured out how to stack PRs yet, so 
all changes from #76954 are here too :/

https://github.com/llvm/llvm-project/pull/76955

[flang] [clang] [llvm] [lld] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-18 Thread Pierre van Houtryve via cfe-commits


https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/76954

>From d56e752e3eed0fd75a7ff98638ec71635019fdb1 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH] [AMDGPU] Introduce Code Object V6

Introduce Code Object V6 in Clang, LLD, Flang and LLVM.
This is the same as V5 except a new "generic version" flag can be present in 
EFLAGS. This is related to new generic targets that'll be added in a follow-up 
patch. It's also likely V6 will have new changes (possibly new metadata 
entries) added later.

Docs changes are not included; I'm planning to do them in a follow-up patch all 
at once (when generic targets land too).
---
 clang/include/clang/Driver/Options.td |   4 +-
 clang/lib/CodeGen/CGBuiltin.cpp   |   6 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|   2 +-
 .../amdgpu-code-object-version-linking.cu |  37 +++
 .../CodeGenCUDA/amdgpu-code-object-version.cu |   4 +
 .../test/CodeGenCUDA/amdgpu-workgroup-size.cu |   4 +
 .../amdgcn/bitcode/oclc_abi_version_600.bc|   0
 clang/test/Driver/hip-code-object-version.hip |  12 +
 clang/test/Driver/hip-device-libs.hip |  18 +-
 flang/lib/Frontend/CompilerInvocation.cpp |   2 +
 flang/test/Lower/AMD/code-object-version.f90  |   3 +-
 lld/ELF/Arch/AMDGPU.cpp   |  21 ++
 lld/test/ELF/amdgpu-tid.s |  16 ++
 llvm/include/llvm/BinaryFormat/ELF.h  |   9 +-
 llvm/include/llvm/Support/AMDGPUMetadata.h|   5 +
 llvm/include/llvm/Support/ScopedPrinter.h |   4 +-
 llvm/include/llvm/Target/TargetOptions.h  |   1 +
 llvm/lib/ObjectYAML/ELFYAML.cpp   |   9 +
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   3 +
 .../AMDGPU/AMDGPUHSAMetadataStreamer.cpp  |  10 +
 .../Target/AMDGPU/AMDGPUHSAMetadataStreamer.h |  11 +-
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp |  27 +++
 .../MCTargetDesc/AMDGPUTargetStreamer.h   |   1 +
 .../Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp|  13 +
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |   5 +-
 ...licit-kernarg-backend-usage-global-isel.ll |   2 +
 .../AMDGPU/call-graph-register-usage.ll   |   1 +
 .../AMDGPU/codegen-internal-only-func.ll  |   2 +
 llvm/test/CodeGen/AMDGPU/elf-header-osabi.ll  |   4 +
 .../enable-scratch-only-dynamic-stack.ll  |   1 +
 .../AMDGPU/implicit-kernarg-backend-usage.ll  |   2 +
 .../AMDGPU/implicitarg-offset-attributes.ll   |  46 
 .../AMDGPU/llvm.amdgcn.implicitarg.ptr.ll |   1 +
 llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll  |   1 +
 llvm/test/CodeGen/AMDGPU/recursion.ll |   1 +
 .../AMDGPU/resource-usage-dead-function.ll|   1 +
 .../AMDGPU/tid-mul-func-xnack-all-any.ll  |   6 +
 .../tid-mul-func-xnack-all-not-supported.ll   |   6 +
 .../AMDGPU/tid-mul-func-xnack-all-off.ll  |   6 +
 .../AMDGPU/tid-mul-func-xnack-all-on.ll   |   6 +
 .../AMDGPU/tid-mul-func-xnack-any-off-1.ll|   6 +
 .../AMDGPU/tid-mul-func-xnack-any-off-2.ll|   6 +
 .../AMDGPU/tid-mul-func-xnack-any-on-1.ll |   6 +
 .../AMDGPU/tid-mul-func-xnack-any-on-2.ll |   6 +
 .../tid-one-func-xnack-not-supported.ll   |   6 +
 .../CodeGen/AMDGPU/tid-one-func-xnack-off.ll  |   6 +
 .../CodeGen/AMDGPU/tid-one-func-xnack-on.ll   |   6 +
 .../MC/AMDGPU/hsa-v5-uses-dynamic-stack.s |   5 +
 .../elf-headers.test} |   0
 .../ELF/AMDGPU/generic_versions.s |  16 ++
 .../ELF/AMDGPU/generic_versions.test  |  26 ++
 llvm/tools/llvm-readobj/ELFDumper.cpp | 224 --
 52 files changed, 491 insertions(+), 135 deletions(-)
 create mode 100644 
clang/test/Driver/Inputs/rocm/amdgcn/bitcode/oclc_abi_version_600.bc
 rename llvm/test/tools/llvm-readobj/ELF/{amdgpu-elf-headers.test => 
AMDGPU/elf-headers.test} (100%)
 create mode 100644 llvm/test/tools/llvm-readobj/ELF/AMDGPU/generic_versions.s
 create mode 100644 
llvm/test/tools/llvm-readobj/ELF/AMDGPU/generic_versions.test

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index e4fdad8265c8637..a6b96ea027056e3 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4763,9 +4763,9 @@ defm amdgpu_ieee : BoolOption<"m", "amdgpu-ieee",
 def mcode_object_version_EQ : Joined<["-"], "mcode-object-version=">, 
Group,
   HelpText<"Specify code object ABI version. Defaults to 4. (AMDGPU only)">,
   Visibility<[ClangOption, FlangOption, CC1Option, FC1Option]>,
-  Values<"none,4,5">,
+  Values<"none,4,5,6">,
   NormalizedValuesScope<"llvm::CodeObjectVersionKind">,
-  NormalizedValues<["COV_None", "COV_4", "COV_5"]>,
+  NormalizedValues<["COV_None", "COV_4", "COV_5", "COV_6"]>,
   MarshallingInfoEnum, "COV_4">;
 
 defm cumode : SimpleMFlag<"cumode",
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index f4246c5e8f68e8b..16dbb4bd835df53 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/Code

[clang] [Clang][SME] Add missing IsStreamingCompatible flag to svget, svcreate & svset (PR #78430)

2024-01-18 Thread Kerry McLaughlin via cfe-commits


https://github.com/kmclaughlin-arm closed 
https://github.com/llvm/llvm-project/pull/78430

[clang] [llvm] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sam Tebbs via cfe-commits


https://github.com/SamTebbs33 updated 
https://github.com/llvm/llvm-project/pull/77936

>From bbc6c11cd3def5acbb2ba2f2ddc45df2c399f9d6 Mon Sep 17 00:00:00 2001
From: Samuel Tebbs 
Date: Wed, 10 Jan 2024 14:57:04 +
Subject: [PATCH 1/6] [Clang][SME] Detect always_inline used with mismatched
 streaming attributes

This patch adds an error that is emitted when a streaming function is
marked as always_inline and is called from a non-streaming function.
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  2 ++
 clang/include/clang/Sema/Sema.h   |  9 +++
 clang/lib/CodeGen/CMakeLists.txt  |  1 +
 clang/lib/CodeGen/Targets/AArch64.cpp | 20 ++
 clang/lib/Sema/SemaChecking.cpp   | 27 +++
 ...-sme-func-attrs-inline-locally-streaming.c | 12 +
 .../aarch64-sme-func-attrs-inline-streaming.c | 12 +
 7 files changed, 66 insertions(+), 17 deletions(-)
 create mode 100644 
clang/test/CodeGen/aarch64-sme-func-attrs-inline-locally-streaming.c
 create mode 100644 clang/test/CodeGen/aarch64-sme-func-attrs-inline-streaming.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 568000106a84dc..dbd92b600a936e 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -279,6 +279,8 @@ def err_builtin_needs_feature : Error<"%0 needs target 
feature %1">;
 def err_function_needs_feature : Error<
   "always_inline function %1 requires target feature '%2', but would "
   "be inlined into function %0 that is compiled without support for '%2'">;
+def err_function_alwaysinline_attribute_mismatch : Error<
+  "always_inline function %1 and its caller %0 have mismatched %2 attributes">;
 
 def warn_avx_calling_convention
 : Warning<"AVX vector %select{return|argument}0 of type %1 without '%2' "
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 4c464a1ae4c67f..0fed60103c9a2c 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -13803,8 +13803,17 @@ class Sema final {
 FormatArgumentPassingKind ArgPassingKind;
   };
 
+enum ArmStreamingType {
+  ArmNonStreaming,
+  ArmStreaming,
+  ArmStreamingCompatible,
+  ArmStreamingOrSVE2p1
+};
+
+
   static bool getFormatStringInfo(const FormatAttr *Format, bool IsCXXMember,
   bool IsVariadic, FormatStringInfo *FSI);
+  static ArmStreamingType getArmStreamingFnType(const FunctionDecl *FD);
 
 private:
   void CheckArrayAccess(const Expr *BaseExpr, const Expr *IndexExpr,
diff --git a/clang/lib/CodeGen/CMakeLists.txt b/clang/lib/CodeGen/CMakeLists.txt
index 52216d93a302bb..03a6f2f1d7a9d2 100644
--- a/clang/lib/CodeGen/CMakeLists.txt
+++ b/clang/lib/CodeGen/CMakeLists.txt
@@ -151,4 +151,5 @@ add_clang_library(clangCodeGen
   clangFrontend
   clangLex
   clangSerialization
+  clangSema
   )
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp 
b/clang/lib/CodeGen/Targets/AArch64.cpp
index 7102d190fe008b..ea3d5a97605f1c 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -8,6 +8,8 @@
 
 #include "ABIInfoImpl.h"
 #include "TargetInfo.h"
+#include "clang/Basic/DiagnosticFrontend.h"
+#include "clang/Sema/Sema.h"
 
 using namespace clang;
 using namespace clang::CodeGen;
@@ -153,6 +155,11 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo {
 }
 return TargetCodeGenInfo::isScalarizableAsmOperand(CGF, Ty);
   }
+
+  void checkFunctionCallABI(CodeGenModule &CGM, SourceLocation CallLoc,
+const FunctionDecl *Caller,
+const FunctionDecl *Callee,
+const CallArgList &Args) const override;
 };
 
 class WindowsAArch64TargetCodeGenInfo : public AArch64TargetCodeGenInfo {
@@ -812,6 +819,19 @@ Address AArch64ABIInfo::EmitMSVAArg(CodeGenFunction &CGF, 
Address VAListAddr,
   /*allowHigherAlign*/ false);
 }
 
+void AArch64TargetCodeGenInfo::checkFunctionCallABI(
+CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller,
+const FunctionDecl *Callee, const CallArgList &Args) const {
+if (!Callee->hasAttr())
+  return;
+
+auto CalleeIsStreaming = Sema::getArmStreamingFnType(Callee) == 
Sema::ArmStreaming;
+auto CallerIsStreaming = Sema::getArmStreamingFnType(Caller) == 
Sema::ArmStreaming;
+
+if (CalleeIsStreaming && !CallerIsStreaming)
+CGM.getDiags().Report(CallLoc, 
diag::err_function_alwaysinline_attribute_mismatch) << Caller->getDeclName() << 
Callee->getDeclName() << "streaming";
+}
+
 std::unique_ptr
 CodeGen::createAArch64TargetCodeGenInfo(CodeGenModule &CGM,
 AArch64ABIKind Kind) {
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 74f8f626fb1637..160637dde448e4 100644
--- a/clang/lib/Sem

[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)

2024-01-18 Thread Timm Baeder via cfe-commits



tbaederr wrote:

> > > Here are the results for a quick implementation of check points: 
> > > http://llvm-compile-time-tracker.com/compare.php?from=12e425d0cf9bca072c7b2138e50acbc5f1cd818c&to=99f3a7853f9fa83bffe3b4d04e41e744169d426a&stat=instructions:u
> > > With a little more fiddling: 
> > > http://llvm-compile-time-tracker.com/compare.php?from=12e425d0cf9bca072c7b2138e50acbc5f1cd818c&to=0ee6dd17747818b05a1d504e4916ce46ef061226&stat=instructions:u
> > 
> > 
> > Imo this is reasonable, we should go in that direction. It's not free but 
> > it is predictable
> > Further improvements:
> > * cache the lookup of `CheckPoints[FID]` as it should not change between 
> > calls to `BeginSourceFile`
> > * Play with reserve and/or dequeue
> 
> Yeah, I like the direction it heads with checkpoints. The implementation is 
> pretty straight-forward, the performance is pretty consistent (and doesn't 
> seem to add significant overhead). Just for comparison though, how does the 
> stress test from the re-lexing approach perform with checkpoints? I expect 
> we'll still see an increase in compile times but hopefully not "took more 
> than an hour" levels of increase.

Before any of the changes in this PR:
```
real 0m9.909s
user 0m0.934s
sys  0m8.945s
```

with highlighting:
```
real 1m3.245s
user 0m53.403s
sys  0m9.682s
```

with the changes, but no highlighting, i.e. only the preprocessor changes:
```
real 0m10.032s
user 0m0.937s
sys  0m9.064s
```



https://github.com/llvm/llvm-project/pull/66514

[clang] [flang] [lld] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -787,11 +788,15 @@ enum : unsigned {
   EF_AMDGPU_MACH_AMDGCN_GFX942= 0x04c,
   EF_AMDGPU_MACH_AMDGCN_RESERVED_0X4D = 0x04d,
   EF_AMDGPU_MACH_AMDGCN_GFX1201   = 0x04e,
+  EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC  = 0x04f,
+  EF_AMDGPU_MACH_AMDGCN_GFX10_1_GENERIC   = 0x050,

Pierre-vh wrote:

172dbdf9312a15b449954e43623afc28240f50dd

https://github.com/llvm/llvm-project/pull/76955

[llvm] [lld] [clang] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -787,11 +788,15 @@ enum : unsigned {
   EF_AMDGPU_MACH_AMDGCN_GFX942= 0x04c,
   EF_AMDGPU_MACH_AMDGCN_RESERVED_0X4D = 0x04d,
   EF_AMDGPU_MACH_AMDGCN_GFX1201   = 0x04e,
+  EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC  = 0x04f,

Pierre-vh wrote:

172dbdf9312a15b449954e43623afc28240f50dd

https://github.com/llvm/llvm-project/pull/76955

[clang] [flang] [lld] [llvm] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-18 Thread Pierre van Houtryve via cfe-commits


https://github.com/Pierre-vh updated 
https://github.com/llvm/llvm-project/pull/76954

>From 47d4f3ed4e27f2ce2b3b33c9b0ca4838b3011f22 Mon Sep 17 00:00:00 2001
From: pvanhout 
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH] [AMDGPU] Introduce Code Object V6

Introduce Code Object V6 in Clang, LLD, Flang and LLVM.
This is the same as V5 except a new "generic version" flag can be present in 
EFLAGS. This is related to new generic targets that'll be added in a follow-up 
patch. It's also likely V6 will have new changes (possibly new metadata 
entries) added later.

Docs changes are not included; I'm planning to do them in a follow-up patch all 
at once (when generic targets land too).
---
 clang/include/clang/Driver/Options.td |   4 +-
 clang/lib/CodeGen/CGBuiltin.cpp   |   6 +-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|   2 +-
 .../amdgpu-code-object-version-linking.cu |  37 +++
 .../CodeGenCUDA/amdgpu-code-object-version.cu |   4 +
 .../test/CodeGenCUDA/amdgpu-workgroup-size.cu |   4 +
 .../amdgcn/bitcode/oclc_abi_version_600.bc|   0
 clang/test/Driver/hip-code-object-version.hip |  12 +
 clang/test/Driver/hip-device-libs.hip |  18 +-
 flang/lib/Frontend/CompilerInvocation.cpp |   2 +
 flang/test/Lower/AMD/code-object-version.f90  |   3 +-
 lld/ELF/Arch/AMDGPU.cpp   |  21 ++
 lld/test/ELF/amdgpu-tid.s |  16 ++
 llvm/include/llvm/BinaryFormat/ELF.h  |   9 +-
 llvm/include/llvm/Support/AMDGPUMetadata.h|   5 +
 llvm/include/llvm/Support/ScopedPrinter.h |   4 +-
 llvm/include/llvm/Target/TargetOptions.h  |   1 +
 llvm/lib/ObjectYAML/ELFYAML.cpp   |   9 +
 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp   |   3 +
 .../AMDGPU/AMDGPUHSAMetadataStreamer.cpp  |  10 +
 .../Target/AMDGPU/AMDGPUHSAMetadataStreamer.h |  11 +-
 .../MCTargetDesc/AMDGPUTargetStreamer.cpp |  27 +++
 .../MCTargetDesc/AMDGPUTargetStreamer.h   |   1 +
 .../Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp|  13 +
 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h |   5 +-
 ...licit-kernarg-backend-usage-global-isel.ll |   2 +
 .../AMDGPU/call-graph-register-usage.ll   |   1 +
 .../AMDGPU/codegen-internal-only-func.ll  |   2 +
 llvm/test/CodeGen/AMDGPU/elf-header-osabi.ll  |   4 +
 .../enable-scratch-only-dynamic-stack.ll  |   1 +
 .../AMDGPU/implicit-kernarg-backend-usage.ll  |   2 +
 .../AMDGPU/implicitarg-offset-attributes.ll   |  46 
 .../AMDGPU/llvm.amdgcn.implicitarg.ptr.ll |   1 +
 llvm/test/CodeGen/AMDGPU/non-entry-alloca.ll  |   1 +
 llvm/test/CodeGen/AMDGPU/recursion.ll |   1 +
 .../AMDGPU/resource-usage-dead-function.ll|   1 +
 .../AMDGPU/tid-mul-func-xnack-all-any.ll  |   6 +
 .../tid-mul-func-xnack-all-not-supported.ll   |   6 +
 .../AMDGPU/tid-mul-func-xnack-all-off.ll  |   6 +
 .../AMDGPU/tid-mul-func-xnack-all-on.ll   |   6 +
 .../AMDGPU/tid-mul-func-xnack-any-off-1.ll|   6 +
 .../AMDGPU/tid-mul-func-xnack-any-off-2.ll|   6 +
 .../AMDGPU/tid-mul-func-xnack-any-on-1.ll |   6 +
 .../AMDGPU/tid-mul-func-xnack-any-on-2.ll |   6 +
 .../tid-one-func-xnack-not-supported.ll   |   6 +
 .../CodeGen/AMDGPU/tid-one-func-xnack-off.ll  |   6 +
 .../CodeGen/AMDGPU/tid-one-func-xnack-on.ll   |   6 +
 .../MC/AMDGPU/hsa-v5-uses-dynamic-stack.s |   5 +
 .../elf-headers.test} |   0
 .../ELF/AMDGPU/generic_versions.s |  16 ++
 .../ELF/AMDGPU/generic_versions.test  |  26 ++
 llvm/tools/llvm-readobj/ELFDumper.cpp | 224 --
 52 files changed, 491 insertions(+), 135 deletions(-)
 create mode 100644 
clang/test/Driver/Inputs/rocm/amdgcn/bitcode/oclc_abi_version_600.bc
 rename llvm/test/tools/llvm-readobj/ELF/{amdgpu-elf-headers.test => 
AMDGPU/elf-headers.test} (100%)
 create mode 100644 llvm/test/tools/llvm-readobj/ELF/AMDGPU/generic_versions.s
 create mode 100644 
llvm/test/tools/llvm-readobj/ELF/AMDGPU/generic_versions.test

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index e4fdad8265c863..a6b96ea027056e 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4763,9 +4763,9 @@ defm amdgpu_ieee : BoolOption<"m", "amdgpu-ieee",
 def mcode_object_version_EQ : Joined<["-"], "mcode-object-version=">, 
Group,
   HelpText<"Specify code object ABI version. Defaults to 4. (AMDGPU only)">,
   Visibility<[ClangOption, FlangOption, CC1Option, FC1Option]>,
-  Values<"none,4,5">,
+  Values<"none,4,5,6">,
   NormalizedValuesScope<"llvm::CodeObjectVersionKind">,
-  NormalizedValues<["COV_None", "COV_4", "COV_5"]>,
+  NormalizedValues<["COV_None", "COV_4", "COV_5", "COV_6"]>,
   MarshallingInfoEnum, "COV_4">;
 
 defm cumode : SimpleMFlag<"cumode",
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index f4246c5e8f68e8..16dbb4bd835df5 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/

[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Bill Wendling via cfe-commits


bwendling wrote:

> We've discussed this before, years ago. Previously, the sticking point was: 
> how is LLVM going to know what the frontend considers the closest surrounding 
> subobject to be? The LLVM type doesn't give you that information, and it 
> seems like there's nowhere else that LLVM can get it from either, so this 
> flag ends up not being useful and the best we can do is to give a 
> conservatively-correct answer -- the complete object size if we want an upper 
> bound, or 0 if we want a lower bound.

Right now we're giving a wrong answer (see below), so that's no good. If we're 
going to err on the side of caution, then we should return the default "can't 
calculate the size" value.

When you say that we can't detect what the front-end considers the "closest 
surrounding subobject" to be, is that mostly due to corner cases or is it a 
more general concern? (Note, we're really only interested in supporting this 
for C structs. C++ structs / classes would require therapy.)

> Clang does respect the "subobject" flag if it can symbolically evaluate the 
> operand of `__builtin_object_size` sufficiently to determine which subobject 
> is being referenced. Previously we've thought that that was the best we could 
> do.

This is why the current behavior is wrong, in my opinion. The motivating 
example is below:

```
struct suspend_stats {
    int success;
    int fail;
    int failed_freeze;
    int failed_prepare;
    int failed_suspend;
    int failed_suspend_late;
    int failed_suspend_noirq;
    int failed_resume;
    int failed_resume_early;
    int failed_resume_noirq;
#define REC_FAILED_NUM  2
    int last_failed_dev;
    char failed_devs[REC_FAILED_NUM][40]; /* offsetof(struct suspend_stats, failed_devs) == 44 */
    int last_failed_errno;
    int bar;
};

#define report(x) __builtin_printf(#x ": %zu\n", x)

int main(int argc, char *argv[])
{
    struct suspend_stats foo;

    report(sizeof(foo.failed_devs[1]));
    report(sizeof(foo.failed_devs[argc]));
    report(__builtin_dynamic_object_size(&foo.fail, 0));
    report(__builtin_dynamic_object_size(&foo.fail, 1));
    report(__builtin_dynamic_object_size(&foo.failed_freeze, 0));
    report(__builtin_dynamic_object_size(foo.failed_devs[1], 0));
    report(__builtin_dynamic_object_size(foo.failed_devs[1], 1));
    report(__builtin_dynamic_object_size(foo.failed_devs[argc], 0));
    report(__builtin_dynamic_object_size(foo.failed_devs[argc], 1));

    return 0;
}
```

The output with this change is now:

```
__builtin_dynamic_object_size(&foo.fail, 0): 128
__builtin_dynamic_object_size(&foo.fail, 1): 4
__builtin_dynamic_object_size(&foo.failed_freeze, 0): 124
__builtin_dynamic_object_size(foo.failed_devs[1], 0): 48
__builtin_dynamic_object_size(foo.failed_devs[1], 1): 40
__builtin_dynamic_object_size(foo.failed_devs[argc], 0): 48
__builtin_dynamic_object_size(foo.failed_devs[argc], 1): 40
```

Without the change, the last line is:

```
__builtin_dynamic_object_size(foo.failed_devs[argc], 1): 48
```

Which isn't correct according to GNU's documentation. So if we can't honor the 
TYPE bit, then we should return `-1 / 0` here, right?
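
For reference, the expected values follow from layout arithmetic alone. Here is a minimal sketch, assuming a 4-byte `int` and no padding in the struct; `expected_object_size` is a hypothetical helper (not a Clang or GCC API) that only models what the GNU documentation describes for bit 0 of TYPE:

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the struct in the example above.
struct suspend_stats {
  int success;
  int fail;
  int failed_freeze;
  int failed_prepare;
  int failed_suspend;
  int failed_suspend_late;
  int failed_suspend_noirq;
  int failed_resume;
  int failed_resume_early;
  int failed_resume_noirq;
  int last_failed_dev;
  char failed_devs[2][40];
  int last_failed_errno;
  int bar;
};

// Hypothetical helper: number of bytes remaining from `offset`.
//   bit 0 of `type` clear: up to the end of the whole object (`total` bytes).
//   bit 0 of `type` set:   up to the end of the closest surrounding subobject,
//                          which starts at `sub_offset` and spans `sub_size` bytes.
inline std::size_t expected_object_size(std::size_t total, std::size_t offset,
                                        std::size_t sub_offset,
                                        std::size_t sub_size, int type) {
  // The pointer must lie inside the subobject, which must lie inside the object.
  assert(sub_offset <= offset && offset <= sub_offset + sub_size &&
         sub_offset + sub_size <= total);
  return (type & 1) ? sub_offset + sub_size - offset : total - offset;
}
```

With `failed_devs` at offset 44, row 1 starts at offset 84, so TYPE 0 should yield the bytes up to the end of `foo` and TYPE 1 the 40 bytes of that row, whether the index is a constant or `argc`.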

https://github.com/llvm/llvm-project/pull/78526

[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Bill Wendling via cfe-commits


bwendling wrote:

> Using anything but the size and alignment of the alloca type in a way that 
> affects program semantics is illegal.

I'm sorry, but I don't understand your comment here. Could you supply some 
context?

https://github.com/llvm/llvm-project/pull/78526

[clang] [flang] [lld] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -787,11 +788,15 @@ enum : unsigned {
   EF_AMDGPU_MACH_AMDGCN_GFX942= 0x04c,
   EF_AMDGPU_MACH_AMDGCN_RESERVED_0X4D = 0x04d,
   EF_AMDGPU_MACH_AMDGCN_GFX1201   = 0x04e,
+  EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC  = 0x04f,
+  EF_AMDGPU_MACH_AMDGCN_GFX10_1_GENERIC   = 0x050,

Pierre-vh wrote:

Just noticed I forgot to update the AMDGPUUsage + the 
EF_AMDGPU_MACH_AMDGCN_LAST enum when adding the reserved entries. I'll do that 
here.

https://github.com/llvm/llvm-project/pull/76955

[lldb] [clang] [compiler-rt] [flang] [lld] [llvm] [libcxx] [libc] [clang-tools-extra] AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (PR #77892)

2024-01-18 Thread Mariusz Sikora via cfe-commits


https://github.com/mariusz-sikora-at-amd updated 
https://github.com/llvm/llvm-project/pull/77892

>From 628a3d2b42cdcbd903e0830ab7d631ea7dc422b9 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Wed, 10 Jan 2024 12:17:58 +0100
Subject: [PATCH 1/2] AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions

Encoding is VOP3P. Tagged as deep/machine learning instructions.
i32 type (v4fp8 or v4bf8 packed in i32) is used for src0 and src1.
src0 and src1 have no src_modifiers. src2 is f32 and has src_modifiers:
f32 fneg(neg_lo[2]) and f32 fabs(neg_hi[2]).
---
 clang/include/clang/Basic/BuiltinsAMDGPU.def  |   4 +
 .../builtins-amdgcn-dl-insts-err.cl   |   5 +
 .../builtins-amdgcn-dl-insts-gfx12.cl |  20 ++
 llvm/include/llvm/IR/IntrinsicsAMDGPU.td  |  19 ++
 .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp  |   4 +
 .../AMDGPU/AsmParser/AMDGPUAsmParser.cpp  |  46 
 .../AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp |  17 +-
 llvm/lib/Target/AMDGPU/VOP3PInstructions.td   |  47 
 llvm/lib/Target/AMDGPU/VOPInstructions.td |  13 +-
 .../CodeGen/AMDGPU/llvm.amdgcn.fdot4.f32.ll   | 255 ++
 llvm/test/MC/AMDGPU/gfx12_asm_vop3p.s | 120 +
 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_dpp16.s   |  24 ++
 .../MC/AMDGPU/gfx12_asm_vop3p_dpp16_err.s |  24 ++
 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_dpp8.s|  24 ++
 .../test/MC/AMDGPU/gfx12_asm_vop3p_dpp8_err.s |  27 ++
 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_err.s | 133 +
 .../Disassembler/AMDGPU/gfx12_dasm_vop3p.txt  | 120 +
 .../AMDGPU/gfx12_dasm_vop3p_dpp16.txt |  24 ++
 .../AMDGPU/gfx12_dasm_vop3p_dpp8.txt  |  24 ++
 19 files changed, 938 insertions(+), 12 deletions(-)
 create mode 100644 clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-gfx12.cl
 create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot4.f32.ll
 create mode 100644 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_dpp16_err.s
 create mode 100644 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_dpp8_err.s
 create mode 100644 llvm/test/MC/AMDGPU/gfx12_asm_vop3p_err.s

diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def 
b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index e562ef04a30194..1c1b9b2c9e9e8c 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -255,6 +255,10 @@ TARGET_BUILTIN(__builtin_amdgcn_sudot4, "iIbiIbiiIb", 
"nc", "dot8-insts")
 TARGET_BUILTIN(__builtin_amdgcn_sdot8, "SiSiSiSiIb", "nc", "dot1-insts")
 TARGET_BUILTIN(__builtin_amdgcn_udot8, "UiUiUiUiIb", "nc", "dot7-insts")
 TARGET_BUILTIN(__builtin_amdgcn_sudot8, "iIbiIbiiIb", "nc", "dot8-insts")
+TARGET_BUILTIN(__builtin_amdgcn_fdot4_f32_fp8_bf8, "fUiUif", "nc", 
"gfx12-insts")
+TARGET_BUILTIN(__builtin_amdgcn_fdot4_f32_bf8_fp8, "fUiUif", "nc", 
"gfx12-insts")
+TARGET_BUILTIN(__builtin_amdgcn_fdot4_f32_fp8_fp8, "fUiUif", "nc", 
"gfx12-insts")
+TARGET_BUILTIN(__builtin_amdgcn_fdot4_f32_bf8_bf8, "fUiUif", "nc", 
"gfx12-insts")
 
 
//===--===//
 // GFX10+ only builtins.
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-err.cl 
b/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-err.cl
index 6573325150d958..1be47f71276208 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-err.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-err.cl
@@ -49,4 +49,9 @@ kernel void builtins_amdgcn_dl_insts_err(
 
   iOut[3] = __builtin_amdgcn_sudot8(false, A, true, B, C, false);// 
expected-error {{'__builtin_amdgcn_sudot8' needs target feature dot8-insts}}
   iOut[4] = __builtin_amdgcn_sudot8(true, A, false, B, C, true); // 
expected-error {{'__builtin_amdgcn_sudot8' needs target feature dot8-insts}}
+
+  fOut[5] = __builtin_amdgcn_fdot4_f32_fp8_bf8(uiA, uiB, fC);// 
expected-error {{'__builtin_amdgcn_fdot4_f32_fp8_bf8' needs target feature 
gfx12-insts}}
+  fOut[6] = __builtin_amdgcn_fdot4_f32_bf8_fp8(uiA, uiB, fC);// 
expected-error {{'__builtin_amdgcn_fdot4_f32_bf8_fp8' needs target feature 
gfx12-insts}}
+  fOut[7] = __builtin_amdgcn_fdot4_f32_fp8_fp8(uiA, uiB, fC);// 
expected-error {{'__builtin_amdgcn_fdot4_f32_fp8_fp8' needs target feature 
gfx12-insts}}
+  fOut[8] = __builtin_amdgcn_fdot4_f32_bf8_bf8(uiA, uiB, fC);// 
expected-error {{'__builtin_amdgcn_fdot4_f32_bf8_bf8' needs target feature 
gfx12-insts}}
 }
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-gfx12.cl 
b/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-gfx12.cl
new file mode 100644
index 00..31e10c0a5dc18c
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-dl-insts-gfx12.cl
@@ -0,0 +1,20 @@
+// REQUIRES: amdgpu-registered-target
+
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1200 -S 
-emit-llvm -o - %s | FileCheck %s
+
+typedef unsigned int uint;
+
+// CHECK-LABEL: @builtins_amdgcn_dl_insts
+// CHECK: call float @llvm.amdgcn.fdot4.f32.fp8.bf8(i32 %uiA, i32 %uiB, float 
%fC)

[lldb] [clang] [compiler-rt] [flang] [lld] [llvm] [libcxx] [libc] [clang-tools-extra] AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (PR #77892)

2024-01-18 Thread Mariusz Sikora via cfe-commits


mariusz-sikora-at-amd wrote:

Rebase to run tests

https://github.com/llvm/llvm-project/pull/77892

[clang] [llvm] [clang-tools-extra] [AMDGPU] Work around s_getpc_b64 zero extending on GFX12 (PR #78186)

2024-01-18 Thread Jay Foad via cfe-commits


https://github.com/jayfoad updated 
https://github.com/llvm/llvm-project/pull/78186

>From d3f4ebf849f6ef1ea373e5c7f93398db6681b2b6 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Mon, 15 Jan 2024 15:02:08 +
Subject: [PATCH 1/4] Add GFX11/12 test coverage

---
 llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll | 103 +-
 1 file changed, 77 insertions(+), 26 deletions(-)

diff --git a/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll 
b/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
index 598d7a8033c2e54..2c1baeeeda21697 100644
--- a/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
+++ b/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
@@ -1,32 +1,83 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s
-
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX9
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX11
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX12
 
 define void @test_remat_s_getpc_b64() {
-; CHECK-LABEL: test_remat_s_getpc_b64:
-; CHECK:   ; %bb.0: ; %entry
-; CHECK-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_store_dword v0, off, s[0:3], s32 ; 4-byte Folded Spill
-; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:v_writelane_b32 v0, s30, 0
-; CHECK-NEXT:s_getpc_b64 s[4:5]
-; CHECK-NEXT:v_writelane_b32 v0, s31, 1
-; CHECK-NEXT:;;#ASMSTART
-; CHECK-NEXT:;;#ASMEND
-; CHECK-NEXT:;;#ASMSTART
-; CHECK-NEXT:;;#ASMEND
-; CHECK-NEXT:s_getpc_b64 s[4:5]
-; CHECK-NEXT:v_mov_b32_e32 v1, s4
-; CHECK-NEXT:v_mov_b32_e32 v2, s5
-; CHECK-NEXT:global_store_dwordx2 v[1:2], v[1:2], off
-; CHECK-NEXT:v_readlane_b32 s31, v0, 1
-; CHECK-NEXT:v_readlane_b32 s30, v0, 0
-; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_load_dword v0, off, s[0:3], s32 ; 4-byte Folded Reload
-; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:s_waitcnt vmcnt(0)
-; CHECK-NEXT:s_setpc_b64 s[30:31]
+; GFX9-LABEL: test_remat_s_getpc_b64:
+; GFX9:   ; %bb.0: ; %entry
+; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:s_xor_saveexec_b64 s[4:5], -1
+; GFX9-NEXT:buffer_store_dword v0, off, s[0:3], s32 ; 4-byte Folded Spill
+; GFX9-NEXT:s_mov_b64 exec, s[4:5]
+; GFX9-NEXT:v_writelane_b32 v0, s30, 0
+; GFX9-NEXT:s_getpc_b64 s[4:5]
+; GFX9-NEXT:v_writelane_b32 v0, s31, 1
+; GFX9-NEXT:;;#ASMSTART
+; GFX9-NEXT:;;#ASMEND
+; GFX9-NEXT:;;#ASMSTART
+; GFX9-NEXT:;;#ASMEND
+; GFX9-NEXT:s_getpc_b64 s[4:5]
+; GFX9-NEXT:v_mov_b32_e32 v1, s4
+; GFX9-NEXT:v_mov_b32_e32 v2, s5
+; GFX9-NEXT:global_store_dwordx2 v[1:2], v[1:2], off
+; GFX9-NEXT:v_readlane_b32 s31, v0, 1
+; GFX9-NEXT:v_readlane_b32 s30, v0, 0
+; GFX9-NEXT:s_xor_saveexec_b64 s[4:5], -1
+; GFX9-NEXT:buffer_load_dword v0, off, s[0:3], s32 ; 4-byte Folded Reload
+; GFX9-NEXT:s_mov_b64 exec, s[4:5]
+; GFX9-NEXT:s_waitcnt vmcnt(0)
+; GFX9-NEXT:s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: test_remat_s_getpc_b64:
+; GFX11:   ; %bb.0: ; %entry
+; GFX11-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX11-NEXT:scratch_store_b32 off, v0, s32 ; 4-byte Folded Spill
+; GFX11-NEXT:s_mov_b32 exec_lo, s0
+; GFX11-NEXT:v_writelane_b32 v0, s30, 0
+; GFX11-NEXT:s_getpc_b64 s[0:1]
+; GFX11-NEXT:;;#ASMSTART
+; GFX11-NEXT:;;#ASMEND
+; GFX11-NEXT:v_writelane_b32 v0, s31, 1
+; GFX11-NEXT:;;#ASMSTART
+; GFX11-NEXT:;;#ASMEND
+; GFX11-NEXT:s_getpc_b64 s[0:1]
+; GFX11-NEXT:s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | 
instid1(VALU_DEP_2)
+; GFX11-NEXT:v_dual_mov_b32 v2, s1 :: v_dual_mov_b32 v1, s0
+; GFX11-NEXT:v_readlane_b32 s31, v0, 1
+; GFX11-NEXT:v_readlane_b32 s30, v0, 0
+; GFX11-NEXT:global_store_b64 v[1:2], v[1:2], off
+; GFX11-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX11-NEXT:scratch_load_b32 v0, off, s32 ; 4-byte Folded Reload
+; GFX11-NEXT:s_mov_b32 exec_lo, s0
+; GFX11-NEXT:s_waitcnt vmcnt(0)
+; GFX11-NEXT:s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: test_remat_s_getpc_b64:
+; GFX12:   ; %bb.0: ; %entry
+; GFX12-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX12-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX12-NEXT:scratch_store_b32 off, v0, s32 ; 4-byte Folded Spill
+; GFX12-NEXT:s_mov_b32 exec_lo, s0
+; GFX12-NEXT:v_writelane_b32 v0, s30, 0
+; GFX12-NEXT:s_getpc_b64 s[0:1]
+; GFX12-NEXT:;;#ASMSTART
+; GFX12-NEXT:;;#ASMEND
+; GFX12-NEXT:v_writelane_b32 v0, s31, 1
+; GFX12-NEXT:;;#ASMSTART
+; GFX12-

[clang] 4c65787 - [AMDGPU] Add GFX12 __builtin_amdgcn_s_sleep_var (#77926)

2024-01-18 Thread via cfe-commits


Author: Jay Foad
Date: 2024-01-18T10:14:01Z
New Revision: 4c65787f1e45199713f71f63817651ff2decd96c

URL: 
https://github.com/llvm/llvm-project/commit/4c65787f1e45199713f71f63817651ff2decd96c
DIFF: 
https://github.com/llvm/llvm-project/commit/4c65787f1e45199713f71f63817651ff2decd96c.diff

LOG: [AMDGPU] Add GFX12 __builtin_amdgcn_s_sleep_var (#77926)

Added: 
clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-param-err.cl

Modified: 
clang/include/clang/Basic/BuiltinsAMDGPU.def
clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-err.cl
clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl

Removed: 




diff  --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def 
b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index f02b4d321328fe..f80d182ec08908 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -410,6 +410,7 @@ TARGET_BUILTIN(__builtin_amdgcn_cvt_sr_fp8_f32, "ifiiIi", 
"nc", "fp8-conversion-
 // GFX12+ only builtins.
 
//===--===//
 
+TARGET_BUILTIN(__builtin_amdgcn_s_sleep_var, "vUi", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_permlane16_var,  "UiUiUiUiIbIb", "nc", 
"gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_permlanex16_var, "UiUiUiUiIbIb", "nc", 
"gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_barrier_signal, "vIi", "n", "gfx12-insts")

diff  --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-err.cl 
b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-err.cl
index 00ecf32d949238..622e9dd2eed42f 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-err.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-err.cl
@@ -2,10 +2,7 @@
 
 // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1100 -verify 
-S -emit-llvm -o - %s
 
-typedef unsigned int uint;
-typedef uint uint2 __attribute__((ext_vector_type(2)));
-typedef uint uint4 __attribute__((ext_vector_type(4)));
-
-kernel void builtins_amdgcn_bvh_err(global uint2* out, uint addr, uint data, 
uint4 data1, uint offset) {
-  *out = __builtin_amdgcn_ds_bvh_stack_rtn(addr, data, data1, offset); // 
expected-error {{'__builtin_amdgcn_ds_bvh_stack_rtn' must be a constant 
integer}}
+void test_s_sleep_var(int d)
+{
+  __builtin_amdgcn_s_sleep_var(d); // expected-error 
{{'__builtin_amdgcn_s_sleep_var' needs target feature gfx12-insts}}
 }

diff  --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-param-err.cl 
b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-param-err.cl
new file mode 100644
index 00..00ecf32d949238
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11-param-err.cl
@@ -0,0 +1,11 @@
+// REQUIRES: amdgpu-registered-target
+
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1100 -verify 
-S -emit-llvm -o - %s
+
+typedef unsigned int uint;
+typedef uint uint2 __attribute__((ext_vector_type(2)));
+typedef uint uint4 __attribute__((ext_vector_type(4)));
+
+kernel void builtins_amdgcn_bvh_err(global uint2* out, uint addr, uint data, 
uint4 data1, uint offset) {
+  *out = __builtin_amdgcn_ds_bvh_stack_rtn(addr, data, data1, offset); // 
expected-error {{'__builtin_amdgcn_ds_bvh_stack_rtn' must be a constant 
integer}}
+}

diff  --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl 
b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl
index 2899d9e5c28898..ebd367bba0cdc1 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl
@@ -5,6 +5,21 @@
 
 typedef unsigned int uint;
 
+// CHECK-LABEL: @test_s_sleep_var(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[D_ADDR:%.*]] = alloca i32, align 4, addrspace(5)
+// CHECK-NEXT:store i32 [[D:%.*]], ptr addrspace(5) [[D_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load i32, ptr addrspace(5) [[D_ADDR]], align 4
+// CHECK-NEXT:call void @llvm.amdgcn.s.sleep.var(i32 [[TMP0]])
+// CHECK-NEXT:call void @llvm.amdgcn.s.sleep.var(i32 15)
+// CHECK-NEXT:ret void
+//
+void test_s_sleep_var(int d)
+{
+  __builtin_amdgcn_s_sleep_var(d);
+  __builtin_amdgcn_s_sleep_var(15);
+}
+
 // CHECK-LABEL: @test_permlane16_var(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[OUT_ADDR:%.*]] = alloca ptr addrspace(1), align 8, 
addrspace(5)




[clang] [AMDGPU] Add GFX12 __builtin_amdgcn_s_sleep_var (PR #77926)

2024-01-18 Thread Jay Foad via cfe-commits


https://github.com/jayfoad closed 
https://github.com/llvm/llvm-project/pull/77926

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -253,6 +274,12 @@ AMDGPU::IsaVersion AMDGPU::getIsaVersion(StringRef GPU) {
   case GK_GFX1151: return {11, 5, 1};
   case GK_GFX1200: return {12, 0, 0};
   case GK_GFX1201: return {12, 0, 1};
+
+  // Generic targets use the earliest ISA version in their group.

Pierre-vh wrote:

I think it's alright as is, but this API is bad and should probably be 
refactored IMO. Most users of the API are just interested in checking the 
version major, sometimes minor (10.1 vs 10.3).

In theory, this API should _never_ be used to check for presence of a feature, 
that's always done through the feature list check, so it shouldn't really be 
abusable. I added a comment though to revisit this and make the intent clearer.
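
To illustrate the usage pattern I mean, here is a rough sketch, not the actual `AMDGPU::getIsaVersion` code. `IsaVersion`, `getIsaVersionSketch` and `isAtLeast` are hypothetical stand-ins, and the generic target name strings are assumed for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the (major, minor, stepping) triple.
struct IsaVersion {
  unsigned Major;
  unsigned Minor;
  unsigned Stepping;
};

// Hypothetical lookup covering only a few names; unknown names map to {0, 0, 0}.
inline IsaVersion getIsaVersionSketch(const std::string &GPU) {
  if (GPU == "gfx1200") return {12, 0, 0};
  if (GPU == "gfx1201") return {12, 0, 1};
  // Generic targets use the earliest ISA version in their group.
  if (GPU == "gfx9-generic") return {9, 0, 0};
  if (GPU == "gfx10-1-generic") return {10, 1, 0};
  if (GPU == "gfx10-3-generic") return {10, 3, 0};
  if (GPU == "gfx11-generic") return {11, 0, 0};
  return {0, 0, 0};
}

// Typical caller: compares the major version and sometimes the minor one.
// Feature presence should still come from the feature list, not from this.
inline bool isAtLeast(const IsaVersion &V, unsigned Major, unsigned Minor) {
  assert(V.Major != 0 && "unknown GPU name");
  return V.Major > Major || (V.Major == Major && V.Minor >= Minor);
}
```

A caller that only needs to tell 10.1 from 10.3 gets the right answer for a generic target, while the stepping field carries no real information there.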

https://github.com/llvm/llvm-project/pull/76955

[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)

2024-01-18 Thread Timm Baeder via cfe-commits


https://github.com/tbaederr updated 
https://github.com/llvm/llvm-project/pull/66514

>From 34ca28505542d55f62da80d8fd3c2561535185d3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= 
Date: Fri, 15 Sep 2023 15:51:39 +0200
Subject: [PATCH] [clang][Diagnostics] Highlight code snippets

Add some primitive syntax highlighting to our code snippet output.
---
 clang/include/clang/Frontend/TextDiagnostic.h |  18 +-
 clang/include/clang/Lex/Preprocessor.h|   5 +
 clang/lib/Frontend/TextDiagnostic.cpp | 197 +-
 clang/lib/Frontend/TextDiagnosticPrinter.cpp  |   2 +-
 clang/lib/Lex/Preprocessor.cpp|  24 +++
 5 files changed, 232 insertions(+), 14 deletions(-)

diff --git a/clang/include/clang/Frontend/TextDiagnostic.h 
b/clang/include/clang/Frontend/TextDiagnostic.h
index 7eb0ab0cdc9bca..a2fe8ae995423b 100644
--- a/clang/include/clang/Frontend/TextDiagnostic.h
+++ b/clang/include/clang/Frontend/TextDiagnostic.h
@@ -16,6 +16,7 @@
 #define LLVM_CLANG_FRONTEND_TEXTDIAGNOSTIC_H
 
 #include "clang/Frontend/DiagnosticRenderer.h"
+#include "llvm/Support/raw_ostream.h"
 
 namespace clang {
 
@@ -33,14 +34,22 @@ namespace clang {
 /// printing coming out of libclang.
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
+  const Preprocessor *PP;
 
 public:
-  TextDiagnostic(raw_ostream &OS,
- const LangOptions &LangOpts,
- DiagnosticOptions *DiagOpts);
+  TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
+ DiagnosticOptions *DiagOpts, const Preprocessor *PP = 
nullptr);
 
   ~TextDiagnostic() override;
 
+  struct StyleRange {
+unsigned Start;
+unsigned End;
+enum llvm::raw_ostream::Colors Color;
+StyleRange(unsigned S, unsigned E, enum llvm::raw_ostream::Colors C)
+: Start(S), End(E), Color(C){};
+  };
+
   /// Print the diagonstic level to a raw_ostream.
   ///
   /// This is a static helper that handles colorizing the level and formatting
@@ -104,7 +113,8 @@ class TextDiagnostic : public DiagnosticRenderer {
ArrayRef Hints);
 
   void emitSnippet(StringRef SourceLine, unsigned MaxLineNoDisplayWidth,
-   unsigned LineNo);
+   unsigned LineNo, unsigned DisplayLineNo,
+   ArrayRef Styles);
 
   void emitParseableFixits(ArrayRef Hints, const SourceManager &SM);
 };
diff --git a/clang/include/clang/Lex/Preprocessor.h 
b/clang/include/clang/Lex/Preprocessor.h
index 4ec21a8b6be2c8..d89e2be1bf5ff5 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -284,6 +284,8 @@ class Preprocessor {
   /// The kind of translation unit we are processing.
   const TranslationUnitKind TUKind;
 
+  const char *getCheckPoint(FileID FID, const char *Start) const;
+
 private:
   /// The code-completion handler.
   CodeCompletionHandler *CodeComplete = nullptr;
@@ -311,6 +313,9 @@ class Preprocessor {
   /// The import path for named module that we're currently processing.
   SmallVector, 2> 
NamedModuleImportPath;
 
+  llvm::DenseMap> CheckPoints;
+  unsigned CheckPointCounter = 0;
+
   /// Whether the import is an `@import` or a standard c++ modules import.
   bool IsAtImport = false;
 
diff --git a/clang/lib/Frontend/TextDiagnostic.cpp 
b/clang/lib/Frontend/TextDiagnostic.cpp
index 779dead5d058d1..691809ceded87b 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -12,6 +12,7 @@
 #include "clang/Basic/FileManager.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/ConvertUTF.h"
@@ -41,6 +42,14 @@ static const enum raw_ostream::Colors fatalColor = 
raw_ostream::RED;
 static const enum raw_ostream::Colors savedColor =
   raw_ostream::SAVEDCOLOR;
 
+// Magenta is taken for 'warning'. Red is already 'error' and 'cyan'
+// is already taken for 'note'. Green is already used to underline
+// source ranges. White and black are bad because of the usual
+// terminal backgrounds. Which leaves us only with TWO options.
+static constexpr raw_ostream::Colors CommentColor = raw_ostream::YELLOW;
+static constexpr raw_ostream::Colors LiteralColor = raw_ostream::GREEN;
+static constexpr raw_ostream::Colors KeywordColor = raw_ostream::BLUE;
+
 /// Add highlights to differences in template strings.
 static void applyTemplateHighlighting(raw_ostream &OS, StringRef Str,
   bool &Normal, bool Bold) {
@@ -644,10 +653,10 @@ static bool printWordWrapped(raw_ostream &OS, StringRef 
Str, unsigned Columns,
   return Wrapped;
 }
 
-TextDiagnostic::TextDiagnostic(raw_ostream &OS,
-   const LangOptions &LangOpts,
-   DiagnosticOptions *DiagOpts)
-  : DiagnosticRenderer(LangOpts, DiagOpts), OS(OS) {}
+TextDiagnostic:

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see 
:ref:`amdgpu-os`) with the following
 
  === ===  = = 
=== === ==
 
+Generic processors also exist. They group multiple processors into one,
+allowing to build code once and run it on multiple targets at the cost
+of less features being available.
+
+Generic processors are only available on Code Object V6 and up.
+
+  .. table:: AMDGPU Generic Processors
+ :name: amdgpu-generic-processor-table
+
+  == = 
=
+ Processor TargetSupported Target
+   TripleProcessorsFeatures
+   ArchitectureRestrictions
+
+
+
+
+
+
+
+
+  == = 
=
+ ``gfx9-generic`` ``amdgcn`` - ``gfx900``  - ``v_mad_mix`` 
instructions
+ - ``gfx902``are not available 
on
+ - ``gfx904````gfx900``, 
``gfx902``,
+ - ``gfx906````gfx909``, 
``gfx90c``
+ - ``gfx909``  - ``v_fma_mix`` 
instructions
+ - ``gfx90c``are not available 
on ``gfx904``
+   - sramecc is not 
available on

Pierre-vh wrote:

No, for unsupported: `EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4`

https://github.com/llvm/llvm-project/pull/76955

[clang] [clang-tools-extra] [llvm] [AMDGPU] Work around s_getpc_b64 zero extending on GFX12 (PR #78186)

2024-01-18 Thread Jay Foad via cfe-commits


https://github.com/jayfoad updated 
https://github.com/llvm/llvm-project/pull/78186

>From d3f4ebf849f6ef1ea373e5c7f93398db6681b2b6 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Mon, 15 Jan 2024 15:02:08 +
Subject: [PATCH 1/4] Add GFX11/12 test coverage

---
 llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll | 103 +-
 1 file changed, 77 insertions(+), 26 deletions(-)

diff --git a/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll 
b/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
index 598d7a8033c2e54..2c1baeeeda21697 100644
--- a/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
+++ b/llvm/test/CodeGen/AMDGPU/s-getpc-b64-remat.ll
@@ -1,32 +1,83 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s
-
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX9
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX11
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -stress-regalloc=2 
-verify-machineinstrs < %s | FileCheck %s -check-prefix=GFX12
 
 define void @test_remat_s_getpc_b64() {
-; CHECK-LABEL: test_remat_s_getpc_b64:
-; CHECK:   ; %bb.0: ; %entry
-; CHECK-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_store_dword v0, off, s[0:3], s32 ; 4-byte Folded Spill
-; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:v_writelane_b32 v0, s30, 0
-; CHECK-NEXT:s_getpc_b64 s[4:5]
-; CHECK-NEXT:v_writelane_b32 v0, s31, 1
-; CHECK-NEXT:;;#ASMSTART
-; CHECK-NEXT:;;#ASMEND
-; CHECK-NEXT:;;#ASMSTART
-; CHECK-NEXT:;;#ASMEND
-; CHECK-NEXT:s_getpc_b64 s[4:5]
-; CHECK-NEXT:v_mov_b32_e32 v1, s4
-; CHECK-NEXT:v_mov_b32_e32 v2, s5
-; CHECK-NEXT:global_store_dwordx2 v[1:2], v[1:2], off
-; CHECK-NEXT:v_readlane_b32 s31, v0, 1
-; CHECK-NEXT:v_readlane_b32 s30, v0, 0
-; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_load_dword v0, off, s[0:3], s32 ; 4-byte Folded Reload
-; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:s_waitcnt vmcnt(0)
-; CHECK-NEXT:s_setpc_b64 s[30:31]
+; GFX9-LABEL: test_remat_s_getpc_b64:
+; GFX9:   ; %bb.0: ; %entry
+; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:s_xor_saveexec_b64 s[4:5], -1
+; GFX9-NEXT:buffer_store_dword v0, off, s[0:3], s32 ; 4-byte Folded Spill
+; GFX9-NEXT:s_mov_b64 exec, s[4:5]
+; GFX9-NEXT:v_writelane_b32 v0, s30, 0
+; GFX9-NEXT:s_getpc_b64 s[4:5]
+; GFX9-NEXT:v_writelane_b32 v0, s31, 1
+; GFX9-NEXT:;;#ASMSTART
+; GFX9-NEXT:;;#ASMEND
+; GFX9-NEXT:;;#ASMSTART
+; GFX9-NEXT:;;#ASMEND
+; GFX9-NEXT:s_getpc_b64 s[4:5]
+; GFX9-NEXT:v_mov_b32_e32 v1, s4
+; GFX9-NEXT:v_mov_b32_e32 v2, s5
+; GFX9-NEXT:global_store_dwordx2 v[1:2], v[1:2], off
+; GFX9-NEXT:v_readlane_b32 s31, v0, 1
+; GFX9-NEXT:v_readlane_b32 s30, v0, 0
+; GFX9-NEXT:s_xor_saveexec_b64 s[4:5], -1
+; GFX9-NEXT:buffer_load_dword v0, off, s[0:3], s32 ; 4-byte Folded Reload
+; GFX9-NEXT:s_mov_b64 exec, s[4:5]
+; GFX9-NEXT:s_waitcnt vmcnt(0)
+; GFX9-NEXT:s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: test_remat_s_getpc_b64:
+; GFX11:   ; %bb.0: ; %entry
+; GFX11-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX11-NEXT:scratch_store_b32 off, v0, s32 ; 4-byte Folded Spill
+; GFX11-NEXT:s_mov_b32 exec_lo, s0
+; GFX11-NEXT:v_writelane_b32 v0, s30, 0
+; GFX11-NEXT:s_getpc_b64 s[0:1]
+; GFX11-NEXT:;;#ASMSTART
+; GFX11-NEXT:;;#ASMEND
+; GFX11-NEXT:v_writelane_b32 v0, s31, 1
+; GFX11-NEXT:;;#ASMSTART
+; GFX11-NEXT:;;#ASMEND
+; GFX11-NEXT:s_getpc_b64 s[0:1]
+; GFX11-NEXT:s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | 
instid1(VALU_DEP_2)
+; GFX11-NEXT:v_dual_mov_b32 v2, s1 :: v_dual_mov_b32 v1, s0
+; GFX11-NEXT:v_readlane_b32 s31, v0, 1
+; GFX11-NEXT:v_readlane_b32 s30, v0, 0
+; GFX11-NEXT:global_store_b64 v[1:2], v[1:2], off
+; GFX11-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX11-NEXT:scratch_load_b32 v0, off, s32 ; 4-byte Folded Reload
+; GFX11-NEXT:s_mov_b32 exec_lo, s0
+; GFX11-NEXT:s_waitcnt vmcnt(0)
+; GFX11-NEXT:s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: test_remat_s_getpc_b64:
+; GFX12:   ; %bb.0: ; %entry
+; GFX12-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX12-NEXT:s_xor_saveexec_b32 s0, -1
+; GFX12-NEXT:scratch_store_b32 off, v0, s32 ; 4-byte Folded Spill
+; GFX12-NEXT:s_mov_b32 exec_lo, s0
+; GFX12-NEXT:v_writelane_b32 v0, s30, 0
+; GFX12-NEXT:s_getpc_b64 s[0:1]
+; GFX12-NEXT:;;#ASMSTART
+; GFX12-NEXT:;;#ASMEND
+; GFX12-NEXT:v_writelane_b32 v0, s31, 1
+; GFX12-NEXT:;;#ASMSTART
+; GFX12-

[clang-tools-extra] [clang] [llvm] [AMDGPU] Work around s_getpc_b64 zero extending on GFX12 (PR #78186)

2024-01-18 Thread Jay Foad via cfe-commits


https://github.com/jayfoad closed 
https://github.com/llvm/llvm-project/pull/78186

[clang] [llvm] Adding support of AMDLIBM vector library (PR #78560)

2024-01-18 Thread Rohit Aggarwal via cfe-commits


https://github.com/rohitaggarwal007 created 
https://github.com/llvm/llvm-project/pull/78560

Hi,

AMD has its own implementation of vector calls. This patch includes the changes 
to enable the use of AMD's math library using -fveclib=AMDLIBM.
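
As a rough illustration of the kind of code a `-fveclib=` setting is aimed at, the sketch below shows a simple element-wise loop over a math call. The function name `sineAll` is hypothetical and not part of the patch. Whether the vectorizer actually replaces this particular call with an AMD vector routine depends on the mapping table added in `TargetLibraryInfo.cpp`, which is truncated in this message, so treat that as an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical example: applies sin() to every element of a buffer.
// A loop like this is a candidate for the loop vectorizer; with a vector
// math library selected via -fveclib=..., the scalar call may be replaced
// by a vector variant if the library provides a matching mapping.
std::vector<float> sineAll(const std::vector<float> &In) {
  std::vector<float> Out(In.size());
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = std::sin(In[I]);
  return Out;
}
```

The code itself is plain C++ and behaves the same with or without the option; only the generated calls would differ.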



>From d2e001b9f6b174b6313f99c4a094ab3714548806 Mon Sep 17 00:00:00 2001
From: Rohit Aggarwal 
Date: Thu, 18 Jan 2024 14:03:50 +0530
Subject: [PATCH] Adding support of AMDLIBM vector library

---
 clang/include/clang/Driver/Options.td |   4 +-
 clang/test/Driver/autocomplete.c  |   1 +
 .../include/llvm/Analysis/TargetLibraryInfo.h |   3 +-
 .../llvm/Frontend/Driver/CodeGenOptions.h |   3 +-
 llvm/lib/Analysis/TargetLibraryInfo.cpp   | 211 -
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp   |   4 +
 .../Generic/replace-intrinsics-with-veclib.ll |  11 +
 .../LoopVectorize/X86/amdlibm-calls-finite.ll | 332 
 .../LoopVectorize/X86/amdlibm-calls.ll| 747 ++
 .../Transforms/SLPVectorizer/X86/sin-sqrt.ll  |  29 +-
 llvm/test/Transforms/Util/add-TLI-mappings.ll |  23 +
 11 files changed, 1362 insertions(+), 6 deletions(-)
 create mode 100644 
llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls-finite.ll
 create mode 100644 llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index e4fdad8265c863..2fbe1f49a79aab 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3190,10 +3190,10 @@ def fno_experimental_isel : Flag<["-"], 
"fno-experimental-isel">, Group,
-
Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,none">,
+
Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
 NormalizedValuesScope<"llvm::driver::VectorLibrary">,
 NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
-  "Darwin_libsystem_m", "ArmPL", "NoLibrary"]>,
+  "Darwin_libsystem_m", "ArmPL", "AMDLIBM", "NoLibrary"]>,
 MarshallingInfoEnum, "NoLibrary">;
 def fno_lax_vector_conversions : Flag<["-"], "fno-lax-vector-conversions">, 
Group,
   Alias, AliasArgs<["none"]>;
diff --git a/clang/test/Driver/autocomplete.c b/clang/test/Driver/autocomplete.c
index d6f57708b67eb6..c8ceaaf404672f 100644
--- a/clang/test/Driver/autocomplete.c
+++ b/clang/test/Driver/autocomplete.c
@@ -80,6 +80,7 @@
 // FLTOALL-NEXT: thin
 // RUN: %clang --autocomplete=-fveclib= | FileCheck %s -check-prefix=FVECLIBALL
 // FVECLIBALL: Accelerate
+// FVECLIBALL-NEXT: AMDLIBM
 // FVECLIBALL-NEXT: ArmPL
 // FVECLIBALL-NEXT: Darwin_libsystem_m
 // FVECLIBALL-NEXT: libmvec
diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.h 
b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
index daf1d8e2079f85..4a3edb8f02a7a8 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -129,7 +129,8 @@ class TargetLibraryInfoImpl {
 MASSV,// IBM MASS vector library.
 SVML, // Intel short vector math library.
 SLEEFGNUABI, // SLEEF - SIMD Library for Evaluating Elementary Functions.
-ArmPL// Arm Performance Libraries.
+ArmPL,   // Arm Performance Libraries.
+AMDLIBM
   };
 
   TargetLibraryInfoImpl();
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h 
b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 0b1d924a26b2de..0180670c4c6991 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -29,7 +29,8 @@ enum class VectorLibrary {
   SVML,   // Intel short vector math library.
   SLEEF,  // SLEEF SIMD Library for Evaluating Elementary 
Functions.
   Darwin_libsystem_m, // Use Darwin's libsystem_m vector functions.
-  ArmPL   // Arm Performance Libraries.
+  ArmPL,  // Arm Performance Libraries.
+  AMDLIBM // AMD vector math library.
 };
 
 TargetLibraryInfoImpl *createTLII(llvm::Triple &TargetTriple,
diff --git a/llvm/lib/Analysis/TargetLibraryInfo.cpp 
b/llvm/lib/Analysis/TargetLibraryInfo.cpp
index 58749e559040a7..16afc33bf7ce88 100644
--- a/llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ b/llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -37,7 +37,9 @@ static cl::opt 
ClVectorLibrary(
clEnumValN(TargetLibraryInfoImpl::SLEEFGNUABI, "sleefgnuabi",
   "SIMD Library for Evaluating Elementary Functions"),
clEnumValN(TargetLibraryInfoImpl::ArmPL, "ArmPL",
-  "Arm Performance Libraries")));
+  "Arm Performance Libraries"),
+   clEnumValN(TargetLibraryInfoImpl::AMDLIBM, "AMDLIBM",
+  "AMD vector math library")));
 
 StringLiteral const TargetLibraryInfoImpl::StandardNames[LibFunc::NumLibFuncs] 
=
 {
@@ -1279,6 +1281,213 @@ void 
TargetLibraryInfoIm

[clang] [llvm] Adding support of AMDLIBM vector library (PR #78560)

2024-01-18 Thread via cfe-commits


github-actions[bot] wrote:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this 
page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from 
other developers.

If you have further questions, they may be answered by the [LLVM GitHub User 
Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM 
Discord](https://discord.com/invite/xS7Z362) or on the 
[forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/78560

[clang] [llvm] Adding support of AMDLIBM vector library (PR #78560)

2024-01-18 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Rohit Aggarwal (rohitaggarwal007)


Changes

Hi,

AMD has its own implementation of vector calls. This patch includes the changes 
to enable the use of AMD's math library using -fveclib=AMDLIBM.



---

Patch is 60.65 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/78560.diff


11 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2-2) 
- (modified) clang/test/Driver/autocomplete.c (+1) 
- (modified) llvm/include/llvm/Analysis/TargetLibraryInfo.h (+2-1) 
- (modified) llvm/include/llvm/Frontend/Driver/CodeGenOptions.h (+2-1) 
- (modified) llvm/lib/Analysis/TargetLibraryInfo.cpp (+210-1) 
- (modified) llvm/lib/Frontend/Driver/CodeGenOptions.cpp (+4) 
- (modified) llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll (+11) 
- (added) llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls-finite.ll (+332) 
- (added) llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll (+747) 
- (modified) llvm/test/Transforms/SLPVectorizer/X86/sin-sqrt.ll (+28-1) 
- (modified) llvm/test/Transforms/Util/add-TLI-mappings.ll (+23) 


``diff
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index e4fdad8265c863..2fbe1f49a79aab 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3190,10 +3190,10 @@ def fno_experimental_isel : Flag<["-"], 
"fno-experimental-isel">, Group,
-
Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,none">,
+
Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
 NormalizedValuesScope<"llvm::driver::VectorLibrary">,
 NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
-  "Darwin_libsystem_m", "ArmPL", "NoLibrary"]>,
+  "Darwin_libsystem_m", "ArmPL", "AMDLIBM", "NoLibrary"]>,
 MarshallingInfoEnum, "NoLibrary">;
 def fno_lax_vector_conversions : Flag<["-"], "fno-lax-vector-conversions">, 
Group,
   Alias, AliasArgs<["none"]>;
diff --git a/clang/test/Driver/autocomplete.c b/clang/test/Driver/autocomplete.c
index d6f57708b67eb6..c8ceaaf404672f 100644
--- a/clang/test/Driver/autocomplete.c
+++ b/clang/test/Driver/autocomplete.c
@@ -80,6 +80,7 @@
 // FLTOALL-NEXT: thin
 // RUN: %clang --autocomplete=-fveclib= | FileCheck %s -check-prefix=FVECLIBALL
 // FVECLIBALL: Accelerate
+// FVECLIBALL-NEXT: AMDLIBM
 // FVECLIBALL-NEXT: ArmPL
 // FVECLIBALL-NEXT: Darwin_libsystem_m
 // FVECLIBALL-NEXT: libmvec
diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.h 
b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
index daf1d8e2079f85..4a3edb8f02a7a8 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -129,7 +129,8 @@ class TargetLibraryInfoImpl {
 MASSV,// IBM MASS vector library.
 SVML, // Intel short vector math library.
 SLEEFGNUABI, // SLEEF - SIMD Library for Evaluating Elementary Functions.
-ArmPL// Arm Performance Libraries.
+ArmPL,   // Arm Performance Libraries.
+AMDLIBM
   };
 
   TargetLibraryInfoImpl();
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h 
b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 0b1d924a26b2de..0180670c4c6991 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -29,7 +29,8 @@ enum class VectorLibrary {
   SVML,   // Intel short vector math library.
   SLEEF,  // SLEEF SIMD Library for Evaluating Elementary 
Functions.
   Darwin_libsystem_m, // Use Darwin's libsystem_m vector functions.
-  ArmPL   // Arm Performance Libraries.
+  ArmPL,  // Arm Performance Libraries.
+  AMDLIBM // AMD vector math library.
 };
 
 TargetLibraryInfoImpl *createTLII(llvm::Triple &TargetTriple,
diff --git a/llvm/lib/Analysis/TargetLibraryInfo.cpp 
b/llvm/lib/Analysis/TargetLibraryInfo.cpp
index 58749e559040a7..16afc33bf7ce88 100644
--- a/llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ b/llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -37,7 +37,9 @@ static cl::opt 
ClVectorLibrary(
clEnumValN(TargetLibraryInfoImpl::SLEEFGNUABI, "sleefgnuabi",
   "SIMD Library for Evaluating Elementary Functions"),
clEnumValN(TargetLibraryInfoImpl::ArmPL, "ArmPL",
-  "Arm Performance Libraries")));
+  "Arm Performance Libraries"),
+   clEnumValN(TargetLibraryInfoImpl::AMDLIBM, "AMDLIBM",
+  "AMD vector math library")));
 
 StringLiteral const TargetLibraryInfoImpl::StandardNames[LibFunc::NumLibFuncs] 
=
 {
@@ -1279,6 +1281,213 @@ void 
TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
 }
 break;
   }
+  case AMDLIBM: {
+#define FIXED(NL) ElementCount::getFi

[clang-tools-extra] [llvm] [clang] Add clang-tidy check to suggest replacement of conditional statement with std::min/std::max (PR #77816)

2024-01-18 Thread Bhuminjay Soni via cfe-commits


11happy wrote:

A humble reminder. @PiotrZSL, could you please review this and suggest any 
changes that are required?
Thank you

https://github.com/llvm/llvm-project/pull/77816

[clang-tools-extra] [clang] [llvm] [CGP] Avoid replacing a free ext with multiple other exts. (PR #77094)

2024-01-18 Thread Florian Hahn via cfe-commits


https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/77094

>From 46fbecfce6c48795ea85fc9420067479f6d0b17a Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 5 Jan 2024 11:24:59 +
Subject: [PATCH 1/2] [CGP] Avoid replacing a free ext with multiple other
 exts.

Replacing a free extension with 2 or more extensions unnecessarily
increases the number of IR instructions without providing any benefits.
It also unnecessarily causes operations to be performed on wider types
than necessary.

In some cases, the extra extensions also pessimize codegen (see
bfis-in-loop.ll).

The changes in arm64-codegen-prepare-extload.ll also show that we avoid
promotions that should only be performed in stress mode.
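
The profitability rule described above can be sketched in isolation as follows. This is a simplified model, not the CodeGenPrepare code: `PromotionCandidate`, its fields, and `isPromotionUnprofitable` are hypothetical names that only mirror the condition changed in the diff below.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the state tracked while promoting extensions.
struct PromotionCandidate {
  long long CreatedInstsCost;    // cost of instructions created so far
  long long NewCreatedInstsCost; // cost added by this promotion step
  long long ExtCost;             // 0 means the original extension is free
  std::size_t NumNewExts;        // extensions the promotion would introduce
  bool PromotedIsLegal;          // models isPromotedInstructionLegal()
};

// Returns true when the promotion should be rolled back.
bool isPromotionUnprofitable(const PromotionCandidate &C, bool StressMode) {
  long long Total = C.CreatedInstsCost + C.NewCreatedInstsCost;
  // Clamp at zero after accounting for the extension that gets merged away.
  Total = std::max(0LL, Total - C.ExtCost);
  if (StressMode)
    return false; // stress mode promotes regardless of cost
  return Total > 1 || !C.PromotedIsLegal ||
         // New rule: do not trade one free extension for several.
         (C.ExtCost == 0 && C.NumNewExts > 1);
}
```

With this model, a free extension that would be replaced by two new extensions is rejected outside stress mode, while the same candidate with a single new extension is still accepted.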
---
 llvm/lib/CodeGen/CodeGenPrepare.cpp   |  7 ++-
 .../AArch64/arm64-codegen-prepare-extload.ll  | 22 ---
 .../AArch64/avoid-free-ext-promotion.ll   | 12 ++--
 llvm/test/CodeGen/AArch64/bfis-in-loop.ll | 50 +++
 ...iller-impdef-on-implicit-def-regression.ll | 63 +--
 5 files changed, 81 insertions(+), 73 deletions(-)

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp 
b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 5bd4c6b067d796..606946ceffd4f3 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -5965,7 +5965,9 @@ bool CodeGenPrepare::tryToPromoteExts(
 // cut this search path, because it means we degrade the code quality.
 // With exactly 2, the transformation is neutral, because we will merge
 // one extension but leave one. However, we optimistically keep going,
-// because the new extension may be removed too.
+// because the new extension may be removed too. Also avoid replacing a
+// single free extension with multiple extensions, as this increases the
+// number of IR instructions while providing any savings.
 long long TotalCreatedInstsCost = CreatedInstsCost + NewCreatedInstsCost;
 // FIXME: It would be possible to propagate a negative value instead of
 // conservatively ceiling it to 0.
@@ -5973,7 +5975,8 @@ bool CodeGenPrepare::tryToPromoteExts(
 std::max((long long)0, (TotalCreatedInstsCost - ExtCost));
 if (!StressExtLdPromotion &&
 (TotalCreatedInstsCost > 1 ||
- !isPromotedInstructionLegal(*TLI, *DL, PromotedVal))) {
+ !isPromotedInstructionLegal(*TLI, *DL, PromotedVal) ||
+ (ExtCost == 0 && NewExts.size() > 1))) {
   // This promotion is not profitable, rollback to the previous state, and
   // save the current extension in ProfitablyMovedExts as the latest
   // speculative promotion turned out to be unprofitable.
diff --git a/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll 
b/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
index 23cbad0d15b4c1..646f988f574813 100644
--- a/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
@@ -528,10 +528,14 @@ entry:
 ; OPTALL: [[LD:%[a-zA-Z_0-9-]+]] = load i8, ptr %p
 ;
 ; This transformation should really happen only for stress mode.
-; OPT-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
-; OPT-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
-; OPT-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
-; OPT-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = trunc i64 [[IDX64]] to i32
+; STRESS-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
+; STRESS-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
+; STRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+; STRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = trunc i64 [[IDX64]] to i32
+;
+; NONSTRESS-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
+; NONSTRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
+; NONSTRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = zext i32 [[RES32]] to i64
 ;
 ; DISABLE-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
 ; DISABLE-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
@@ -583,9 +587,13 @@ entry:
 ; OPTALL: [[LD:%[a-zA-Z_0-9-]+]] = load i8, ptr %p
 ;
 ; This transformation should really happen only for stress mode.
-; OPT-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
-; OPT-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
-; OPT-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+; STRESS-NEXT: [[ZEXT64:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i64
+; STRESS-NEXT: [[ZEXTB:%[a-zA-Z_0-9-]+]] = zext i32 %b to i64
+; STRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = add nuw i64 [[ZEXT64]], [[ZEXTB]]
+;
+; NONSTRESS-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
+; NONSTRESS-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
+; NONSTRESS-NEXT: [[IDX64:%[a-zA-Z_0-9-]+]] = zext i32 [[RES32]] to i64
 ;
 ; DISABLE-NEXT: [[ZEXT32:%[a-zA-Z_0-9-]+]] = zext i8 [[LD]] to i32
 ; DISABLE-NEXT: [[RES32:%[a-zA-Z_0-9-]+]] = add nuw i32 [[ZEXT32]], %b
diff --git a/llvm/test/CodeGen/AArc

[clang-tools-extra] [clang] [llvm] [AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (PR #77438)

2024-01-18 Thread Jay Foad via cfe-commits


https://github.com/jayfoad closed 
https://github.com/llvm/llvm-project/pull/77438

[clang-tools-extra] [clang] [llvm] [CGP] Avoid replacing a free ext with multiple other exts. (PR #77094)

2024-01-18 Thread Florian Hahn via cfe-commits


https://github.com/fhahn closed https://github.com/llvm/llvm-project/pull/77094

[clang] [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (PR #77066)

2024-01-18 Thread Utkarsh Saxena via cfe-commits



@@ -11220,6 +11220,11 @@ class Sema final {
   VarDecl *buildCoroutinePromise(SourceLocation Loc);
   void CheckCompletedCoroutineBody(FunctionDecl *FD, Stmt *&Body);
 
+  // Heuristically tells if the function is get_return_object by matching

usx95 wrote:

Both done.

https://github.com/llvm/llvm-project/pull/77066

[clang] [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (PR #77066)

2024-01-18 Thread Utkarsh Saxena via cfe-commits


https://github.com/usx95 updated https://github.com/llvm/llvm-project/pull/77066

>From 3e0d0ab6c4fc6cba68285816a95e423bc18e8e55 Mon Sep 17 00:00:00 2001
From: Utkarsh Saxena 
Date: Fri, 5 Jan 2024 10:11:20 +0100
Subject: [PATCH 01/16] [coroutines] Detect lifetime issues with coroutine
 lambda captures

---
 clang/lib/Sema/SemaInit.cpp   | 20 +--
 clang/test/SemaCXX/coro-lifetimebound.cpp | 64 +--
 2 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 60c0e3e74204ec..c100bf11454786 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -12,6 +12,7 @@
 
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/DeclObjC.h"
+#include "clang/AST/Expr.h"
 #include "clang/AST/ExprCXX.h"
 #include "clang/AST/ExprObjC.h"
 #include "clang/AST/ExprOpenMP.h"
@@ -33,6 +34,7 @@
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -7575,15 +7577,27 @@ static void 
visitLifetimeBoundArguments(IndirectLocalPath &Path, Expr *Call,
 Path.pop_back();
   };
 
-  if (ObjectArg && implicitObjectParamIsLifetimeBound(Callee))
-VisitLifetimeBoundArg(Callee, ObjectArg);
-
   bool CheckCoroCall = false;
   if (const auto *RD = Callee->getReturnType()->getAsRecordDecl()) {
 CheckCoroCall = RD->hasAttr() &&
 RD->hasAttr() &&
 !Callee->hasAttr();
   }
+
+  if (ObjectArg) {
+bool CheckCoroObjArg = CheckCoroCall;
+// Ignore `__promise.get_return_object()` as it not lifetimebound.
+if (Callee->getDeclName().isIdentifier() &&
+Callee->getName() == "get_return_object")
+  CheckCoroObjArg = false;
+// Coroutine lambda objects with empty capture list are not lifetimebound.
+if (auto *LE = dyn_cast<LambdaExpr>(ObjectArg->IgnoreImplicit());
+LE && LE->captures().empty())
+  CheckCoroObjArg = false;
+if (implicitObjectParamIsLifetimeBound(Callee) || CheckCoroObjArg)
+  VisitLifetimeBoundArg(Callee, ObjectArg);
+  }
+
   for (unsigned I = 0,
 N = std::min(Callee->getNumParams(), Args.size());
I != N; ++I) {
diff --git a/clang/test/SemaCXX/coro-lifetimebound.cpp 
b/clang/test/SemaCXX/coro-lifetimebound.cpp
index 3fc7ca70a14a12..319134450e4b6f 100644
--- a/clang/test/SemaCXX/coro-lifetimebound.cpp
+++ b/clang/test/SemaCXX/coro-lifetimebound.cpp
@@ -64,6 +64,10 @@ Co bar_coro(const int &b, int c) {
   : bar_coro(0, 1); // expected-warning {{returning address of local 
temporary object}}
 }
 
+// 
=
+// Lambdas
+// 
=
+namespace lambdas {
 void lambdas() {
   auto unsafe_lambda = [] [[clang::coro_wrapper]] (int b) {
 return foo_coro(b); // expected-warning {{address of stack memory 
associated with parameter}}
@@ -84,15 +88,47 @@ void lambdas() {
 co_return x + co_await foo_coro(b);
   };
 }
+
+Co lambda_captures() {
+  int a = 1;
+  // Temporary lambda object dies.
+  auto lamb = [a](int x, const int& y) -> Co { // expected-warning 
{{temporary whose address is used as value of local variable 'lamb'}}
+co_return x + y + a;
+  }(1, a);
+  // Object dies but it has no capture.
+  auto no_capture = []() -> Co { co_return 1; }();
+  auto bad_no_capture = [](const int& a) -> Co { co_return a; }(1); // 
expected-warning {{temporary}}
+  // Temporary lambda object with lifetime extension under co_await.
+  int res = co_await [a](int x, const int& y) -> Co {
+co_return x + y + a;
+  }(1, a);
+  co_return 1;
+}
+} // namespace lambdas
+
 // 
=
-// Safe usage when parameters are value
+// Member coroutines
 // 
=
-namespace by_value {
-Co value_coro(int b) { co_return co_await foo_coro(b); }
-[[clang::coro_wrapper]] Co wrapper1(int b) { return value_coro(b); }
-[[clang::coro_wrapper]] Co wrapper2(const int& b) { return value_coro(b); 
}
+namespace member_coroutines{
+struct S {
+  Co member(const int& a) { co_return a; }  
+};
+
+Co use() {
+  S s;
+  int a = 1;
+  auto test1 = s.member(1);  // expected-warning {{temporary whose address is used as value of local variable}}
+  auto test2 = s.member(a);
+  auto test3 = S{}.member(a);  // expected-warning {{temporary whose address is used as value of local variable}}
+  co_return 1;
 }
 
+[[clang::coro_wrapper]] Co wrapper(const int& a) {
+  S s;
+  return s.member(a); // expected-warning {{address of stack memory}}
+}
+} // member_coroutines
+
 // 
=
 // Lifetime bound but not a Coroutine Return Type: No

[clang] [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (PR #77066)

2024-01-18 Thread Utkarsh Saxena via cfe-commits


https://github.com/usx95 updated https://github.com/llvm/llvm-project/pull/77066

>From 3e0d0ab6c4fc6cba68285816a95e423bc18e8e55 Mon Sep 17 00:00:00 2001
From: Utkarsh Saxena 
Date: Fri, 5 Jan 2024 10:11:20 +0100
Subject: [PATCH 01/17] [coroutines] Detect lifetime issues with coroutine
 lambda captures

---
 clang/lib/Sema/SemaInit.cpp   | 20 +--
 clang/test/SemaCXX/coro-lifetimebound.cpp | 64 +--
 2 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 60c0e3e74204ec..c100bf11454786 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -12,6 +12,7 @@
 
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/DeclObjC.h"
+#include "clang/AST/Expr.h"
 #include "clang/AST/ExprCXX.h"
 #include "clang/AST/ExprObjC.h"
 #include "clang/AST/ExprOpenMP.h"
@@ -33,6 +34,7 @@
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -7575,15 +7577,27 @@ static void 
visitLifetimeBoundArguments(IndirectLocalPath &Path, Expr *Call,
 Path.pop_back();
   };
 
-  if (ObjectArg && implicitObjectParamIsLifetimeBound(Callee))
-VisitLifetimeBoundArg(Callee, ObjectArg);
-
   bool CheckCoroCall = false;
   if (const auto *RD = Callee->getReturnType()->getAsRecordDecl()) {
 CheckCoroCall = RD->hasAttr() &&
 RD->hasAttr() &&
 !Callee->hasAttr();
   }
+
+  if (ObjectArg) {
+bool CheckCoroObjArg = CheckCoroCall;
+// Ignore `__promise.get_return_object()` as it not lifetimebound.
+if (Callee->getDeclName().isIdentifier() &&
+Callee->getName() == "get_return_object")
+  CheckCoroObjArg = false;
+// Coroutine lambda objects with empty capture list are not lifetimebound.
+if (auto *LE = dyn_cast(ObjectArg->IgnoreImplicit());
+LE && LE->captures().empty())
+  CheckCoroObjArg = false;
+if (implicitObjectParamIsLifetimeBound(Callee) || CheckCoroObjArg)
+  VisitLifetimeBoundArg(Callee, ObjectArg);
+  }
+
   for (unsigned I = 0,
 N = std::min(Callee->getNumParams(), Args.size());
I != N; ++I) {
diff --git a/clang/test/SemaCXX/coro-lifetimebound.cpp 
b/clang/test/SemaCXX/coro-lifetimebound.cpp
index 3fc7ca70a14a12..319134450e4b6f 100644
--- a/clang/test/SemaCXX/coro-lifetimebound.cpp
+++ b/clang/test/SemaCXX/coro-lifetimebound.cpp
@@ -64,6 +64,10 @@ Co bar_coro(const int &b, int c) {
   : bar_coro(0, 1); // expected-warning {{returning address of local 
temporary object}}
 }
 
+// 
=
+// Lambdas
+// 
=
+namespace lambdas {
 void lambdas() {
   auto unsafe_lambda = [] [[clang::coro_wrapper]] (int b) {
 return foo_coro(b); // expected-warning {{address of stack memory 
associated with parameter}}
@@ -84,15 +88,47 @@ void lambdas() {
 co_return x + co_await foo_coro(b);
   };
 }
+
+Co lambda_captures() {
+  int a = 1;
+  // Temporary lambda object dies.
+  auto lamb = [a](int x, const int& y) -> Co { // expected-warning 
{{temporary whose address is used as value of local variable 'lamb'}}
+co_return x + y + a;
+  }(1, a);
+  // Object dies but it has no capture.
+  auto no_capture = []() -> Co { co_return 1; }();
+  auto bad_no_capture = [](const int& a) -> Co { co_return a; }(1); // 
expected-warning {{temporary}}
+  // Temporary lambda object with lifetime extension under co_await.
+  int res = co_await [a](int x, const int& y) -> Co {
+co_return x + y + a;
+  }(1, a);
+  co_return 1;
+}
+} // namespace lambdas
+
 // 
=
-// Safe usage when parameters are value
+// Member coroutines
 // 
=
-namespace by_value {
-Co value_coro(int b) { co_return co_await foo_coro(b); }
-[[clang::coro_wrapper]] Co wrapper1(int b) { return value_coro(b); }
-[[clang::coro_wrapper]] Co wrapper2(const int& b) { return value_coro(b); 
}
+namespace member_coroutines{
+struct S {
+  Co member(const int& a) { co_return a; }  
+};
+
+Co use() {
+  S s;
+  int a = 1;
+  auto test1 = s.member(1);  // expected-warning {{temporary whose address is 
used as value of local variable}}
+  auto test2 = s.member(a);
+  auto test3 = S{}.member(a);  // expected-warning {{temporary whose address 
is used as value of local variable}}
+  co_return 1;
 }
 
+[[clang::coro_wrapper]] Co wrapper(const int& a) {
+  S s;
+  return s.member(a); // expected-warning {{address of stack memory}}
+}
+} // member_coroutines
+
 // 
=
 // Lifetime bound but not a Coroutine Return Type: No

[clang] 667e58a - [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (#77066)

2024-01-18 Thread via cfe-commits


Author: Utkarsh Saxena
Date: 2024-01-18T11:56:55+01:00
New Revision: 667e58a72e0d81abe0ab3500b5d5563b6a598e7f

URL: 
https://github.com/llvm/llvm-project/commit/667e58a72e0d81abe0ab3500b5d5563b6a598e7f
DIFF: 
https://github.com/llvm/llvm-project/commit/667e58a72e0d81abe0ab3500b5d5563b6a598e7f.diff

LOG: [coroutines][coro_lifetimebound] Detect lifetime issues with lambda 
captures (#77066)

### Problem

```cpp
co_task coro() {
  int a = 1;
  auto lamb = [a]() -> co_task {
    // 'a' in the lambda object dies after the initial_suspend in the lambda coroutine.
    co_return a;
  }();
  co_return co_await lamb;
}
```
[use-after-free](https://godbolt.org/z/GWPEovWWc)

Lambda captures (even by value) are prone to use-after-free once the
lambda object dies. In the above example, the lambda object appears only
as a temporary in the call expression. It dies after the first
suspension (`initial_suspend`) in the lambda.
On resumption in `co_await lamb`, the lambda accesses `a` which is part
of the already-dead lambda object.

---

### Solution

This problem can be formulated by saying that the `this` parameter of
the lambda call operator is a lifetimebound parameter. The lambda object
argument should therefore live at least as long as the return object.
That said, this requirement does not hold if the lambda does not have a
capture list. In principle, the coroutine frame still has a reference to
a dead lambda object, but it is easy to see that the object would not be
used in the lambda-coroutine body due to no capture list.

It is safe to use this pattern inside a `co_await` expression due to the
lifetime extension of temporaries. Example:

```cpp
co_task coro() {
  int a = 1;
  int res = co_await [a]() -> co_task { co_return a; }();
  co_return res;
}
```
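
The lifetimebound formulation above can be modeled without coroutines. The sketch below is only an analogy, not code from the patch: `Closure`, `start` and `resume_with_named_closure` are hypothetical names, and `std::function` stands in for the coroutine return object. It assumes the coroutine frame refers to the lambda object through `this` rather than copying the captures, which is what the description above implies.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a lambda closure object with one by-value capture.
struct Closure {
  int a;

  // Stand-in for the lambda's call operator when the lambda is a coroutine:
  // the returned object keeps `this`, much like the coroutine frame keeps a
  // reference to the lambda object instead of copying its captures.
  std::function<int()> start() const {
    return [this] { return a; };
  }
};

// Safe: the closure is a named local, so it outlives the object returned by
// start() and the later "resumption" reads a live `a`.
int resume_with_named_closure() {
  Closure closure{42};
  std::function<int()> resume = closure.start();
  return resume();
}

// Unsafe shape, deliberately left as a comment: the temporary Closure dies at
// the end of the full-expression, so calling `dangling` would read a dead
// object. This is the shape the new warning is meant to flag for coroutines.
//   std::function<int()> dangling = Closure{42}.start();
```

Keeping the closure in a named variable is the same remedy as keeping the lambda object alive for as long as the returned coroutine object is used.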
---
### Background

This came up in the discussion with seastar folks on
[RFC](https://discourse.llvm.org/t/rfc-lifetime-bound-check-for-parameters-of-coroutines/74253/19?u=usx95).
This is a fairly common pattern in continuation-passing-style (CPS)
async programming involving futures and continuations. The document ["Lambda
coroutine
fiasco"](https://github.com/scylladb/seastar/blob/master/doc/lambda-coroutine-fiasco.md)
by Seastar captures the problem.
This pattern makes the migration from CPS-style async programming to
coroutines very bug-prone.


Fixes https://github.com/llvm/llvm-project/issues/76995

-

Co-authored-by: Chuanqi Xu 

Added: 


Modified: 
clang/include/clang/Sema/Sema.h
clang/lib/Sema/SemaCoroutine.cpp
clang/lib/Sema/SemaDecl.cpp
clang/lib/Sema/SemaInit.cpp
clang/test/SemaCXX/coro-lifetimebound.cpp
clang/test/SemaCXX/coro-return-type-and-wrapper.cpp

Removed: 




diff  --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 6ce422d66ae5b0e..0db39333b0ee347 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11249,6 +11249,11 @@ class Sema final {
   VarDecl *buildCoroutinePromise(SourceLocation Loc);
   void CheckCompletedCoroutineBody(FunctionDecl *FD, Stmt *&Body);
 
+  // Heuristically tells if the function is `get_return_object` member of a
+  // coroutine promise_type by matching the function name.
+  static bool CanBeGetReturnObject(const FunctionDecl *FD);
+  static bool CanBeGetReturnTypeOnAllocFailure(const FunctionDecl *FD);
+
   // As a clang extension, enforces that a non-coroutine function must be 
marked
   // with [[clang::coro_wrapper]] if it returns a type marked with
   // [[clang::coro_return_type]].

diff  --git a/clang/lib/Sema/SemaCoroutine.cpp 
b/clang/lib/Sema/SemaCoroutine.cpp
index bee80db8d166a68..0e0f8f67dcd73e1 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -16,6 +16,7 @@
 #include "CoroutineStmtBuilder.h"
 #include "clang/AST/ASTLambda.h"
 #include "clang/AST/Decl.h"
+#include "clang/AST/Expr.h"
 #include "clang/AST/ExprCXX.h"
 #include "clang/AST/StmtCXX.h"
 #include "clang/Basic/Builtins.h"

diff  --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 5472b43aafd4f39..eb28631ee6939fd 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -15912,13 +15912,26 @@ static void diagnoseImplicitlyRetainedSelf(Sema &S) {
   << FixItHint::CreateInsertion(P.first, "self->");
 }
 
+static bool methodHasName(const FunctionDecl *FD, StringRef Name) {
+  return isa(FD) && FD->param_empty() &&
+ FD->getDeclName().isIdentifier() && FD->getName().equals(Name);
+}
+
+bool Sema::CanBeGetReturnObject(const FunctionDecl *FD) {
+  return methodHasName(FD, "get_return_object");
+}
+
+bool Sema::CanBeGetReturnTypeOnAllocFailure(const FunctionDecl *FD) {
+  return FD->isStatic() &&
+ methodHasName(FD, "get_return_object_on_allocation_failure");
+}
+
 void Sema::CheckCoroutineWrapper(FunctionDecl *FD) {
   RecordDecl *RD = FD->getReturnType()->getAsRec

[libcxx] [llvm] [clang-tools-extra] [libc] [clang] [flang] [compiler-rt] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-18 Thread Mariusz Sikora via cfe-commits



@@ -1,56 +1,244 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -march=amdgcn -mcpu=gfx1200 -verify-machineinstrs < %s | FileCheck 
-check-prefix=GFX12 %s
-; RUN: llc -march=amdgcn -global-isel=1 -mcpu=gfx1200 -verify-machineinstrs < 
%s | FileCheck -check-prefix=GFX12 %s
+; RUN: llc -march=amdgcn -mcpu=gfx1200 -verify-machineinstrs < %s | FileCheck 
-check-prefix=GFX12-SDAG %s

mariusz-sikora-at-amd wrote:

Done

https://github.com/llvm/llvm-project/pull/75917

[clang] [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (PR #77066)

2024-01-18 Thread Utkarsh Saxena via cfe-commits


https://github.com/usx95 closed https://github.com/llvm/llvm-project/pull/77066

[compiler-rt] [llvm] [clang] [lld] [libcxx] [clang-tools-extra] [lldb] [libc] [flang] [clang] Fix assertion failure with deleted overloaded unary operators (PR #78316)

2024-01-18 Thread Mariya Podchishchaeva via cfe-commits


https://github.com/Fznamznon updated 
https://github.com/llvm/llvm-project/pull/78316

>From cf33d7ce01aafe0fa29b8a38a9824a0b03d24f05 Mon Sep 17 00:00:00 2001
From: "Podchishchaeva, Mariya" 
Date: Tue, 16 Jan 2024 09:16:10 -0800
Subject: [PATCH 1/4] [clang] Fix assertion failure with deleted overloaded
 unary operators

When emitting notes related to wrong number of arguments do not consider
implicit object argument.

Fixes https://github.com/llvm/llvm-project/issues/78314
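
For context, a member increment or decrement operator never lists the object it acts on as an explicit parameter, so the prefix form has zero explicit parameters and the postfix form has one dummy `int`. The sketch below illustrates that arity with a hypothetical `Counter` type; it is not taken from the patch or its tests.

```cpp
#include <cassert>

// Hypothetical type used only to show the two operator forms.
struct Counter {
  int value;

  // Prefix form: zero explicit parameters; the object is the implicit argument.
  Counter &operator--() {
    --value;
    return *this;
  }

  // Postfix form: one dummy int parameter distinguishes it from the prefix form.
  Counter operator--(int) {
    Counter old = *this;
    --value;
    return old;
  }
};

// Returns the value seen through the result of the prefix operator.
int prefix_result() {
  Counter c{5};
  return (--c).value;
}

// Returns the value seen through the result of the postfix operator.
int postfix_result() {
  Counter c{5};
  return (c--).value;
}
```

Counting the implicit object as an argument would make `--c` look like a call with one argument, which is the mismatch the notes in the test expectations below describe.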
---
 clang/docs/ReleaseNotes.rst|  2 ++
 clang/lib/Sema/SemaOverload.cpp|  4 ++--
 clang/test/SemaCXX/overloaded-operator.cpp | 27 ++
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6e31849ce16dd4..8382e5d55f6c6e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -750,6 +750,8 @@ Bug Fixes in This Version
   Fixes (`#77583 `_)
 - Fix an issue where CTAD fails for function-type/array-type arguments.
   Fixes (`#51710 `_)
+- Fixed assertion failure with deleted overloaded unary operators.
+  Fixes (`#78314 `_)
 
 Bug Fixes to Compiler Builtins
 ^^
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 37c62b306b3cd3..83ab7cb0f3411b 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -14310,8 +14310,8 @@ Sema::CreateOverloadedUnaryOp(SourceLocation OpLoc, 
UnaryOperatorKind Opc,
 PartialDiagnosticAt(OpLoc, PDiag(diag::err_ovl_deleted_oper)
<< UnaryOperator::getOpcodeStr(Opc)
<< Input->getSourceRange()),
-*this, OCD_AllCandidates, ArgsArray, UnaryOperator::getOpcodeStr(Opc),
-OpLoc);
+*this, OCD_AllCandidates, ArgsArray.slice(1),
+UnaryOperator::getOpcodeStr(Opc), OpLoc);
 return ExprError();
   }
 
diff --git a/clang/test/SemaCXX/overloaded-operator.cpp 
b/clang/test/SemaCXX/overloaded-operator.cpp
index 83a7e65b43dd01..60332019f516cf 100644
--- a/clang/test/SemaCXX/overloaded-operator.cpp
+++ b/clang/test/SemaCXX/overloaded-operator.cpp
@@ -598,3 +598,30 @@ namespace B {
 }
 void g(B::X x) { A::f(x); }
 }
+
+namespace GH78314 {
+
+class a {
+public:
+  void operator--() = delete; // expected-note {{candidate function has been 
explicitly deleted}} \
+  // expected-note {{candidate function not 
viable: requires 0 arguments, but 1 was provided}}
+  void operator--(int) = delete; // expected-note {{candidate function has 
been explicitly deleted}} \
+ // expected-note {{candidate function not 
viable: requires 1 argument, but 0 were provided}}
+};
+
+void foo() {
+  a aa;
+  --aa; // expected-error {{overload resolution selected deleted operator 
'--'}}
+  aa--; // expected-error {{overload resolution selected deleted operator 
'--'}}
+}
+
+class b {
+  void operator++() = delete; // expected-note {{candidate function has been 
explicitly deleted}}
+  template  void operator++(int) { // expected-note {{function template 
not viable: requires 1 argument, but 0 were provided}}
+b bb;
+++bb; // expected-error {{overload resolution selected deleted operator 
'++'}}
+  }
+};
+
+
+}

>From 03daf97e74c05c1fa0c0c4b1637cbc76d3184404 Mon Sep 17 00:00:00 2001
From: "Podchishchaeva, Mariya" 
Date: Wed, 17 Jan 2024 02:30:04 -0800
Subject: [PATCH 2/4] Add a test with explicit object parameter

---
 clang/test/SemaCXX/overloaded-operator.cpp | 30 ++
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/clang/test/SemaCXX/overloaded-operator.cpp 
b/clang/test/SemaCXX/overloaded-operator.cpp
index 60332019f516cf..887848c29b83c5 100644
--- a/clang/test/SemaCXX/overloaded-operator.cpp
+++ b/clang/test/SemaCXX/overloaded-operator.cpp
@@ -1,4 +1,6 @@
-// RUN: %clang_cc1 -fsyntax-only -verify -std=c++11 %s 
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,precxx23 -std=c++11 %s
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,cxx23 -std=c++23 %s
+
 class X { };
 
 X operator+(X, X);
@@ -33,7 +35,9 @@ struct A {
 
 A make_A();
 
-bool operator==(A&, Z&); // expected-note 3{{candidate function}}
+bool operator==(A&, Z&); // expected-note 3{{candidate function}} \
+ // cxx23-note 2{{candidate function}}
+
 
 void h(A a, const A ac, Z z) {
   make_A() == z; // expected-warning{{equality comparison result unused}}
@@ -68,7 +72,9 @@ struct E2 {
 };
 
 // C++ [over.match.oper]p3 - enum restriction.
-float& operator==(E1, E2);  // expected-note{{candidate function}}
+float& operator==(E1, E2);  // expected-note{{candidate function}} \
+// cxx23-note{{candidate function}}
+
 
 void enum_test(Enum1 enum1,

[clang] [llvm] [clang-tools-extra] [Clang][C++23] Implement P2448R2: Relaxing some constexpr restrictions (PR #77753)

2024-01-18 Thread Mariya Podchishchaeva via cfe-commits


Fznamznon wrote:

@Endilll , are you ok with the changes?
BTW, do we want this for 18, or is it better to wait and merge after the cutoff?

https://github.com/llvm/llvm-project/pull/77753

[clang] [clang-tools-extra] [clang][NFC] Refactor `CXXNewExpr::InitializationStyle` (re-land) (PR #71417)

2024-01-18 Thread Balazs Benics via cfe-commits


steakhal wrote:

> Please let us know if this change is disruptive to you though, thanks!

I'm not really well versed in C++ initialization, so I asked my colleague, 
@tomasz-kaminski-sonarsource, to double-check how `CXXNewExpr` initialization 
is done per the standard.
I'll try to summarize what we discussed:

The enum we had in the past described the syntax of the new expression. 
However, after the introduction of the `Implicit` style kind, this enum tries 
to encode the semantic aspect as well.

Note that the name of the previous kind `CallInit` was misleading; it 
should have been called `ParenInit`, denoting that `(...)` was used in the 
spelling. So, in the past, `InitializationStyle` did not try to encode whether 
an actual call would be present.

To illustrate this, here is a small example:
```c++
struct Derived {
  int data1;
  int data2;
};
void top() {
  // Trailing comments list: CURRENT STYLE, EXPECTED BEHAVIOR
  // const auto *p = new int(); // Call (aka. parens), good (zero inits)
  // const auto *p = new int; // None, good (uninitialized)
  // const auto *p = new Derived; // Implicit, but still leaves everything uninitialized
  // const auto *p = new int[10]; // None, good (uninitialized)
  // const auto *p = new Derived[10]; // Implicit, but still leaves everything uninitialized
}
```
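
The difference between the spellings in the example above can be checked directly for the parenthesized form. This is a minimal sketch with hypothetical helper names (`Pair`, `value_initialized_int`, and so on); it only reads memory that the language guarantees to be initialized, because reading the result of `new int` or `new Pair` before writing to it would be undefined behavior.

```cpp
#include <cassert>

// Hypothetical aggregate mirroring the shape of `Derived` above.
struct Pair {
  int data1;
  int data2;
};

// `new int()` value-initializes, which zero-initializes the int.
int value_initialized_int() {
  int *p = new int();
  int v = *p;
  delete p;
  return v;
}

// `new Pair()` value-initializes the aggregate, so both members are zero.
int value_initialized_pair_sum() {
  Pair *p = new Pair();
  int sum = p->data1 + p->data2;
  delete p;
  return sum;
}

// `new int` default-initializes: the value is indeterminate until written,
// so this helper assigns before it reads.
int default_initialized_then_assigned() {
  int *p = new int;
  *p = 7;
  int v = *p;
  delete p;
  return v;
}
```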

https://github.com/llvm/llvm-project/pull/71417

[clang] [llvm] [clang-tools-extra] [Clang][C++23] Implement P2448R2: Relaxing some constexpr restrictions (PR #77753)

2024-01-18 Thread Vlad Serebrennikov via cfe-commits


https://github.com/Endilll approved this pull request.


https://github.com/llvm/llvm-project/pull/77753

[clang] [llvm] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sam Tebbs via cfe-commits


https://github.com/SamTebbs33 updated 
https://github.com/llvm/llvm-project/pull/77936

>From 7314429a203900a8f555e1b0471fdd4cfd4d8d03 Mon Sep 17 00:00:00 2001
From: Samuel Tebbs 
Date: Wed, 10 Jan 2024 14:57:04 +
Subject: [PATCH 1/7] [Clang][SME] Detect always_inline used with mismatched
 streaming attributes

This patch adds an error that is emitted when a streaming function is
marked as always_inline and is called from a non-streaming function.
---
 .../clang/Basic/DiagnosticFrontendKinds.td|  2 ++
 clang/include/clang/Sema/Sema.h   |  9 +++
 clang/lib/CodeGen/CMakeLists.txt  |  1 +
 clang/lib/CodeGen/Targets/AArch64.cpp | 20 ++
 clang/lib/Sema/SemaChecking.cpp   | 27 +++
 ...-sme-func-attrs-inline-locally-streaming.c | 12 +
 .../aarch64-sme-func-attrs-inline-streaming.c | 12 +
 7 files changed, 66 insertions(+), 17 deletions(-)
 create mode 100644 
clang/test/CodeGen/aarch64-sme-func-attrs-inline-locally-streaming.c
 create mode 100644 clang/test/CodeGen/aarch64-sme-func-attrs-inline-streaming.c

diff --git a/clang/include/clang/Basic/DiagnosticFrontendKinds.td 
b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
index 85ecfdf9de62d4..2d0f971858840d 100644
--- a/clang/include/clang/Basic/DiagnosticFrontendKinds.td
+++ b/clang/include/clang/Basic/DiagnosticFrontendKinds.td
@@ -279,6 +279,8 @@ def err_builtin_needs_feature : Error<"%0 needs target 
feature %1">;
 def err_function_needs_feature : Error<
   "always_inline function %1 requires target feature '%2', but would "
   "be inlined into function %0 that is compiled without support for '%2'">;
+def err_function_alwaysinline_attribute_mismatch : Error<
+  "always_inline function %1 and its caller %0 have mismatched %2 attributes">;
 
 def warn_avx_calling_convention
 : Warning<"AVX vector %select{return|argument}0 of type %1 without '%2' "
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 6ce422d66ae5b0..dd75b5aad3d9c8 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -13832,8 +13832,17 @@ class Sema final {
 FormatArgumentPassingKind ArgPassingKind;
   };
 
+enum ArmStreamingType {
+  ArmNonStreaming,
+  ArmStreaming,
+  ArmStreamingCompatible,
+  ArmStreamingOrSVE2p1
+};
+
+
   static bool getFormatStringInfo(const FormatAttr *Format, bool IsCXXMember,
   bool IsVariadic, FormatStringInfo *FSI);
+  static ArmStreamingType getArmStreamingFnType(const FunctionDecl *FD);
 
 private:
   void CheckArrayAccess(const Expr *BaseExpr, const Expr *IndexExpr,
diff --git a/clang/lib/CodeGen/CMakeLists.txt b/clang/lib/CodeGen/CMakeLists.txt
index 52216d93a302bb..03a6f2f1d7a9d2 100644
--- a/clang/lib/CodeGen/CMakeLists.txt
+++ b/clang/lib/CodeGen/CMakeLists.txt
@@ -151,4 +151,5 @@ add_clang_library(clangCodeGen
   clangFrontend
   clangLex
   clangSerialization
+  clangSema
   )
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp 
b/clang/lib/CodeGen/Targets/AArch64.cpp
index ee7f95084d2e0b..4018f91422e358 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -8,6 +8,8 @@
 
 #include "ABIInfoImpl.h"
 #include "TargetInfo.h"
+#include "clang/Basic/DiagnosticFrontend.h"
+#include "clang/Sema/Sema.h"
 
 using namespace clang;
 using namespace clang::CodeGen;
@@ -155,6 +157,11 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo {
 }
 return TargetCodeGenInfo::isScalarizableAsmOperand(CGF, Ty);
   }
+
+  void checkFunctionCallABI(CodeGenModule &CGM, SourceLocation CallLoc,
+const FunctionDecl *Caller,
+const FunctionDecl *Callee,
+const CallArgList &Args) const override;
 };
 
 class WindowsAArch64TargetCodeGenInfo : public AArch64TargetCodeGenInfo {
@@ -814,6 +821,19 @@ Address AArch64ABIInfo::EmitMSVAArg(CodeGenFunction &CGF, 
Address VAListAddr,
   /*allowHigherAlign*/ false);
 }
 
+void AArch64TargetCodeGenInfo::checkFunctionCallABI(
+CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller,
+const FunctionDecl *Callee, const CallArgList &Args) const {
+if (!Callee->hasAttr())
+  return;
+
+auto CalleeIsStreaming = Sema::getArmStreamingFnType(Callee) == 
Sema::ArmStreaming;
+auto CallerIsStreaming = Sema::getArmStreamingFnType(Caller) == 
Sema::ArmStreaming;
+
+if (CalleeIsStreaming && !CallerIsStreaming)
+CGM.getDiags().Report(CallLoc, 
diag::err_function_alwaysinline_attribute_mismatch) << Caller->getDeclName() << 
Callee->getDeclName() << "streaming";
+}
+
 std::unique_ptr
 CodeGen::createAArch64TargetCodeGenInfo(CodeGenModule &CGM,
 AArch64ABIKind Kind) {
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index ace3e386988f00..a92db7d67e1cbd 100644
--- a/clang/lib/Sem

[clang] [llvm] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sam Tebbs via cfe-commits



@@ -3145,7 +3138,7 @@ bool Sema::ParseSVEImmChecks(
   return HasError;
 }
 
-static ArmStreamingType getArmStreamingFnType(const FunctionDecl *FD) {
+Sema::ArmStreamingType Sema::getArmStreamingFnType(const FunctionDecl *FD) {

SamTebbs33 wrote:

Done.

https://github.com/llvm/llvm-project/pull/77936

[llvm] [clang] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sam Tebbs via cfe-commits



@@ -812,6 +819,24 @@ Address AArch64ABIInfo::EmitMSVAArg(CodeGenFunction &CGF, 
Address VAListAddr,
   /*allowHigherAlign*/ false);
 }
 
+void AArch64TargetCodeGenInfo::checkFunctionCallABI(
+CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller,
+const FunctionDecl *Callee, const CallArgList &Args) const {
+  if (!Callee->hasAttr())
+return;
+
+  auto CalleeStreamingMode = Sema::getArmStreamingFnType(Callee);
+  auto CallerStreamingMode = Sema::getArmStreamingFnType(Caller);
+
+  // The caller can inline the callee if their streaming modes match or the
+  // callee is streaming compatible
+  if (CalleeStreamingMode != CallerStreamingMode &&

SamTebbs33 wrote:

Done, thanks for the idea.
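
The rule stated in the quoted comment (inlining is allowed when the streaming modes match or the callee is streaming compatible) can be written as a small standalone predicate. This is a sketch under assumptions: `StreamingMode` and `canAlwaysInline` are hypothetical names that only model the comment, and the quoted hunk is truncated, so the exact condition in the patch may differ.

```cpp
#include <cassert>

// Hypothetical model of the streaming modes discussed in the review.
enum class StreamingMode { NonStreaming, Streaming, StreamingCompatible };

// Returns true when an always_inline callee may be inlined into the caller
// according to the rule in the comment: the modes match, or the callee is
// streaming compatible.
bool canAlwaysInline(StreamingMode caller, StreamingMode callee) {
  return callee == caller || callee == StreamingMode::StreamingCompatible;
}
```

Under this model, a streaming callee called from a non-streaming caller is rejected, which matches the case the original patch description set out to diagnose.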

https://github.com/llvm/llvm-project/pull/77936

[llvm] [clang] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sam Tebbs via cfe-commits



@@ -153,6 +155,11 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo {
 }
 return TargetCodeGenInfo::isScalarizableAsmOperand(CGF, Ty);
   }
+
+  void checkFunctionCallABI(CodeGenModule &CGM, SourceLocation CallLoc,

SamTebbs33 wrote:

It's called from CodeGenFunction::EmitCall in CGCall.cpp. x86 implements this 
function as well.

https://github.com/llvm/llvm-project/pull/77936

[clang] [llvm] [clang-tools-extra] [Clang][C++23] Implement P2448R2: Relaxing some constexpr restrictions (PR #77753)

2024-01-18 Thread via cfe-commits


cor3ntin wrote:

@Fznamznon I think that unless bots find something, this is good for 18. I 
don't see anything that concerns me

https://github.com/llvm/llvm-project/pull/77753

[clang] [clang-tools-extra] [clang][NFC] Refactor `CXXNewExpr::InitializationStyle` (re-land) (PR #71417)

2024-01-18 Thread Vlad Serebrennikov via cfe-commits


Endilll wrote:

> The enum we had in the past described the syntax of the new expression.

Even if it was the case at some point, I'm not sure it held when I created the 
PR, which eliminated this kind of nasty mapping, encoding how this enum was 
actually used:
```cpp
 CXXNewExprBits.StoredInitializationStyle =
  Initializer ? InitializationStyle + 1 : 0;
```
```cpp
  InitializationStyle getInitializationStyle() const {
    if (CXXNewExprBits.StoredInitializationStyle == 0)
      return NoInit;
    return static_cast<InitializationStyle>(
        CXXNewExprBits.StoredInitializationStyle - 1);
```

https://github.com/llvm/llvm-project/pull/71417

[clang] [llvm] [Clang][SME] Detect always_inline used with mismatched streaming attributes (PR #77936)

2024-01-18 Thread Sander de Smalen via cfe-commits


sdesmalen-arm wrote:

> There's no general rule that forbids taking the address of an always_inline 
> function. So if a user really wants to, they can call a mismatched 
> always_inline function anyway. Given that, making this a hard error seems a 
> bit dubious; it should probably be a warning instead.

FWIW, the GNU documentation on `always_inline` says:
> Failure to inline such a function is diagnosed as an error. Note that if such 
> a function is called indirectly the compiler may or may not inline it 
> depending on optimization level and a failure to inline an indirect call may 
> or may not be diagnosed.

Clang and GCC emit an error for e.g. mismatching target attributes: 
https://godbolt.org/z/ddThn67sa

https://github.com/llvm/llvm-project/pull/77936

[clang-tools-extra] [libc] [openmp] [llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Piotr Sobczak via cfe-commits


piotrAMD wrote:

Rebased and regenerated lit tests after GFX12 waitcnt codegen changes.

https://github.com/llvm/llvm-project/pull/2

[clang] [Clang] Implement CWG2598: Union of non-literal types (PR #78195)

2024-01-18 Thread via cfe-commits


groundswellaudio wrote:

@Fznamznon 
@cor3ntin 
So https://github.com/llvm/llvm-project/pull/77753 doesn't solve the problem of 
an anonymous union not being literal when it should be, right? 

https://github.com/llvm/llvm-project/pull/78195

[clang] [Clang] [NFC] Remove default argument in ASTUnit.h (PR #78566)

2024-01-18 Thread via cfe-commits


https://github.com/Sirraide created 
https://github.com/llvm/llvm-project/pull/78566

This removes a default argument that is currently broken in C++23 mode due to 
`std::default_delete` now being `constexpr`. This is a known problem (see 
#74963, #59966, #69996, and a couple more), fixing which will probably take 
some time, so this at least makes it possible to compile `ASTUnit.h` in C++23 
mode.

Note that we can’t simply include the header that provides the definition of 
the class causing the problem either, as that would create a circular 
dependency.
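
The shape of the workaround can be sketched in isolation. This is not a reproduction of the C++23 failure: `Action`, `hasAction` and the two callers are hypothetical names, and the sketch only shows a declaration that takes a `std::unique_ptr` to a forward-declared class without a `= nullptr` default, with call sites passing `nullptr` explicitly once the class is complete.

```cpp
#include <cassert>
#include <memory>

class Action; // forward declaration only, as in a header that cannot include the definition

// Declared without a `= nullptr` default argument, so nothing in this
// declaration requires Action to be a complete type.
bool hasAction(std::unique_ptr<Action> act);

// The full definition becomes visible later, e.g. in the implementation file.
class Action {
public:
  int id = 0;
};

bool hasAction(std::unique_ptr<Action> act) { return act != nullptr; }

// Call sites now spell out the argument that used to be defaulted.
bool call_without_action() { return hasAction(/*act=*/nullptr); }

bool call_with_action() {
  return hasAction(std::unique_ptr<Action>(new Action()));
}
```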

>From 414b7c3275b3dd71405e73f5f9ed43cfcf4421d7 Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Thu, 18 Jan 2024 12:55:41 +0100
Subject: [PATCH] [NFC] Remove default argument

---
 clang/include/clang/Frontend/ASTUnit.h| 2 +-
 clang/tools/libclang/CIndexCodeCompletion.cpp | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/Frontend/ASTUnit.h 
b/clang/include/clang/Frontend/ASTUnit.h
index fe99b3d5adbfa0a..6af712afdcb6d8d 100644
--- a/clang/include/clang/Frontend/ASTUnit.h
+++ b/clang/include/clang/Frontend/ASTUnit.h
@@ -902,7 +902,7 @@ class ASTUnit {
 SourceManager &SourceMgr, FileManager &FileMgr,
 SmallVectorImpl &StoredDiagnostics,
 SmallVectorImpl &OwnedBuffers,
-std::unique_ptr Act = nullptr);
+std::unique_ptr Act);
 
   /// Save this translation unit to a file with the given name.
   ///
diff --git a/clang/tools/libclang/CIndexCodeCompletion.cpp 
b/clang/tools/libclang/CIndexCodeCompletion.cpp
index 196c64e61722746..3c5f390f6d888a9 100644
--- a/clang/tools/libclang/CIndexCodeCompletion.cpp
+++ b/clang/tools/libclang/CIndexCodeCompletion.cpp
@@ -765,7 +765,8 @@ clang_codeCompleteAt_Impl(CXTranslationUnit TU, const char 
*complete_filename,
 IncludeBriefComments, Capture,
 CXXIdx->getPCHContainerOperations(), *Results->Diag,
 Results->LangOpts, *Results->SourceMgr, *Results->FileMgr,
-Results->Diagnostics, Results->TemporaryBuffers);
+Results->Diagnostics, Results->TemporaryBuffers,
+/*SyntaxOnlyAction=*/nullptr);
 
   Results->DiagnosticsWrappers.resize(Results->Diagnostics.size());
 


[clang-tools-extra] [llvm] [clang] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits



@@ -819,10 +812,77 @@ class VPRecipeBase : public 
ilist_node_with_parent,
   }
\
   static inline bool classof(const VPRecipeBase *R) {  
\
 return R->getVPDefID() == VPDefID; 
\
+  }
\
+  static inline bool classof(const VPSingleDefRecipe *R) { 
\
+return R->getVPDefID() == VPDefID; 
\
   }
 
+/// VPSingleDef is a base class for recipes for modeling a sequence of one or
+/// more output IR that define a single result VPValue.
+class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
+public:
+  template 
+  VPSingleDefRecipe(const unsigned char SC, IterT Operands, DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this) {}
+
+  VPSingleDefRecipe(const unsigned char SC, ArrayRef Operands,
+DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this) {}
+
+  template 
+  VPSingleDefRecipe(const unsigned char SC, IterT Operands, Value *UV,
+DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this, UV) {}
+
+  static inline bool classof(const VPRecipeBase *R) {
+switch (R->getVPDefID()) {
+case VPRecipeBase::VPDerivedIVSC:
+case VPRecipeBase::VPExpandSCEVSC:
+case VPRecipeBase::VPInstructionSC:
+case VPRecipeBase::VPReductionSC:
+case VPRecipeBase::VPReplicateSC:
+case VPRecipeBase::VPScalarIVStepsSC:
+case VPRecipeBase::VPVectorPointerSC:
+case VPRecipeBase::VPWidenCallSC:
+case VPRecipeBase::VPWidenCanonicalIVSC:
+case VPRecipeBase::VPWidenCastSC:
+case VPRecipeBase::VPWidenGEPSC:
+case VPRecipeBase::VPWidenSC:
+case VPRecipeBase::VPWidenSelectSC:
+case VPRecipeBase::VPBlendSC:
+case VPRecipeBase::VPPredInstPHISC:
+case VPRecipeBase::VPCanonicalIVPHISC:
+case VPRecipeBase::VPActiveLaneMaskPHISC:
+case VPRecipeBase::VPFirstOrderRecurrencePHISC:
+case VPRecipeBase::VPWidenPHISC:
+case VPRecipeBase::VPWidenIntOrFpInductionSC:
+case VPRecipeBase::VPWidenPointerInductionSC:
+case VPRecipeBase::VPReductionPHISC:
+  return true;
+case VPRecipeBase::VPInterleaveSC:
+case VPRecipeBase::VPBranchOnMaskSC:
+case VPRecipeBase::VPWidenMemoryInstructionSC:
+  return false;
+}
+llvm_unreachable("Unhandled VPDefID");
+  }
+
+  static inline bool classof(const VPUser *U) {
+auto *R = dyn_cast(U);
+return R && classof(R);
+  }
+
+  /// Returns the underlying instruction.
+  Instruction *getUnderlyingInstr() {
+return cast(getUnderlyingValue());

ayalz wrote:

Must be non null (currently)? Same for const version below.

https://github.com/llvm/llvm-project/pull/77023

[clang] [llvm] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits


https://github.com/ayalz commented:

Some nits and thoughts.

https://github.com/llvm/llvm-project/pull/77023

[clang-tools-extra] [clang] [llvm] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits


https://github.com/ayalz edited https://github.com/llvm/llvm-project/pull/77023

[clang] [llvm] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits



@@ -12,6 +12,8 @@
 ///VPBlockBase, together implementing a Hierarchical CFG;
 /// 2. Pure virtual VPRecipeBase serving as the base class for recipes contained
 ///within VPBasicBlocks;
+/// 3. Pure virtual VPSingleDefRecipe serving as a base class for recipes that
+///also inherit from VPValue.
 /// 3. VPInstruction, a concrete Recipe and VPUser modeling a single planned
 ///instruction;
 /// 4. The VPlan class holding a candidate for vectorization;

ayalz wrote:

Also renumber 5 to 6 next; unfortunately this cannot be commented on directly, 
as it falls more than 3 lines away from the patched code:
```suggestion
/// 4. VPInstruction, a concrete Recipe and VPUser modeling a single planned
///instruction;
/// 5. The VPlan class holding a candidate for vectorization;
```


https://github.com/llvm/llvm-project/pull/77023

[clang] [llvm] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits



@@ -912,25 +970,28 @@ class VPRecipeWithIRFlags : public VPRecipeBase {
 } else if (auto *Op = dyn_cast(&I)) {
   OpType = OperationType::FPMathOp;
   FMFs = Op->getFastMathFlags();
+} else {
+  OpType = OperationType::Other;
+  AllFlags = 0;

ayalz wrote:

Future thought: wonder if VPRecipeWithIRFlags should be merged with 
VPRecipeBase, as effectively this allows the former to indicate it is w/o 
IRFlags. I.e., such flags are optional.

https://github.com/llvm/llvm-project/pull/77023

[llvm] [clang] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits



@@ -819,10 +812,77 @@ class VPRecipeBase : public ilist_node_with_parent,
   } \
   static inline bool classof(const VPRecipeBase *R) { \
     return R->getVPDefID() == VPDefID; \
+  } \
+  static inline bool classof(const VPSingleDefRecipe *R) { \
+    return R->getVPDefID() == VPDefID; \
   }
 
+/// VPSingleDef is a base class for recipes for modeling a sequence of one or
+/// more output IR that define a single result VPValue.
+class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
+public:
+  template 
+  VPSingleDefRecipe(const unsigned char SC, IterT Operands, DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this) {}
+
+  VPSingleDefRecipe(const unsigned char SC, ArrayRef Operands,
+DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this) {}
+
+  template 
+  VPSingleDefRecipe(const unsigned char SC, IterT Operands, Value *UV,
+DebugLoc DL = {})
+  : VPRecipeBase(SC, Operands, DL), VPValue(this, UV) {}
+
+  static inline bool classof(const VPRecipeBase *R) {
+switch (R->getVPDefID()) {
+case VPRecipeBase::VPDerivedIVSC:
+case VPRecipeBase::VPExpandSCEVSC:
+case VPRecipeBase::VPInstructionSC:
+case VPRecipeBase::VPReductionSC:
+case VPRecipeBase::VPReplicateSC:
+case VPRecipeBase::VPScalarIVStepsSC:
+case VPRecipeBase::VPVectorPointerSC:
+case VPRecipeBase::VPWidenCallSC:
+case VPRecipeBase::VPWidenCanonicalIVSC:
+case VPRecipeBase::VPWidenCastSC:
+case VPRecipeBase::VPWidenGEPSC:
+case VPRecipeBase::VPWidenSC:
+case VPRecipeBase::VPWidenSelectSC:
+case VPRecipeBase::VPBlendSC:
+case VPRecipeBase::VPPredInstPHISC:
+case VPRecipeBase::VPCanonicalIVPHISC:
+case VPRecipeBase::VPActiveLaneMaskPHISC:
+case VPRecipeBase::VPFirstOrderRecurrencePHISC:
+case VPRecipeBase::VPWidenPHISC:
+case VPRecipeBase::VPWidenIntOrFpInductionSC:
+case VPRecipeBase::VPWidenPointerInductionSC:
+case VPRecipeBase::VPReductionPHISC:
+  return true;
+case VPRecipeBase::VPInterleaveSC:
+case VPRecipeBase::VPBranchOnMaskSC:
+case VPRecipeBase::VPWidenMemoryInstructionSC:

ayalz wrote:

A widen store defines no value, so is not a VPSingleDefRecipe, but a widen load 
should be?
Can leave behind a TODO to address later, as part of splitting 
VPWidenMemoryInstruction between stores and loads (potentially having a common 
pure virtual base class to hold common stuff).
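
The kind-based `classof` dispatch under discussion can be sketched in isolation. This is a minimal, self-contained illustration, not the VPlan code: `RecipeKind`, `Recipe`, `SingleDefRecipe`, `WidenLoadRecipe`, `WidenStoreRecipe`, `isaRecipe` and `castRecipe` are all hypothetical stand-ins, and it assumes loads and stores already have separate kinds, which is the split suggested above rather than the current state.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for illustration; not the real VPlan types.
enum class RecipeKind { Widen, WidenLoad, WidenStore, BranchOnMask };

struct Recipe {
  explicit Recipe(RecipeKind K) : Kind(K) {}
  RecipeKind getKind() const { return Kind; }

private:
  RecipeKind Kind;
};

// Base for recipes that define exactly one value.
struct SingleDefRecipe : Recipe {
  explicit SingleDefRecipe(RecipeKind K) : Recipe(K) {}

  // Every kind is listed explicitly, so adding a new kind forces a decision.
  static bool classof(const Recipe *R) {
    switch (R->getKind()) {
    case RecipeKind::Widen:
    case RecipeKind::WidenLoad: // A load defines a single value.
      return true;
    case RecipeKind::WidenStore: // A store defines no value.
    case RecipeKind::BranchOnMask:
      return false;
    }
    return false; // Not reached for valid kinds.
  }
};

struct WidenLoadRecipe : SingleDefRecipe {
  WidenLoadRecipe() : SingleDefRecipe(RecipeKind::WidenLoad) {}
};

struct WidenStoreRecipe : Recipe {
  WidenStoreRecipe() : Recipe(RecipeKind::WidenStore) {}
};

// Kind query in the style of an isa<> check.
template <typename To> bool isaRecipe(const Recipe *R) {
  return To::classof(R);
}

// Checked downcast: the argument must be non-null and of a compatible kind.
template <typename To> const To *castRecipe(const Recipe *R) {
  assert(R && To::classof(R) && "cast on null or incompatible recipe");
  return static_cast<const To *>(R);
}
```

With separate kinds, `isaRecipe<SingleDefRecipe>` is true for the load and false for the store, which is the distinction a single combined memory-instruction kind cannot express.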

https://github.com/llvm/llvm-project/pull/77023

[llvm] [clang] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits



@@ -819,10 +812,77 @@ class VPRecipeBase : public ilist_node_with_parent,
   } \
   static inline bool classof(const VPRecipeBase *R) { \
     return R->getVPDefID() == VPDefID; \
+  } \
+  static inline bool classof(const VPSingleDefRecipe *R) { \
+    return R->getVPDefID() == VPDefID; \
   }
 
+/// VPSingleDef is a base class for recipes for modeling a sequence of one or
+/// more output IR that define a single result VPValue.

ayalz wrote:

/// Note that VPRecipeBase must be inherited from before VPValue.
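
One plausible reason for such an ordering constraint (an assumption here, not something stated in the thread) is that C++ constructs base classes in declaration order, so a value base whose constructor registers itself with the defining base needs that defining base to exist already. A minimal sketch with hypothetical stand-in types `Def`, `Value` and `SingleDef`:

```cpp
#include <cassert>
#include <vector>

struct Value;

// Hypothetical stand-in for the "definition" side: it tracks the values it defines.
struct Def {
  std::vector<Value *> Defined;
  void addDefinedValue(Value *V) { Defined.push_back(V); }
};

// Hypothetical stand-in for the "value" side: it registers itself on construction.
struct Value {
  explicit Value(Def *D) {
    assert(D && "a defining object is required");
    D->addDefinedValue(this);
  }
};

// Bases are constructed in declaration order, so Def (and its vector) is fully
// constructed before Value's constructor calls addDefinedValue on it.
// Declaring the bases as `Value, Def` would run Value's constructor first and
// touch an unconstructed vector.
struct SingleDef : Def, Value {
  SingleDef() : Def(), Value(this) {}
};
```

The order of the mem-initializers does not matter here; only the order in the base-specifier list does.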

https://github.com/llvm/llvm-project/pull/77023

[clang] [Clang] [NFC] Remove default argument in ASTUnit.h (PR #78566)

2024-01-18 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (Sirraide)


Changes

This removes a default argument that is currently broken in C++23 mode due to 
`std::default_delete` now being `constexpr`. This is a known problem (see #74963, #59966, #69996, and a couple more), fixing which will 
probably take some time, so this at least makes it possible to compile 
`ASTUnit.h` in C++23 mode.

Note that we can’t simply include the header that provides the definition of 
the class causing the problem either, as that would create a circular 
dependency.
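
The shape of the change can be sketched outside of Clang: the declaration loses its `= nullptr` default, and each caller passes the null pointer explicitly. `Action`, `runWithAction` and `callWithoutAction` are hypothetical names used only for illustration, and the type is complete at the call site in this sketch.

```cpp
#include <memory>

struct Action; // Only a forward declaration is visible to the declaration below.

// No default argument: the declaration itself never needs to create or
// destroy a std::unique_ptr<Action>.
bool runWithAction(std::unique_ptr<Action> Act);

// Elsewhere, the full definition becomes available.
struct Action {
  int Id = 0;
};

bool runWithAction(std::unique_ptr<Action> Act) { return Act != nullptr; }

// Call sites spell out the null action, with a comment naming the parameter.
bool callWithoutAction() { return runWithAction(/*Act=*/nullptr); }
```

The cost is one extra argument at each call site that previously relied on the default.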

---
Full diff: https://github.com/llvm/llvm-project/pull/78566.diff


2 Files Affected:

- (modified) clang/include/clang/Frontend/ASTUnit.h (+1-1) 
- (modified) clang/tools/libclang/CIndexCodeCompletion.cpp (+2-1) 


```diff
diff --git a/clang/include/clang/Frontend/ASTUnit.h b/clang/include/clang/Frontend/ASTUnit.h
index fe99b3d5adbfa0..6af712afdcb6d8 100644
--- a/clang/include/clang/Frontend/ASTUnit.h
+++ b/clang/include/clang/Frontend/ASTUnit.h
@@ -902,7 +902,7 @@ class ASTUnit {
 SourceManager &SourceMgr, FileManager &FileMgr,
 SmallVectorImpl &StoredDiagnostics,
 SmallVectorImpl &OwnedBuffers,
-std::unique_ptr Act = nullptr);
+std::unique_ptr Act);
 
   /// Save this translation unit to a file with the given name.
   ///
diff --git a/clang/tools/libclang/CIndexCodeCompletion.cpp b/clang/tools/libclang/CIndexCodeCompletion.cpp
index 196c64e6172274..3c5f390f6d888a 100644
--- a/clang/tools/libclang/CIndexCodeCompletion.cpp
+++ b/clang/tools/libclang/CIndexCodeCompletion.cpp
@@ -765,7 +765,8 @@ clang_codeCompleteAt_Impl(CXTranslationUnit TU, const char *complete_filename,
 IncludeBriefComments, Capture,
 CXXIdx->getPCHContainerOperations(), *Results->Diag,
 Results->LangOpts, *Results->SourceMgr, *Results->FileMgr,
-Results->Diagnostics, Results->TemporaryBuffers);
+Results->Diagnostics, Results->TemporaryBuffers,
+/*SyntaxOnlyAction=*/nullptr);
 
   Results->DiagnosticsWrappers.resize(Results->Diagnostics.size());
 

```




https://github.com/llvm/llvm-project/pull/78566

[clang] [Clang] [NFC] Remove default argument in ASTUnit.h (PR #78566)

2024-01-18 Thread via cfe-commits


Sirraide wrote:

CC @AaronBallman 

https://github.com/llvm/llvm-project/pull/78566

[clang] [Clang] [NFC] Remove default argument in ASTUnit.h (PR #78566)

2024-01-18 Thread via cfe-commits


Sirraide wrote:

Not sure why that one CI job got cancelled, honestly, but I don’t think I can 
re-run it.

https://github.com/llvm/llvm-project/pull/78566

[llvm] [clang] [clang-tools-extra] [VPlan] Introduce VPSingleDefRecipe. (PR #77023)

2024-01-18 Thread via cfe-commits


https://github.com/ayalz approved this pull request.

Looks good to me, with the above nits.

https://github.com/llvm/llvm-project/pull/77023

[llvm] [clang] [clang-tools-extra] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-18 Thread Rin Dobrescu via cfe-commits



@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
 RTCheckCost += C;
   }
-if (MemCheckBlock)
+if (MemCheckBlock) {
+  InstructionCost MemCheckCost = 0;
   for (Instruction &I : *MemCheckBlock) {
 if (MemCheckBlock->getTerminator() == &I)
   continue;
 InstructionCost C =
 TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
 LLVM_DEBUG(dbgs() << "  " << C << "  for " << I << "\n");
-RTCheckCost += C;
+MemCheckCost += C;
   }
 
+  // If the runtime memory checks are being created inside an outer loop
+  // we should find out if these checks are outer loop invariant. If so,
+  // the checks will likely be hoisted out and so the effective cost will
+  // reduce according to the outer loop trip count.
+  if (OuterLoop) {
+ScalarEvolution *SE = MemCheckExp.getSE();
+// TODO: We could refine this further by analysing every individual
+// memory check, since there could be a mixture of loop variant and
+// invariant checks that mean the final condition is variant. However,
+// I think it would need further analysis to prove this is beneficial.
+const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+if (SE->isLoopInvariant(Cond, OuterLoop)) {
+  // It seems reasonable to assume that we can reduce the effective
+  // cost of the checks even when we know nothing about the trip
+  // count. Here I've assumed that the outer loop executes at least
+  // twice.
+  unsigned BestTripCount = 2;
+
+  // If exact trip count is known use that.
+  if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+BestTripCount = SmallTC;
+  else if (LoopVectorizeWithBlockFrequency) {
+// Else use profile data if available.
+if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+  BestTripCount = *EstimatedTC;
+  }
+
+  InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+  // Let's ensure the cost is always at least 1.
+  NewMemCheckCost = std::max(*NewMemCheckCost.getValue(), (long)1);

Rin18 wrote:

There's a buildbot failure at this line. Has that been fixed? Might be worth 
getting that triggered again.
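
The arithmetic in the quoted hunk, and one portable way to write the final clamp, can be sketched as follows. This is an illustration under assumptions: `scaledMemCheckCost` is a hypothetical helper, costs are modelled as `std::int64_t`, and a trip count of 0 stands for "unknown". The actual cause of the buildbot failure is not stated in the thread; one plausible candidate is that `std::max` deduces a single type from both arguments, so mixing a 64-bit cost with a `(long)1` literal fails to compile on platforms where the two types differ.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring the quoted logic; not the LoopVectorize code.
// ExactTC and EstimatedTC use 0 to mean "not available".
std::int64_t scaledMemCheckCost(std::int64_t MemCheckCost, unsigned ExactTC,
                                unsigned EstimatedTC) {
  // With no information, assume the outer loop runs at least twice.
  unsigned BestTripCount = 2;
  if (ExactTC)
    BestTripCount = ExactTC; // An exact trip count wins.
  else if (EstimatedTC)
    BestTripCount = EstimatedTC; // Otherwise fall back to the estimate.

  std::int64_t Scaled = MemCheckCost / BestTripCount;

  // Naming the template argument avoids deduction conflicts between
  // std::int64_t and the literal, whatever std::int64_t aliases.
  return std::max<std::int64_t>(Scaled, 1);
}
```

For example, a check cost of 10 with no trip count information yields 5, and a cost of 1 with an exact trip count of 8 is clamped to 1 rather than rounding down to 0.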

https://github.com/llvm/llvm-project/pull/76034

[llvm] [clang] [clang-tools-extra] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-18 Thread Rin Dobrescu via cfe-commits


https://github.com/Rin18 edited https://github.com/llvm/llvm-project/pull/76034

[clang] [llvm] [clang-tools-extra] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-18 Thread Rin Dobrescu via cfe-commits


https://github.com/Rin18 commented:

One small comment, but otherwise LGTM! I'll leave someone else more familiar 
with the code to approve the change.

https://github.com/llvm/llvm-project/pull/76034

[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)

2024-01-18 Thread via cfe-commits


cor3ntin wrote:

That looks reasonable.
I really think we should land this, with checkpoints early in the clang 19 
cycle to get actual data.
The only thing we need to make sure of is that highlighting isn't done in 
CI/redirected output.

https://github.com/llvm/llvm-project/pull/66514

[clang] [llvm] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-18 Thread Mirko Brkušanin via cfe-commits



@@ -423,6 +423,67 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", 
"n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts")
 TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts")
 
+//===--===//
+// WMMA builtins.
+// Postfix w32 indicates the builtin requires wavefront size of 32.
+// Postfix w64 indicates the builtin requires wavefront size of 64.
+//
+// Some of these are very similar to their GFX11 counterparts, but they don't
+// require replication of the A,B matrices, so they use fewer vector elements.
+// Therefore, we add an "_gfx12" suffix to distinguish them from the existing
+// builtins.
+//===--===//
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w32_gfx12, 
"V8fV8hV8hV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32_gfx12, 
"V8fV8sV8sV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f16_16x16x16_f16_w32_gfx12, 
"V8hV8hV8hV8h", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32_gfx12, 
"V8sV8sV8sV8s", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32_gfx12, 
"V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32_gfx12, 
"V8iIbiIbiV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+// These are gfx12-only, but for consistency with the other WMMA variants we're
+// keeping the "_gfx12" suffix.
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_fp8_w32_gfx12, 
"V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_fp8_bf8_w32_gfx12, 
"V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_fp8_w32_gfx12, 
"V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf8_bf8_w32_gfx12, 
"V8fV2iV2iV8f", "nc", "gfx12-insts,wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_i32_16x16x32_iu4_w32_gfx12, 
"V8iIbV2iIbV2iV8iIb", "nc", "gfx12-insts,wavefrontsize32")
+
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_f16_w64_gfx12, 
"V4fV4hV4hV4f", "nc", "gfx12-insts,wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64_gfx12, 
"V4fV4sV4sV4f", "nc", "gfx12-insts,wavefrontsize64")

mbrkusanin wrote:

Updated to bfloat but GlobalISel does not handle it properly yet. Should we use 
i16 for now until we update GlobalISel?

https://github.com/llvm/llvm-project/pull/77795

[clang] [flang] [llvm] [lld] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits



@@ -4135,6 +4283,33 @@ Code object V5 metadata is the same as
 
  == == = 

 
+.. _amdgpu-amdhsa-code-object-metadata-v6:
+
+Code Object V6 Metadata

+
+.. warning::
+  Code object V6 is not the default code object version emitted by this version
+  of LLVM.
+
+
+Code object V6 metadata is the same as
+:ref:`amdgpu-amdhsa-code-object-metadata-v5` with the changes defined in table
+:ref:`amdgpu-amdhsa-code-object-metadata-map-table-v6`.
+
+  .. table:: AMDHSA Code Object V6 Metadata Map Changes
+ :name: amdgpu-amdhsa-code-object-metadata-map-table-v6
+
+ = == = 
===
+ String KeyValue Type Required? Description
+ = == = 
===
+ "amdhsa.version"  sequence ofRequired  - The first integer is the 
major

Pierre-vh wrote:

I anticipate that we'll want to add some more V6-only metadata at some point, 
which is why I started a new table, so it's easier to follow up. I don't 
mind merging it with the V5 table if you really prefer.

https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [lld] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits


Pierre-vh wrote:

I added a few more tests, I just didn't find how to test the flat-scratch stuff 
properly.
Also, gfx904 is documented as not having absolute flat scratch, yet I don't see 
anything about that in the code (no related feature). I put gfx9-generic with 
flat scratch but I don't know if that's correct at all, and how to test it?

https://github.com/llvm/llvm-project/pull/76955

[clang] [clang][Diagnostics] Highlight code snippets (PR #66514)

2024-01-18 Thread Timm Baeder via cfe-commits


tbaederr wrote:

> The only thing we need to make sure is that highlighting isn't done in 
> CI/redirected output

It now respects `DiagOpts->ShowColors`, so that should work. But I guess you're 
talking about a test. :)



https://github.com/llvm/llvm-project/pull/66514

[clang] [llvm] [RISCV] Add support for new unprivileged extensions defined in profiles spec (PR #77458)

2024-01-18 Thread Luke Lau via cfe-commits


https://github.com/lukel97 updated 
https://github.com/llvm/llvm-project/pull/77458

>From fb8eebe1c7f5b4dec812c64d9a2572a98d59bdb8 Mon Sep 17 00:00:00 2001
From: Luke Lau 
Date: Tue, 9 Jan 2024 19:42:10 +0700
Subject: [PATCH 1/7] [RISCV] Add support for new unprivileged extensions
 defined in profiles spec

This adds minimal support for 7 new extensions that were defined as a part of
the RISC-V Profiles specification here:
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#7-new-isa-extensions

As stated in the specification, these extensions don't add any new features but
describe existing features. So this patch only adds parsing and subtarget
features.
---
 llvm/lib/Support/RISCVISAInfo.cpp  |  7 +++
 llvm/lib/Target/RISCV/RISCVFeatures.td | 26 ++
 llvm/test/CodeGen/RISCV/attributes.ll  | 14 ++
 3 files changed, 47 insertions(+)

diff --git a/llvm/lib/Support/RISCVISAInfo.cpp 
b/llvm/lib/Support/RISCVISAInfo.cpp
index d991878a5f1eca..8c9eb1bddb3cb5 100644
--- a/llvm/lib/Support/RISCVISAInfo.cpp
+++ b/llvm/lib/Support/RISCVISAInfo.cpp
@@ -88,6 +88,8 @@ static const RISCVSupportedExtension SupportedExtensions[] = {
 {"xtheadvdot", {1, 0}},
 {"xventanacondops", {1, 0}},
 
+{"za128rs", {1, 0}},
+{"za64rs", {1, 0}},
 {"zawrs", {1, 0}},
 
 {"zba", {1, 0}},
@@ -116,9 +118,14 @@ static const RISCVSupportedExtension SupportedExtensions[] 
= {
 {"zhinx", {1, 0}},
 {"zhinxmin", {1, 0}},
 
+{"zic64b", {1, 0}},
 {"zicbom", {1, 0}},
 {"zicbop", {1, 0}},
 {"zicboz", {1, 0}},
+{"ziccamoa", {1, 0}},
+{"ziccif", {1, 0}},
+{"zicclsm", {1, 0}},
+{"ziccrse", {1, 0}},
 {"zicntr", {2, 0}},
 {"zicsr", {2, 0}},
 {"zifencei", {2, 0}},
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td 
b/llvm/lib/Target/RISCV/RISCVFeatures.td
index fa334c69ddc982..1946f2253fa6c0 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -93,6 +93,22 @@ def HasStdExtZifencei : 
Predicate<"Subtarget->hasStdExtZifencei()">,
AssemblerPredicate<(all_of 
FeatureStdExtZifencei),
"'Zifencei' (fence.i)">;
 
+def FeatureStdExtZiccamoa
+: SubtargetFeature<"ziccamoa", "HasStdExtZiccamoa", "true",
+   "'Ziccamoa' (Main Memory Supports All Atomics in A)">;
+
+def FeatureStdExtZiccif
+: SubtargetFeature<"ziccif", "HasStdExtZiccif", "true",
+   "'Ziccif' (Main Memory Supports Instruction Fetch with 
Atomicity Requirement)">;
+
+def FeatureStdExtZicclsm
+: SubtargetFeature<"zicclsm", "HasStdExtZicclsm", "true",
+   "'Zicclsm' (Main Memory Supports Misaligned 
Loads/Stores)">;
+
+def FeatureStdExtZiccrse
+: SubtargetFeature<"ziccrse", "HasStdExtZiccrse", "true",
+   "'Ziccrse' (Main Memory Supports Forward Progress on 
LR/SC Sequences)">;
+
 def FeatureStdExtZicntr
 : SubtargetFeature<"zicntr", "HasStdExtZicntr", "true",
"'Zicntr' (Base Counters and Timers)",
@@ -517,6 +533,10 @@ def HasStdExtZfhOrZvfh
"'Zfh' (Half-Precision Floating-Point) or "
"'Zvfh' (Vector Half-Precision 
Floating-Point)">;
 
+def FeatureStdExtZic64b
+: SubtargetFeature<"zic64b", "HasStdExtZic64b", "true",
+   "'Zic64b' (Cache Block Size Is 64 Bytes)">;
+
 def FeatureStdExtZicbom
 : SubtargetFeature<"zicbom", "HasStdExtZicbom", "true",
"'Zicbom' (Cache-Block Management Instructions)">;
@@ -561,6 +581,12 @@ def HasStdExtZtso : 
Predicate<"Subtarget->hasStdExtZtso()">,
   "'Ztso' (Memory Model - Total Store Order)">;
 def NotHasStdExtZtso : Predicate<"!Subtarget->hasStdExtZtso()">;
 
+def FeatureStdExtZa64rs : SubtargetFeature<"za64rs", "HasStdExtZa64rs", "true",
+                                           "'Za64rs' (Reservation Set Size of at Most 64 Bytes)">;
+
+def FeatureStdExtZa128rs : SubtargetFeature<"za128rs", "HasStdExtZa128rs", 
"true",
+"'Za128rs' (Reservation Set Size 
of at Most 128 Bytes)">;
+
 def FeatureStdExtZawrs : SubtargetFeature<"zawrs", "HasStdExtZawrs", "true",
   "'Zawrs' (Wait on Reservation Set)">;
 def HasStdExtZawrs : Predicate<"Subtarget->hasStdExtZawrs()">,
diff --git a/llvm/test/CodeGen/RISCV/attributes.ll 
b/llvm/test/CodeGen/RISCV/attributes.ll
index 60ef404ac345d1..3e55e0fb4e6861 100644
--- a/llvm/test/CodeGen/RISCV/attributes.ll
+++ b/llvm/test/CodeGen/RISCV/attributes.ll
@@ -130,6 +130,7 @@
 ; RUN: llc -mtriple=riscv64 -mattr=+zkn,+zkr,+zkt %s -o - | FileCheck 
--check-prefixes=CHECK,RV64COMBINEINTOZK %s
 ; RUN: llc -mtriple=riscv64 -mattr=+zbkb,+zbkc,+zbkx,+zkne,+zknd,+zknh %s -o - 
| FileCheck --check-prefixes=CHECK,RV64COMBINEINTOZKN %s
 ; RUN: llc -mtri

[llvm] [clang] [flang] [compiler-rt] [clang-tools-extra] [Flang][OpenMP] Push genEval closer to leaf lowering functions (PR #77760)

2024-01-18 Thread Krzysztof Parzyszek via cfe-commits


kparzysz wrote:

Ping.  Are there any unanswered questions or concerns about this change?

https://github.com/llvm/llvm-project/pull/77760

[clang] [Clang] Implement CWG2598: Union of non-literal types (PR #78195)

2024-01-18 Thread via cfe-commits


cor3ntin wrote:

Looking at that again, I suspect I did not handle indirect fields properly. 
I'll try to get to that. Sorry!

https://github.com/llvm/llvm-project/pull/78195

[flang] [compiler-rt] [clang] [llvm] [clang-tools-extra] [Flang][OpenMP] Push genEval closer to leaf lowering functions (PR #77760)

2024-01-18 Thread Kiran Chandramohan via cfe-commits


https://github.com/kiranchandramohan approved this pull request.

LG.

https://github.com/llvm/llvm-project/pull/77760

Re: [clang] d5000e9 - rename to 'try' isntead of 'Try'x

2024-01-18 Thread Erich Keane via cfe-commits

Yeah, something weird happened with this one that I haven't figured out yet. 
This was a response to a review comment, and was on the review (and committed 
as part of that review!), but somehow I managed to 'recommit' it; not sure how.

So yeah 🙁  I'm usually better at commit messages, but I didn't intend this to 
be anything but one of many commits squashed on the PR.

From: David Blaikie 
Sent: Wednesday, January 17, 2024 3:32 PM
To: Erich Keane ; llvmlist...@llvm.org 
Cc: cfe-commits@lists.llvm.org 
Subject: Re: [clang] d5000e9 - rename to 'try' isntead of 'Try'x


Be good to have more description of "why" in commit messages in general (the 
"what" is provided by the patch itself, especially when it's a small one like 
this) - I guess this was to match naming conventions, that this file generally 
already follows lower-first, etc.

On Tue, Jan 16, 2024 at 7:04 AM via cfe-commits <cfe-commits@lists.llvm.org> wrote:

Author: erichkeane
Date: 2024-01-16T07:04:28-08:00
New Revision: d5000e9cd95b720fc9082da6cdcdb2c865303dcf

URL: 
https://github.com/llvm/llvm-project/commit/d5000e9cd95b720fc9082da6cdcdb2c865303dcf
DIFF: 
https://github.com/llvm/llvm-project/commit/d5000e9cd95b720fc9082da6cdcdb2c865303dcf.diff

LOG: rename to 'try' isntead of 'Try'x

Added:


Modified:
clang/lib/Parse/ParseOpenACC.cpp

Removed:




diff  --git a/clang/lib/Parse/ParseOpenACC.cpp 
b/clang/lib/Parse/ParseOpenACC.cpp
index 018c61de4be369..a5a028e1c6a799 100644
--- a/clang/lib/Parse/ParseOpenACC.cpp
+++ b/clang/lib/Parse/ParseOpenACC.cpp
@@ -183,7 +183,7 @@ bool isTokenIdentifierOrKeyword(Parser &P, Token Tok) {
 /// Return 'true' if the special token was matched, false if no special token,
 /// or an invalid special token was found.
 template 
-bool TryParseAndConsumeSpecialTokenKind(Parser &P, OpenACCSpecialTokenKind 
Kind,
+bool tryParseAndConsumeSpecialTokenKind(Parser &P, OpenACCSpecialTokenKind 
Kind,
 DirOrClauseTy DirOrClause) {
   Token IdentTok = P.getCurToken();
   // If this is an identifier-like thing followed by ':', it is one of the
@@ -713,7 +713,7 @@ void Parser::ParseOpenACCCacheVarList() {
   // The VarList is an optional `readonly:` followed by a list of a variable
   // specifications. Consume something that looks like a 'tag', and diagnose if
   // it isn't 'readonly'.
-  if (TryParseAndConsumeSpecialTokenKind(*this,
+  if (tryParseAndConsumeSpecialTokenKind(*this,
  OpenACCSpecialTokenKind::ReadOnly,
  OpenACCDirectiveKind::Cache)) {
 // FIXME: Record that this is a 'readonly' so that we can use that during




[clang-tools-extra] [clang] [llvm] [AMDGPU] Update uses of new VOP2 pseudos for GFX12 (PR #78155)

2024-01-18 Thread Jay Foad via cfe-commits



@@ -1,7 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck 
--check-prefixes=SI %s

jayfoad wrote:

Done as part of a merge from main to fix conflicts.

https://github.com/llvm/llvm-project/pull/78155

[clang] [clang][Interp] Add an EvaluationResult class (PR #71315)

2024-01-18 Thread Timm Baeder via cfe-commits



@@ -54,36 +44,90 @@ bool Context::isPotentialConstantExpr(State &Parent, const 
FunctionDecl *FD) {
 bool Context::evaluateAsRValue(State &Parent, const Expr *E, APValue &Result) {
   assert(Stk.empty());
   ByteCodeExprGen C(*this, *P, Parent, Stk, Result);
-  if (Check(Parent, C.interpretExpr(E))) {
-assert(Stk.empty());
-#ifndef NDEBUG
-// Make sure we don't rely on some value being still alive in
-// InterpStack memory.
+
+  auto Res = C.interpretExpr(E);
+
+  if (Res.isInvalid()) {
 Stk.clear();
+return false;
+  }
+
+  assert(Stk.empty());
+#ifndef NDEBUG
+  // Make sure we don't rely on some value being still alive in
+  // InterpStack memory.
+  Stk.clear();
 #endif
-return true;
+
+  // Implicit lvalue-to-rvalue conversion.
+  if (E->isGLValue()) {
+std::optional RValueResult = Res.toRValue();
+if (!RValueResult) {
+  return false;
+}
+Result = *RValueResult;
+  } else {
+Result = Res.toAPValue();
   }
 
+  return true;
+}
+
+bool Context::evaluate(State &Parent, const Expr *E, APValue &Result) {
+  assert(Stk.empty());
+  ByteCodeExprGen C(*this, *P, Parent, Stk, Result);
+
+  auto Res = C.interpretExpr(E);
+  if (Res.isInvalid()) {
+Stk.clear();
+return false;
+  }
+
+  assert(Stk.empty());
+#ifndef NDEBUG
+  // Make sure we don't rely on some value being still alive in
+  // InterpStack memory.
   Stk.clear();
-  return false;
+#endif
+  Result = Res.toAPValue();
+  return true;
 }
 
 bool Context::evaluateAsInitializer(State &Parent, const VarDecl *VD,
 APValue &Result) {
   assert(Stk.empty());
   ByteCodeExprGen C(*this, *P, Parent, Stk, Result);
-  if (Check(Parent, C.interpretDecl(VD))) {
-assert(Stk.empty());
-#ifndef NDEBUG
-// Make sure we don't rely on some value being still alive in
-// InterpStack memory.
+
+  auto Res = C.interpretDecl(VD);
+  if (Res.isInvalid()) {
 Stk.clear();
-#endif
-return true;
+return false;
   }
 
+  assert(Stk.empty());
+#ifndef NDEBUG
+  // Make sure we don't rely on some value being still alive in
+  // InterpStack memory.
   Stk.clear();
-  return false;
+#endif
+
+  // Ensure global variables are fully initialized.
+  if (shouldBeGloballyIndexed(VD) && !Res.isInvalid() &&
+  (VD->getType()->isRecordType() || VD->getType()->isArrayType())) {
+assert(Res.isLValue());
+
+if (!Res.checkFullyInitialized(C.getState()))
+  return false;
+
+// lvalue-to-rvalue conversion.
+std::optional RValueResult = Res.toRValue();
+if (!RValueResult)
+  return false;
+Result = *RValueResult;

tbaederr wrote:

The two versions are slightly different: in `evaluateAsInitializer()`, we always 
do the conversion for global variables, but in `evaluateAsRValue()` we do it 
only for glvalues. I could make the `EvaluationResult` API a little easier to 
use for the cases in `Context` though, i.e. `bool toRValue(APValue &Result)` 
instead of returning the `std::optional`.
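
The two API shapes being weighed can be compared side by side. This is only a sketch with hypothetical stand-ins: `Val` plays the role of the result value type and `EvalResult` the role of the result wrapper; neither reflects the actual class in the PR.

```cpp
#include <optional>

// Hypothetical stand-in for the value type produced by evaluation.
struct Val {
  int Data;
};

// Hypothetical stand-in for the result wrapper, offering both API shapes.
struct EvalResult {
  std::optional<Val> Stored;

  // Shape 1: return an optional; the caller checks it and then copies.
  std::optional<Val> toRValue() const { return Stored; }

  // Shape 2: write into an out-parameter and report success, so a caller can
  // write `return Res.toRValue(Result);` directly.
  bool toRValue(Val &Result) const {
    if (!Stored)
      return false; // Result is left untouched on failure.
    Result = *Stored;
    return true;
  }
};
```

The out-parameter form saves the check-and-copy boilerplate at each call site, at the cost of requiring the caller to have a destination object up front.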

https://github.com/llvm/llvm-project/pull/71315
