[llvm-branch-commits] [clang-tools-extra] 97153f5 - [clang-tidy] Update docs for bugprone-use-after-move

2021-02-19 Thread via llvm-branch-commits

Author: martinboehme
Date: 2021-02-19T11:49:25+01:00
New Revision: 97153f5439605fb8d4fccc79fefc627bc825068c

URL: 
https://github.com/llvm/llvm-project/commit/97153f5439605fb8d4fccc79fefc627bc825068c
DIFF: 
https://github.com/llvm/llvm-project/commit/97153f5439605fb8d4fccc79fefc627bc825068c.diff

LOG: [clang-tidy] Update docs for bugprone-use-after-move

- Create a separate section on silencing erroneous warnings and add more 
material to it
- Add note that the check is flow-sensitive but not path-sensitive
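
A minimal, self-contained sketch of the silencing idiom the updated docs describe. The wrapper function `demo` and its `messages` parameter are hypothetical and exist only to make the snippet compilable; the `IS_INITIALIZED` helper and the two mutually exclusive branches follow the doc text in the diff below.

```cpp
#include <string>
#include <utility>
#include <vector>

// Dummy function from the docs: taking the object by non-const reference is
// what makes the check assume a reinitialization. It does nothing at run time.
template <class T>
void IS_INITIALIZED(T&) {}

// Hypothetical wrapper (not part of the commit) around the doc example.
std::string demo(int i, std::vector<std::string>& messages) {
  std::string str = "Greetings, stranger!";
  std::string printed;
  if (i == 1) {
    messages.emplace_back(std::move(str));  // the move
  }
  if (i == 2) {
    IS_INITIALIZED(str);  // silences the warning; str was never moved here
    printed = str;        // the use
  }
  return printed;
}
```

Because the analysis is not path-sensitive, it cannot see that `i == 1` and `i == 2` exclude each other; the dummy call only changes what the check assumes, not what the program does.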

Added: 


Modified: 
clang-tools-extra/docs/clang-tidy/checks/bugprone-use-after-move.rst

Removed: 




diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone-use-after-move.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone-use-after-move.rst
index 9fde912837d8..aab7cfd0ccd4 100644
--- a/clang-tools-extra/docs/clang-tidy/checks/bugprone-use-after-move.rst
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone-use-after-move.rst
@@ -24,6 +24,9 @@ move and before the use. For example, no warning will be output for this code:
 str = "Greetings, stranger!\n";
 std::cout << str;
 
+Subsections below explain more precisely what exactly the check considers to be
+a move, use, and reinitialization.
+
 The check takes control flow into account. A warning is only emitted if the use
 can be reached from the move. This means that the following code does not
 produce a warning:
@@ -60,7 +63,12 @@ mutually exclusive. For example (assuming that ``i`` is an int):
 }
 
 In this case, the check will erroneously produce a warning, even though it is
-not possible for both the move and the use to be executed.
+not possible for both the move and the use to be executed. More formally, the
+analysis is `flow-sensitive but not path-sensitive
+`_.
+
+Silencing erroneous warnings
+----------------------------
 
 An erroneous warning can be silenced by reinitializing the object after the
 move:
@@ -75,8 +83,30 @@ move:
   std::cout << str;
 }
 
-Subsections below explain more precisely what exactly the check considers to be
-a move, use, and reinitialization.
+If you want to avoid the overhead of actually reinitializing the object, you can
+create a dummy function that causes the check to assume the object was
+reinitialized:
+
+.. code-block:: c++
+
+template <class T>
+void IS_INITIALIZED(T&) {}
+
+You can use this as follows:
+
+.. code-block:: c++
+
+if (i == 1) {
+  messages.emplace_back(std::move(str));
+}
+if (i == 2) {
+  IS_INITIALIZED(str);
+  std::cout << str;
+}
+
+The check will not output a warning in this case because passing the object to a
+function as a non-const pointer or reference counts as a reinitialization (see section
+`Reinitialization`_ below).
 
 Unsequenced moves, uses, and reinitializations
 ----------------------------------------------



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] dda7ef0 - [PowerPC] Update release notes for changes to PowerPC for V12.0

2021-02-19 Thread Lei Huang via llvm-branch-commits

Author: Lei Huang
Date: 2021-02-19T19:24:05Z
New Revision: dda7ef025bc66ea326f5a8bda8c5b8534d21c2dd

URL: 
https://github.com/llvm/llvm-project/commit/dda7ef025bc66ea326f5a8bda8c5b8534d21c2dd
DIFF: 
https://github.com/llvm/llvm-project/commit/dda7ef025bc66ea326f5a8bda8c5b8534d21c2dd.diff

LOG: [PowerPC] Update release notes for changes to PowerPC for V12.0

Added: 


Modified: 
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index c1bda3339a9e..542a505bfd2e 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -129,7 +129,75 @@ During this release ...
 Changes to the PowerPC Target
 -----------------------------
 
-During this release ...
+Optimization:
+
+* Made improvements to loop unroll-and-jam, including a fix to respect
+  user-provided #pragma unroll-and-jam for loops on targets other than ARM.
+* Improved PartialInliner, allowing it to handle code regions in switch
+  statements.
+* Improved PGO support on AIX by building and linking with the compiler-rt
+  profile library.
+* Added support for Epilogue Vectorization and enabled it by default.
+
+CodeGen:
+
+* POWER10 support
+  * Implementation of PC Relative addressing in LLD including the associated
+    linker optimizations.
+  * Add support for the new matrix multiplication (MMA) instructions to Clang
+    and LLVM.
+  * Implementation of Power10 builtins.
+
+* Scheduling enhancements
+  * Add a new algorithm to cluster more loads/stores if the DAG is not too
+    complicated.
+  * Enable the PowerPC scheduling heuristic for Power10.
+
+* Target dependent passes tuning
+  * Enhance LoopStrengthReduce/PPCLoopInstrFormPrep pass for PowerPC,
+    especially for P10 intrinsics.
+  * Enhance machine combiner pass to reduce register pressure for PowerPC.
+  * Improve MachineSink to do more sinking based on register pressure and alias
+    analysis.
+
+* General improvements
+  * Complete the constrained floating point operations support.
+  * Improve the llvm-exegesis support.
+  * Improve the stack clash protection to probe the gap between stackptr and
+    realigned stackptr.
+  * Improve the IEEE long double support for Power8.
+  * Enable MemorySSA for LoopSink.
+  * Enhance LLVM debugging functionality via options such as -print-changed and
+    -print-before-changed.
+  * Add builtins for Power9 (i.e. darn, xvtdiv, xvtsqrt etc).
+  * Add options to disable all or part of LoopIdiomRecognizePass.
+  * Add support for printing the DDG in DOT form allowing for visual inspection
+    of the Data Dependence Graph.
+  * Remove the QPX support.
+  * Significant number of bug fixes including all the fixes necessary to
+    achieve a clean test run for Julia.
+
+AIX Support:
+
+* Compiler-rt support
+  * Add support for building compiler-rt for AIX and 32-bit Power targets.
+  * Made compiler-rt the default rtlib for AIX.
+
+* General Improvements
+  * Enable the AIX extended AltiVec ABI under option -mabi=vec-extabi.
+  * Add partial C99 complex type support.
+  * Implement traceback table for functions (encodes vector information,
+    emits exception handling).
+  * Implement code generation for C++ dynamic initialization and finalization
+    of non-local variables for use with the -bcdtors option of the AIX linker.
+  * Add new option -mignore-xcoff-visibility.
+  * Enable explicit sections on AIX.
+  * Enable -f[no-]data-sections on AIX and set -fdata-sections to be the default
+    on AIX.
+  * Enable -f[no-]function-sections.
+  * Add support for relocation generation using the large code model.
+  * Add pragma align natural and sorted out pragma pack stack effect.
+
 
 Changes to the X86 Target
 -------------------------





[llvm-branch-commits] [llvm] ac5cc50 - [SCEV] Improve handling of pointer compares involving subtractions.

2021-02-19 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2021-02-19T20:33:42Z
New Revision: ac5cc50e2598878abbeec0422ace51a020ba75c4

URL: 
https://github.com/llvm/llvm-project/commit/ac5cc50e2598878abbeec0422ace51a020ba75c4
DIFF: 
https://github.com/llvm/llvm-project/commit/ac5cc50e2598878abbeec0422ace51a020ba75c4.diff

LOG: [SCEV] Improve handling of pointer compares involving subtractions.

This patch improves handling of pointer comparisons involving
subtractions, if an offset is known to be positive.
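
The idiom behind this can be checked by brute force on a small bit width. The sketch below is only an illustration, not part of the patch: the function name `subIdiomHoldsFor8Bit` is hypothetical, and it assumes the precondition stated in the patch comment on isKnownPredicateViaSubIdiom (X is negative and abs(X) <= Y), modelled here as "-X is positive and Y >= -X in the signed sense" on 8-bit values.

```cpp
// Exhaustively checks the idiom on 8-bit values: if -X is positive and
// Y >= -X (signed), then X + Y <= Y holds as an unsigned comparison.
bool subIdiomHoldsFor8Bit() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      // 8-bit negation: -(-128) wraps back to -128 and is not positive.
      int negX = (x == -128) ? -128 : -x;
      bool precondition = negX > 0 && y >= negX;
      if (!precondition)
        continue;
      // Reduce both sides to their 8-bit unsigned representation.
      unsigned sum = static_cast<unsigned>(x + y) & 0xFFu;
      unsigned yUnsigned = static_cast<unsigned>(y) & 0xFFu;
      if (sum > yUnsigned)
        return false;  // counterexample found
    }
  }
  return true;  // no counterexample in the 8-bit space
}
```

Under the precondition, X + Y equals Y minus a positive amount no larger than Y, so the sum cannot wrap and stays below Y. The Alive2 links in the log are the actual proofs; this loop is only a sanity check.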

Proof for isKnownPredicateSubIdiom: https://alive2.llvm.org/ce/z/Gfe8mS

Proof for getUDivExpr extension: https://alive2.llvm.org/ce/z/H_G2Q0

Added: 


Modified: 
llvm/include/llvm/Analysis/ScalarEvolution.h
llvm/lib/Analysis/ScalarEvolution.cpp
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp

Removed: 




diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index c35c1db7dfe0..1c9f9b36c94c 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -1876,6 +1876,11 @@ class ScalarEvolution {
   bool isKnownPredicateViaConstantRanges(ICmpInst::Predicate Pred,
  const SCEV *LHS, const SCEV *RHS);
 
+  /// Test if the given expression is known to satisfy the condition described
+  /// by Pred by decomposing a subtraction.
+  bool isKnownPredicateViaSubIdiom(ICmpInst::Predicate Pred, const SCEV *LHS,
+   const SCEV *RHS);
+
   /// Try to prove the condition described by "LHS Pred RHS" by ruling out
   /// integer overflow.
   ///

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index b8d55e6eb68a..d00bd8a0e2ab 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -1833,6 +1833,13 @@ ScalarEvolution::getZeroExtendExpr(const SCEV *Op, Type *Ty, unsigned Depth) {
   }
   }
 
+  if (auto *SM = dyn_cast(Op)) {
+if (isa(SM->getOperand(0)) ||
+isa(SM->getOperand(1)))
+  return getUMaxExpr(getZeroExtendExpr(SM->getOperand(0), Ty),
+ getZeroExtendExpr(SM->getOperand(1), Ty));
+  }
+
   // The cast wasn't folded; create an explicit cast node.
   // Recompute the insert position, as it may have been invalidated.
   if (const SCEV *S = UniqueSCEVs.FindNodeOrInsertPos(ID, IP)) return S;
@@ -3104,6 +3111,20 @@ const SCEV *ScalarEvolution::getUDivExpr(const SCEV *LHS,
  getEffectiveSCEVType(RHS->getType()) &&
  "SCEVUDivExpr operand types don't match!");
 
+  const SCEVAddExpr *Add = dyn_cast<SCEVAddExpr>(LHS);
+  const SCEVConstant *C = dyn_cast<SCEVConstant>(RHS);
+  if (Add && C && Add->getNumOperands() == 2) {
+    unsigned MultTrailing = C->getAPInt().countTrailingZeros();
+    auto *NegOp0 = getNegativeSCEV(Add->getOperand(0));
+    if (GetMinTrailingZeros(Add->getOperand(0)) >= MultTrailing &&
+        GetMinTrailingZeros(Add->getOperand(1)) >= MultTrailing &&
+        isKnownPositive(NegOp0) &&
+        isKnownPredicate(CmpInst::ICMP_SGE, Add->getOperand(1), NegOp0)) {
+      return getMinusSCEV(getUDivExactExpr(Add->getOperand(1), RHS),
+                          getUDivExactExpr(NegOp0, RHS));
+    }
+  }
+
   FoldingSetNodeID ID;
   ID.AddInteger(scUDivExpr);
   ID.AddPointer(LHS);
@@ -9113,7 +9134,6 @@ ScalarEvolution::howFarToZero(const SCEV *V, const Loop *L, bool ControlsExit,
   // First compute the unsigned distance from zero in the direction of Step.
   bool CountDown = StepC->getAPInt().isNegative();
   const SCEV *Distance = CountDown ? Start : getNegativeSCEV(Start);
-
   // Handle unitary steps, which cannot wraparound.
   // 1*N = -Start; -1*N = Start (mod 2^BW), so:
   //   N = Distance (as unsigned)
@@ -10095,7 +10115,10 @@ bool ScalarEvolution::isLoopEntryGuardedByCond(const Loop *L,
  "LHS is not available at Loop Entry");
   assert(isAvailableAtLoopEntry(RHS, L) &&
  "RHS is not available at Loop Entry");
-  return isBasicBlockEntryGuardedByCond(L->getHeader(), Pred, LHS, RHS);
+  if (isBasicBlockEntryGuardedByCond(L->getHeader(), Pred, LHS, RHS))
+return true;
+  return isBasicBlockEntryGuardedByCond(
+  L->getHeader(), Pred, applyLoopGuards(LHS, L), applyLoopGuards(RHS, L));
 }
 
 bool ScalarEvolution::isImpliedCond(ICmpInst::Predicate Pred, const SCEV *LHS,
@@ -10934,10 +10957,28 @@ static bool isKnownPredicateExtendIdiom(ICmpInst::Predicate Pred,
   return false;
 }
 
+bool ScalarEvolution::isKnownPredicateViaSubIdiom(ICmpInst::Predicate Pred,
+  const SCEV *LHS,
+  const SCEV *RHS) {
+  // Handle X + Y <= Y, if X is negative and abs(X) <= Y. In that case, the
+  // expression won't wrap in the unsigned sense.
+  auto *Add = dyn_cast(LHS);
+  if (Add && Pred == CmpInst::ICMP_ULE) {
+auto *X = Add

[llvm-branch-commits] [llvm] c2a0b08 - [DCE] Add tests for non-willreturn function being removed (NFC)

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-02-19T16:32:05-08:00
New Revision: c2a0b0810a40199ec94c90539b601ba72bcb3523

URL: 
https://github.com/llvm/llvm-project/commit/c2a0b0810a40199ec94c90539b601ba72bcb3523
DIFF: 
https://github.com/llvm/llvm-project/commit/c2a0b0810a40199ec94c90539b601ba72bcb3523.diff

LOG: [DCE] Add tests for non-willreturn function being removed (NFC)

(cherry picked from commit 4045ad6b0ccd35fe990d51b9bfdd9e7de109bdf5)

Added: 
llvm/test/Transforms/ADCE/willreturn.ll
llvm/test/Transforms/BDCE/willreturn.ll

Modified: 


Removed: 




diff --git a/llvm/test/Transforms/ADCE/willreturn.ll b/llvm/test/Transforms/ADCE/willreturn.ll
new file mode 100644
index 000000000000..c3482a417cb0
--- /dev/null
+++ b/llvm/test/Transforms/ADCE/willreturn.ll
@@ -0,0 +1,17 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -adce -S < %s | FileCheck %s
+
+declare void @may_not_return(i32) nounwind readnone
+declare void @will_return(i32) nounwind readnone willreturn
+
+; FIXME: This is a miscompile.
+define void @test(i32 %a) {
+; CHECK-LABEL: @test(
+; CHECK-NEXT:ret void
+;
+  %b = add i32 %a, 1
+  call void @may_not_return(i32 %b)
+  %c = add i32 %b, 1
+  call void @will_return(i32 %c)
+  ret void
+}

diff --git a/llvm/test/Transforms/BDCE/willreturn.ll b/llvm/test/Transforms/BDCE/willreturn.ll
new file mode 100644
index 000000000000..b87ab0050e7a
--- /dev/null
+++ b/llvm/test/Transforms/BDCE/willreturn.ll
@@ -0,0 +1,17 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -bdce -S < %s | FileCheck %s
+
+declare void @may_not_return(i32) nounwind readnone
+declare void @will_return(i32) nounwind readnone willreturn
+
+; FIXME: This is a miscompile.
+define void @test(i32 %a) {
+; CHECK-LABEL: @test(
+; CHECK-NEXT:ret void
+;
+  %b = add i32 %a, 1
+  call void @may_not_return(i32 %b)
+  %c = add i32 %b, 1
+  call void @will_return(i32 %c)
+  ret void
+}





[llvm-branch-commits] [llvm] d1d7dc7 - [IR] Move willReturn() to Instruction

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-02-19T16:32:06-08:00
New Revision: d1d7dc779a296001568d855bba7843a9eb94a585

URL: 
https://github.com/llvm/llvm-project/commit/d1d7dc779a296001568d855bba7843a9eb94a585
DIFF: 
https://github.com/llvm/llvm-project/commit/d1d7dc779a296001568d855bba7843a9eb94a585.diff

LOG: [IR] Move willReturn() to Instruction

This moves the willReturn() helper from CallBase to Instruction,
so that it can be used in a more generic manner. This will make
it easier to fix additional passes (ADCE and BDCE), and will give
us one place to change if additional instructions should become
non-willreturn (e.g. there has been talk about handling volatile
operations this way).

I have also included the IntrinsicInst workaround directly in
here, so that it gets applied consistently. (As such this change
is not entirely NFC -- FuncAttrs will now use this as well.)

Differential Revision: https://reviews.llvm.org/D96992

(cherry picked from commit 370addb996138a9e3634899cf264c7621307617a)
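
The logic described above can be sketched as a toy model. This is not LLVM code: `ToyInst` and its boolean fields are hypothetical stand-ins for queries such as hasFnAttr(Attribute::WillReturn), onlyReadsMemory() and mayThrow(), and the early exits for instructions without a successor in the real isGuaranteedToTransferExecutionToSuccessor are left out.

```cpp
// Toy stand-in for an instruction; every field is an assumption made for
// illustration and does not correspond to a real LLVM class layout.
struct ToyInst {
  bool IsCall = false;             // models "is a CallBase"
  bool HasWillReturnAttr = false;  // models the willreturn attribute
  bool IsIntrinsic = false;        // models "is an IntrinsicInst"
  bool OnlyReadsMemory = false;    // models onlyReadsMemory()
  bool MayThrow = false;           // models mayThrow()
};

// Mirrors the shape of the new Instruction::willReturn(): only calls can
// fail to return, and side-effect free intrinsics are temporarily trusted.
bool willReturn(const ToyInst &I) {
  if (I.IsCall)
    return I.HasWillReturnAttr || (I.IsIntrinsic && I.OnlyReadsMemory);
  return true;
}

// Mirrors the simplified tail of isGuaranteedToTransferExecutionToSuccessor.
bool transfersToSuccessor(const ToyInst &I) {
  return !I.MayThrow && willReturn(I);
}
```

Keeping the intrinsic workaround inside the single helper is what lets every caller, FuncAttrs included, pick it up consistently.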

Added: 


Modified: 
llvm/include/llvm/IR/InstrTypes.h
llvm/include/llvm/IR/Instruction.h
llvm/lib/Analysis/ValueTracking.cpp
llvm/lib/IR/Instruction.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
llvm/lib/Transforms/Utils/Local.cpp

Removed: 




diff --git a/llvm/include/llvm/IR/InstrTypes.h b/llvm/include/llvm/IR/InstrTypes.h
index f42ef48de6b3..955ac8e537fe 100644
--- a/llvm/include/llvm/IR/InstrTypes.h
+++ b/llvm/include/llvm/IR/InstrTypes.h
@@ -1757,9 +1757,6 @@ class CallBase : public Instruction {
 return doesNotAccessMemory() || hasFnAttr(Attribute::ReadOnly);
   }
 
-  /// Returns true if this function is guaranteed to return.
-  bool willReturn() const { return hasFnAttr(Attribute::WillReturn); }
-
   void setOnlyReadsMemory() {
 addAttribute(AttributeList::FunctionIndex, Attribute::ReadOnly);
   }

diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index d2a55f89fac9..85afaed5225e 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -633,6 +633,10 @@ class Instruction : public User,
   /// generated program.
   bool isSafeToRemove() const;
 
+  /// Return true if the instruction will return (unwinding is considered as
+  /// a form of returning control flow here).
+  bool willReturn() const;
+
   /// Return true if the instruction is a variety of EH-block.
   bool isEHPad() const {
 switch (getOpcode()) {

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 5600a3b33750..e174c5efe424 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5018,36 +5018,14 @@ bool llvm::isGuaranteedToTransferExecutionToSuccessor(const Instruction *I) {
   // arbitrary length of time, but programs aren't allowed to rely on that.
 
   // If there is no successor, then execution can't transfer to it.
-  if (const auto *CRI = dyn_cast(I))
-return !CRI->unwindsToCaller();
-  if (const auto *CatchSwitch = dyn_cast(I))
-return !CatchSwitch->unwindsToCaller();
-  if (isa(I))
-return false;
   if (isa(I))
 return false;
   if (isa(I))
 return false;
 
-  // Calls can throw, or contain an infinite loop, or kill the process.
-  if (const auto *CB = dyn_cast(I)) {
-// Call sites that throw have implicit non-local control flow.
-if (!CB->doesNotThrow())
-  return false;
-
-// A function which doens't throw and has "willreturn" attribute will
-// always return.
-if (CB->hasFnAttr(Attribute::WillReturn))
-  return true;
-
-// FIXME: Temporarily assume that all side-effect free intrinsics will
-// return. Remove this workaround once all intrinsics are appropriately
-// annotated.
-return isa(CB) && CB->onlyReadsMemory();
-  }
-
-  // Other instructions return normally.
-  return true;
+  // An instruction that returns without throwing must transfer control flow
+  // to a successor.
+  return !I->mayThrow() && I->willReturn();
 }
 
 bool llvm::isGuaranteedToTransferExecutionToSuccessor(const BasicBlock *BB) {

diff  --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 1e3fcd672a43..246180e72172 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -633,6 +633,16 @@ bool Instruction::isSafeToRemove() const {
  !this->isTerminator();
 }
 
+bool Instruction::willReturn() const {
+  if (const auto *CB = dyn_cast<CallBase>(this))
+    // FIXME: Temporarily assume that all side-effect free intrinsics will
+    // return. Remove this workaround once all intrinsics are appropriately
+    // annotated.
+    return CB->hasFnAttr(Attribute::WillReturn) ||
+           (isa<IntrinsicInst>(CB) && CB->onlyReadsMemory());
+  return true;
+}
+
 bool Instruction::isLifetimeStartOrEnd() const {
   auto II = dyn_cast(this);
   if (!II)

diff  --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp 
b/l

[llvm-branch-commits] [llvm] 8e9c2ad - [DCE] Don't remove non-willreturn calls

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-02-19T16:32:07-08:00
New Revision: 8e9c2ad95eb5ab439b933d8c793957bc4d82e456

URL: 
https://github.com/llvm/llvm-project/commit/8e9c2ad95eb5ab439b933d8c793957bc4d82e456
DIFF: 
https://github.com/llvm/llvm-project/commit/8e9c2ad95eb5ab439b933d8c793957bc4d82e456.diff

LOG: [DCE] Don't remove non-willreturn calls

In both ADCE and BDCE (via DemandedBits) we should not remove
instructions that are not guaranteed to return. This issue was
pointed out by fhahn in the recent llvm-dev thread.

Differential Revision: https://reviews.llvm.org/D96993

(cherry picked from commit 2f17ed294fcd8cde505b93c9c5bbab06ba59051c)
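
A toy model of the liveness rule after this change, mirroring the ADCE/BDCE willreturn.ll test case. Everything here is hypothetical illustration rather than LLVM code: `ToyDceInst`, its fields, and `survivingInstructions` are made-up names, the `HasLiveUser` flag is filled in by hand instead of being computed by a real liveness propagation, and the other conditions of the real isAlwaysLive (such as EH pads) are omitted.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction in a dead-code-elimination pass.
struct ToyDceInst {
  std::string Name;
  bool IsTerminator;
  bool MayHaveSideEffects;
  bool WillReturn;
  bool HasLiveUser;  // assumed precomputed: some live instruction uses it
};

// After the patch, an instruction that may not return is a liveness root.
bool isAlwaysLive(const ToyDceInst &I) {
  return I.IsTerminator || I.MayHaveSideEffects || !I.WillReturn;
}

// Returns the names of the instructions that would be kept.
std::vector<std::string>
survivingInstructions(const std::vector<ToyDceInst> &Insts) {
  std::vector<std::string> Kept;
  for (const ToyDceInst &I : Insts)
    if (isAlwaysLive(I) || I.HasLiveUser)
      Kept.push_back(I.Name);
  return Kept;
}
```

With the test's five instructions, the call to @may_not_return and its operand %b survive, while %c and the call to @will_return are removed, which matches the updated CHECK lines.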

Added: 


Modified: 
llvm/lib/Analysis/DemandedBits.cpp
llvm/lib/Transforms/Scalar/ADCE.cpp
llvm/test/Feature/OperandBundles/adce.ll
llvm/test/LTO/X86/parallel.ll
llvm/test/Transforms/ADCE/dce_pure_call.ll
llvm/test/Transforms/ADCE/willreturn.ll
llvm/test/Transforms/BDCE/dce-pure.ll
llvm/test/Transforms/BDCE/dead-void-ro.ll
llvm/test/Transforms/BDCE/willreturn.ll
llvm/test/tools/gold/X86/parallel.ll

Removed: 




diff --git a/llvm/lib/Analysis/DemandedBits.cpp b/llvm/lib/Analysis/DemandedBits.cpp
index 461fd7239905..dd11b0b02bf8 100644
--- a/llvm/lib/Analysis/DemandedBits.cpp
+++ b/llvm/lib/Analysis/DemandedBits.cpp
@@ -80,7 +80,7 @@ void DemandedBitsWrapperPass::print(raw_ostream &OS, const Module *M) const {
 
 static bool isAlwaysLive(Instruction *I) {
   return I->isTerminator() || isa(I) || I->isEHPad() ||
- I->mayHaveSideEffects();
+ I->mayHaveSideEffects() || !I->willReturn();
 }
 
 void DemandedBits::determineLiveOperandBits(

diff --git a/llvm/lib/Transforms/Scalar/ADCE.cpp b/llvm/lib/Transforms/Scalar/ADCE.cpp
index 2b649732a799..ce4e5e575fbf 100644
--- a/llvm/lib/Transforms/Scalar/ADCE.cpp
+++ b/llvm/lib/Transforms/Scalar/ADCE.cpp
@@ -325,7 +325,7 @@ void AggressiveDeadCodeElimination::initialize() {
 
 bool AggressiveDeadCodeElimination::isAlwaysLive(Instruction &I) {
   // TODO -- use llvm::isInstructionTriviallyDead
-  if (I.isEHPad() || I.mayHaveSideEffects()) {
+  if (I.isEHPad() || I.mayHaveSideEffects() || !I.willReturn()) {
 // Skip any value profile instrumentation calls if they are
 // instrumenting constants.
 if (isInstrumentsConstant(I))

diff --git a/llvm/test/Feature/OperandBundles/adce.ll b/llvm/test/Feature/OperandBundles/adce.ll
index a729ba710689..fa4e045fdd1e 100644
--- a/llvm/test/Feature/OperandBundles/adce.ll
+++ b/llvm/test/Feature/OperandBundles/adce.ll
@@ -5,8 +5,8 @@
 ; bundles since the presence of unknown operand bundles implies
 ; arbitrary memory effects.
 
-declare void @readonly_function() readonly nounwind
-declare void @readnone_function() readnone nounwind
+declare void @readonly_function() readonly nounwind willreturn
+declare void @readnone_function() readnone nounwind willreturn
 
 define void @test0() {
 ; CHECK-LABEL: @test0(

diff  --git a/llvm/test/LTO/X86/parallel.ll b/llvm/test/LTO/X86/parallel.ll
index b3c128193821..34235ec0202b 100644
--- a/llvm/test/LTO/X86/parallel.ll
+++ b/llvm/test/LTO/X86/parallel.ll
@@ -11,7 +11,7 @@ target triple = "x86_64-unknown-linux-gnu"
 ; CHECK0-NOT: bar
 ; CHECK0: T foo
 ; CHECK0-NOT: bar
-define void @foo() {
+define void @foo() mustprogress {
   call void @bar()
   ret void
 }
@@ -19,7 +19,7 @@ define void @foo() {
 ; CHECK1-NOT: foo
 ; CHECK1: T bar
 ; CHECK1-NOT: foo
-define void @bar() {
+define void @bar() mustprogress {
   call void @foo()
   ret void
 }

diff --git a/llvm/test/Transforms/ADCE/dce_pure_call.ll b/llvm/test/Transforms/ADCE/dce_pure_call.ll
index 66483abbc919..88e92bf13f49 100644
--- a/llvm/test/Transforms/ADCE/dce_pure_call.ll
+++ b/llvm/test/Transforms/ADCE/dce_pure_call.ll
@@ -1,6 +1,6 @@
 ; RUN: opt -adce -S < %s | not grep call
 
-declare i32 @strlen(i8*) readonly nounwind
+declare i32 @strlen(i8*) readonly nounwind willreturn
 
 define void @test() {
call i32 @strlen( i8* null ); :1 [#uses=0]

diff --git a/llvm/test/Transforms/ADCE/willreturn.ll b/llvm/test/Transforms/ADCE/willreturn.ll
index c3482a417cb0..61bbbe0ae5fa 100644
--- a/llvm/test/Transforms/ADCE/willreturn.ll
+++ b/llvm/test/Transforms/ADCE/willreturn.ll
@@ -4,9 +4,10 @@
 declare void @may_not_return(i32) nounwind readnone
 declare void @will_return(i32) nounwind readnone willreturn
 
-; FIXME: This is a miscompile.
 define void @test(i32 %a) {
 ; CHECK-LABEL: @test(
+; CHECK-NEXT:[[B:%.*]] = add i32 [[A:%.*]], 1
+; CHECK-NEXT:call void @may_not_return(i32 [[B]])
 ; CHECK-NEXT:ret void
 ;
   %b = add i32 %a, 1

diff --git a/llvm/test/Transforms/BDCE/dce-pure.ll b/llvm/test/Transforms/BDCE/dce-pure.ll
index a487a04db611..e00121d0c9e9 100644
--- a/llvm/test/Transforms/BDCE/dce-pure.ll
+++ b/llvm/test/Transforms/BDCE/dce-pure.ll
@@ -1,7 +1,7 @@
 ; RUN: opt -bdce -S < %s | FileCheck %s
 ; RUN: opt -p

[llvm-branch-commits] [lld] 17daef8 - [LLD] Fix tests after D96993

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-02-19T16:32:09-08:00
New Revision: 17daef8bfdfd3a78465122f968a93df6db42dca6

URL: 
https://github.com/llvm/llvm-project/commit/17daef8bfdfd3a78465122f968a93df6db42dca6
DIFF: 
https://github.com/llvm/llvm-project/commit/17daef8bfdfd3a78465122f968a93df6db42dca6.diff

LOG: [LLD] Fix tests after D96993

We now need mustprogress to eliminate these calls. The code doesn't
really make sense, but that's not the point of the test...

(cherry picked from commit ac065b7a37d6dd8daacd526f6c3a0d1563bc88ac)

Added: 


Modified: 
lld/test/ELF/lto/parallel.ll
lld/test/wasm/lto/parallel.ll

Removed: 




diff  --git a/lld/test/ELF/lto/parallel.ll b/lld/test/ELF/lto/parallel.ll
index d9cb4fed7bfa..d89431e8b4a1 100644
--- a/lld/test/ELF/lto/parallel.ll
+++ b/lld/test/ELF/lto/parallel.ll
@@ -14,7 +14,7 @@ target triple = "x86_64-unknown-linux-gnu"
 ; CHECK0-NOT: bar
 ; CHECK0: T foo
 ; CHECK0-NOT: bar
-define void @foo() {
+define void @foo() mustprogress {
   call void @bar()
   ret void
 }
@@ -22,7 +22,7 @@ define void @foo() {
 ; CHECK1-NOT: foo
 ; CHECK1: T bar
 ; CHECK1-NOT: foo
-define void @bar() {
+define void @bar() mustprogress {
   call void @foo()
   ret void
 }

diff  --git a/lld/test/wasm/lto/parallel.ll b/lld/test/wasm/lto/parallel.ll
index a93c3558d969..261cf2ef7dae 100644
--- a/lld/test/wasm/lto/parallel.ll
+++ b/lld/test/wasm/lto/parallel.ll
@@ -10,7 +10,7 @@ target triple = "wasm32-unknown-unknown-wasm"
 ; CHECK0-NOT: bar
 ; CHECK0: T foo
 ; CHECK0-NOT: bar
-define void @foo() {
+define void @foo() mustprogress {
   call void @bar()
   ret void
 }
@@ -18,7 +18,7 @@ define void @foo() {
 ; CHECK1-NOT: foo
 ; CHECK1: T bar
 ; CHECK1-NOT: foo
-define void @bar() {
+define void @bar() mustprogress {
   call void @foo()
   ret void
 }





[llvm-branch-commits] [clang] a338d57 - [clang] functions with the 'const' or 'pure' attribute must always return.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Jeroen Dobbelaere
Date: 2021-02-19T16:33:38-08:00
New Revision: a338d577bb4fbf9013cf0c22c211d25bf3c41a26

URL: 
https://github.com/llvm/llvm-project/commit/a338d577bb4fbf9013cf0c22c211d25bf3c41a26
DIFF: 
https://github.com/llvm/llvm-project/commit/a338d577bb4fbf9013cf0c22c211d25bf3c41a26.diff

LOG: [clang] functions with the 'const' or 'pure' attribute must always return.

As described in
* 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
* 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute

An `__attribute__((pure))` function must always return, as well as an 
`__attribute__((const))` function.
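
For illustration, a small sketch of functions that satisfy both attributes' contracts. The function names `square` and `sumOf` are hypothetical; `__attribute__((const))` and `__attribute__((pure))` are GNU extensions accepted by GCC and Clang, not standard C++. Both functions terminate for every input, which is the property this patch lets the optimizer rely on by adding `willreturn`.

```cpp
#include <cstddef>

// 'const': the result depends only on the argument values; no memory is read.
__attribute__((const)) int square(int x) { return x * x; }

// 'pure': may read memory (the array) but has no observable side effects.
// The loop is bounded by n, so the function always returns.
__attribute__((pure)) long sumOf(const int *data, std::size_t n) {
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```

A function marked this way that could loop forever would break the contract in the GCC documentation linked above, so a call with an unused result may be dropped.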

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96960

(cherry picked from commit 46757ccb49ab88da54ca8ddd43665d5255ee80f7)

Added: 


Modified: 
clang/lib/CodeGen/CGCall.cpp
clang/test/CodeGen/complex-builtins.c
clang/test/CodeGen/complex-libcalls.c
clang/test/CodeGen/function-attributes.c
clang/test/CodeGenCXX/2009-05-04-PureConstNounwind.cpp
clang/test/Sema/libbuiltins-ctype-powerpc64.c
clang/test/Sema/libbuiltins-ctype-x86_64.c

Removed: 




diff  --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 42801372189b..bc7582c67989 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1995,9 +1995,14 @@ void CodeGenModule::ConstructAttributeList(
 if (TargetDecl->hasAttr()) {
   FuncAttrs.addAttribute(llvm::Attribute::ReadNone);
   FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+  // gcc specifies that 'const' functions have greater restrictions than
+  // 'pure' functions, so they also cannot have infinite loops.
+  FuncAttrs.addAttribute(llvm::Attribute::WillReturn);
 } else if (TargetDecl->hasAttr()) {
   FuncAttrs.addAttribute(llvm::Attribute::ReadOnly);
   FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+  // gcc specifies that 'pure' functions cannot have infinite loops.
+  FuncAttrs.addAttribute(llvm::Attribute::WillReturn);
 } else if (TargetDecl->hasAttr()) {
   FuncAttrs.addAttribute(llvm::Attribute::ArgMemOnly);
   FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);

diff --git a/clang/test/CodeGen/complex-builtins.c b/clang/test/CodeGen/complex-builtins.c
index 96c0e7117016..6fea8a9f028c 100644
--- a/clang/test/CodeGen/complex-builtins.c
+++ b/clang/test/CodeGen/complex-builtins.c
@@ -133,7 +133,7 @@ void foo(float f) {
 // NO__ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]]
 // HAS_ERRNO: declare { double, double } @cproj(double, double) [[READNONE:#[0-9]+]]
 // HAS_ERRNO: declare <2 x float> @cprojf(<2 x float>) [[READNONE]]
-// HAS_ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]]
+// HAS_ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[WILLRETURN_NOT_READNONE:#[0-9]+]]
 
   __builtin_cpow(f,f);   __builtin_cpowf(f,f);  __builtin_cpowl(f,f);
 
@@ -202,3 +202,4 @@ void foo(float f) {
 
 // HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} }
 // HAS_ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} }
+// HAS_ERRNO: attributes [[WILLRETURN_NOT_READNONE]] = { nounwind willreturn {{.*}} }

diff --git a/clang/test/CodeGen/complex-libcalls.c b/clang/test/CodeGen/complex-libcalls.c
index 9bd419a83821..44d6849c0a71 100644
--- a/clang/test/CodeGen/complex-libcalls.c
+++ b/clang/test/CodeGen/complex-libcalls.c
@@ -133,7 +133,7 @@ void foo(float f) {
 // NO__ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]]
 // HAS_ERRNO: declare { double, double } @cproj(double, double) [[READNONE:#[0-9]+]]
 // HAS_ERRNO: declare <2 x float> @cprojf(<2 x float>) [[READNONE]]
-// HAS_ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[NOT_READNONE]]
+// HAS_ERRNO: declare { x86_fp80, x86_fp80 } @cprojl({ x86_fp80, x86_fp80 }* byval({ x86_fp80, x86_fp80 }) align 16) [[WILLRETURN_NOT_READNONE:#[0-9]+]]
 
   cpow(f,f);   cpowf(f,f);  cpowl(f,f);
 
@@ -202,3 +202,4 @@ void foo(float f) {
 
 // HAS_ERRNO: attributes [[NOT_READNONE]] = { nounwind {{.*}} }
 // HAS_ERRNO: attributes [[READNONE]] = { {{.*}}readnone{{.*}} }
+// HAS_ERRNO: attributes [[WILLRETURN_NOT_READNONE]] = { nounwind willreturn {{.*}} }

diff --git a/clang/test/CodeGen/function-attributes.c b/clang/test/CodeGen/function-attributes.c
index ffb86a6cd272..f14f24801006 100644
--- a/clang/test/CodeGen/function-attributes.c
+++ b/clang/test/CodeGen/function-attributes.c
@@ -115,5 +115,5 @@ void f20(void) {
 // CHECK: attribute

[llvm-branch-commits] [openmp] 2f74c22 - [OpenMP][NVPTX] Add the support for CUDA 11.2 and CUDA 11.1

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Shilei Tian
Date: 2021-02-19T17:14:23-08:00
New Revision: 2f74c22048277d255078d376b55dd40dddbaa376

URL: 
https://github.com/llvm/llvm-project/commit/2f74c22048277d255078d376b55dd40dddbaa376
DIFF: 
https://github.com/llvm/llvm-project/commit/2f74c22048277d255078d376b55dd40dddbaa376.diff

LOG: [OpenMP][NVPTX] Add the support for CUDA 11.2 and CUDA 11.1

CUDA 11.2 and CUDA 11.1 are all available now.
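
The two parallel CMake lists updated by this patch pair each CUDA version with a PTX feature level. The sketch below restates that pairing as a C++ lookup table purely for illustration. The names `kCudaToPtx` and `ptxFeatureFor` are hypothetical, and the fallback to the last entry for unknown versions is an assumption based on the CMake comment that the last element is the default case.

```cpp
#include <array>
#include <utility>

// (cuda_version, ptx_feature) pairs, in the order of the updated CMake lists.
constexpr std::array<std::pair<int, int>, 10> kCudaToPtx = {{
    {112, 71}, {111, 71}, {110, 70}, {102, 65}, {101, 64},
    {100, 63}, {92, 61},  {91, 61},  {90, 60},  {80, 42},
}};

// Returns the PTX feature level for a CUDA version such as 112 (CUDA 11.2).
// Unknown versions fall back to the last entry (assumed default case).
int ptxFeatureFor(int cudaVersion) {
  for (const auto &entry : kCudaToPtx)
    if (entry.first == cudaVersion)
      return entry.second;
  return kCudaToPtx.back().second;
}
```

Both new versions, 11.2 and 11.1, map to the same PTX level 71 in the patch.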

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97004

(cherry picked from commit 89827fd404f954605663776e746ec351bde61348)

Added: 


Modified: 
openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt

Removed: 




diff --git a/openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt b/openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
index b705e0bb6a9f..5478cd3f6aea 100644
--- a/openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
+++ b/openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
@@ -152,8 +152,8 @@ add_custom_target(omptarget-nvptx-bc)
 
 # This map is from clang/lib/Driver/ToolChains/Cuda.cpp.
 # The last element is the default case.
-set(cuda_version_list 110 102 101 100 92 91 90 80)
-set(ptx_feature_list 70 65 64 63 61 61 60 42)
+set(cuda_version_list 112 111 110 102 101 100 92 91 90 80)
+set(ptx_feature_list 71 71 70 65 64 63 61 61 60 42)
 # The following two lines of ugly code is not needed when the minimal CMake
 # version requirement is 3.17+.
 list(LENGTH cuda_version_list num_version_supported)



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang-tools-extra] 34e8fd5 - [clangd] Treat "null" optional fields as missing

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Kadir Cetinkaya
Date: 2021-02-19T19:38:58-08:00
New Revision: 34e8fd50391923ec4d81ec98837655107071

URL: 
https://github.com/llvm/llvm-project/commit/34e8fd50391923ec4d81ec98837655107071
DIFF: 
https://github.com/llvm/llvm-project/commit/34e8fd50391923ec4d81ec98837655107071.diff

LOG: [clangd] Treat "null" optional fields as missing

Clangd currently throws away any protocol messages whenever an optional
field has an unexpected type. This patch changes the behaviour to treat
`null` fields as missing.

This makes clangd more tolerant of small violations of the LSP spec.
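
The "missing or null" rule can be sketched without LLVM's JSON library. The
following is only an illustration under stated assumptions: `Null`, `Field`,
and `Object` are hypothetical stand-ins for `llvm::json` types, and this
`mapOptOrNull` is a simplified model, not the helper added by the patch.

```cpp
#include <map>
#include <optional>
#include <string>
#include <variant>

// Hypothetical toy model of a JSON object; not llvm::json.
struct Null {};
using Field = std::variant<Null, bool, std::string>;
using Object = std::map<std::string, Field>;

// Returns true when the property is absent, null, or holds a value of type T.
// Returns false only when the property is present with an unexpected type.
template <typename T>
bool mapOptOrNull(const Object &O, const std::string &Prop,
                  std::optional<T> &Out) {
  auto It = O.find(Prop);
  // Field is missing or null: leave Out untouched and report success.
  if (It == O.end() || std::holds_alternative<Null>(It->second))
    return true;
  if (const T *V = std::get_if<T>(&It->second)) {
    Out = *V;
    return true;
  }
  return false;
}
```

In this model a message carrying `"forceRebuild": null` parses successfully and
leaves the optional empty, while a field of the wrong type still fails.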

Fixes https://github.com/clangd/vscode-clangd/issues/134

Differential Revision: https://reviews.llvm.org/D95229

(cherry picked from commit af20232b8e189335da571f48c2467b244b7fd772)

Added: 


Modified: 
clang-tools-extra/clangd/Protocol.cpp

Removed: 




diff  --git a/clang-tools-extra/clangd/Protocol.cpp 
b/clang-tools-extra/clangd/Protocol.cpp
index 78110dc0de60..76cf813e6808 100644
--- a/clang-tools-extra/clangd/Protocol.cpp
+++ b/clang-tools-extra/clangd/Protocol.cpp
@@ -27,6 +27,21 @@
 
 namespace clang {
 namespace clangd {
+namespace {
+
+// Helper that doesn't treat `null` and absent fields as failures.
+template <typename T>
+bool mapOptOrNull(const llvm::json::Value &Params, llvm::StringLiteral Prop,
+  T &Out, llvm::json::Path P) {
+  auto *O = Params.getAsObject();
+  assert(O);
+  auto *V = O->get(Prop);
+  // Field is missing or null.
+  if (!V || V->getAsNull().hasValue())
+return true;
+  return fromJSON(*V, Out, P.field(Prop));
+}
+} // namespace
 
 char LSPError::ID;
 
@@ -490,7 +505,7 @@ bool fromJSON(const llvm::json::Value &Params, 
DidChangeTextDocumentParams &R,
   return O && O.map("textDocument", R.textDocument) &&
  O.map("contentChanges", R.contentChanges) &&
  O.map("wantDiagnostics", R.wantDiagnostics) &&
- O.mapOptional("forceRebuild", R.forceRebuild);
+ mapOptOrNull(Params, "forceRebuild", R.forceRebuild, P);
 }
 
 bool fromJSON(const llvm::json::Value &E, FileChangeType &Out,
@@ -580,10 +595,10 @@ bool fromJSON(const llvm::json::Value &Params, Diagnostic 
&R,
   llvm::json::Path P) {
   llvm::json::ObjectMapper O(Params, P);
   return O && O.map("range", R.range) && O.map("message", R.message) &&
- O.mapOptional("severity", R.severity) &&
- O.mapOptional("category", R.category) &&
- O.mapOptional("code", R.code) && O.mapOptional("source", R.source);
-  return true;
+ mapOptOrNull(Params, "severity", R.severity, P) &&
+ mapOptOrNull(Params, "category", R.category, P) &&
+ mapOptOrNull(Params, "code", R.code, P) &&
+ mapOptOrNull(Params, "source", R.source, P);
 }
 
 llvm::json::Value toJSON(const PublishDiagnosticsParams &PDP) {
@@ -818,7 +833,7 @@ bool fromJSON(const llvm::json::Value &Params, 
CompletionContext &R,
   llvm::json::ObjectMapper O(Params, P);
   int TriggerKind;
   if (!O || !O.map("triggerKind", TriggerKind) ||
-  !O.mapOptional("triggerCharacter", R.triggerCharacter))
+  !mapOptOrNull(Params, "triggerCharacter", R.triggerCharacter, P))
 return false;
   R.triggerKind = static_cast<CompletionTriggerKind>(TriggerKind);
   return true;
@@ -1121,8 +1136,8 @@ bool fromJSON(const llvm::json::Value &Params, 
ConfigurationSettings &S,
   llvm::json::ObjectMapper O(Params, P);
   if (!O)
 return true; // 'any' type in LSP.
-  return O.mapOptional("compilationDatabaseChanges",
-   S.compilationDatabaseChanges);
+  return mapOptOrNull(Params, "compilationDatabaseChanges",
+  S.compilationDatabaseChanges, P);
 }
 
 bool fromJSON(const llvm::json::Value &Params, InitializationOptions &Opts,
@@ -1133,8 +1148,8 @@ bool fromJSON(const llvm::json::Value &Params, 
InitializationOptions &Opts,
 
   return fromJSON(Params, Opts.ConfigSettings, P) &&
  O.map("compilationDatabasePath", Opts.compilationDatabasePath) &&
- O.mapOptional("fallbackFlags", Opts.fallbackFlags) &&
- O.mapOptional("clangdFileStatus", Opts.FileStatus);
+ mapOptOrNull(Params, "fallbackFlags", Opts.fallbackFlags, P) &&
+ mapOptOrNull(Params, "clangdFileStatus", Opts.FileStatus, P);
 }
 
 bool fromJSON(const llvm::json::Value &E, TypeHierarchyDirection &Out,
@@ -1190,10 +1205,11 @@ bool fromJSON(const llvm::json::Value &Params, 
TypeHierarchyItem &I,
   return O && O.map("name", I.name) && O.map("kind", I.kind) &&
  O.map("uri", I.uri) && O.map("range", I.range) &&
  O.map("selectionRange", I.selectionRange) &&
- O.mapOptional("detail", I.detail) &&
- O.mapOptional("deprecated", I.deprecated) &&
- O.mapOptional("parents", I.parents) &&
- O.mapOptional("children", I.children) && O.mapOptional("data", 
I.data);
+ mapOptOrNull(Params, "detail", I.detail, P) &&
+ mapOptOrNull(Para

[llvm-branch-commits] [llvm] b1106a5 - [llvm-dwp] Join dwo paths correctly when DWOPath is absolute

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Simonas Kazlauskas
Date: 2021-02-19T19:41:52-08:00
New Revision: b1106a5b3bc94f6da11682007d101823f81bad30

URL: 
https://github.com/llvm/llvm-project/commit/b1106a5b3bc94f6da11682007d101823f81bad30
DIFF: 
https://github.com/llvm/llvm-project/commit/b1106a5b3bc94f6da11682007d101823f81bad30.diff

LOG: [llvm-dwp] Join dwo paths correctly when DWOPath is absolute

When the `DWOPath` is absolute, we want to use `DWOPath` as is, without
prepending any other components to the path. `sys::path::append` does not
join, but rather unconditionally appends the paths, so something like
`sys::path::append("/tmp", "/tmp/banana")` will result in `/tmp/tmp/banana`
rather than the desired `/tmp/banana`.

This then causes `llvm-dwp` to fail in the following situation:

```
$ clang -gsplit-dwarf /tmp/banana/test.c -c -o /tmp/outdir/foo.o
$ clang outdir/foo.o -o outdir/hm
$ llvm-dwarfdump outdir/hm | grep -C2 foo.dwo
  DW_AT_comp_dir("/tmp")
  DW_AT_GNU_pubnames  (true)
  DW_AT_GNU_dwo_name("/tmp/outdir/foo.dwo")
DW_AT_GNU_dwo_id(0xde4d396f3bf0e257)
  DW_AT_low_pc  (0x00401100)
$ strace -o trace llvm-dwp -e outdir/hm -o outdir/hm.dwp
error: No such file or directory
$ cat trace | grep foo.dwo
openat(AT_FDCWD, "/tmp/tmp/outdir/foo.dwo", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```
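
The difference between appending and joining can be sketched with the C++17
standard library instead of LLVM's path API. This is only an analogy under
stated assumptions: `joinDwoPath` and `appendDwoPath` are hypothetical helpers,
not functions in llvm-dwp, and the sketch assumes POSIX-style paths.

```cpp
#include <cstddef>
#include <filesystem>
#include <string>

// Hypothetical helper: resolve a DWO name against the compilation directory,
// keeping the DWO name as is when it is already absolute.
std::string joinDwoPath(const std::string &CompDir,
                        const std::string &DWOName) {
  namespace fs = std::filesystem;
  fs::path Name(DWOName);
  if (Name.is_absolute())
    return Name.string();
  return (fs::path(CompDir) / Name).string();
}

// Hypothetical helper mimicking the unconditional append described above:
// the compilation directory is always prepended, even to an absolute name.
std::string appendDwoPath(const std::string &CompDir,
                          const std::string &DWOName) {
  std::size_t Start = DWOName.find_first_not_of('/');
  std::string Tail = Start == std::string::npos ? "" : DWOName.substr(Start);
  return CompDir + "/" + Tail;
}
```

With a compilation directory of `/tmp` and a DWO name of
`/tmp/outdir/foo.dwo`, the appending variant yields the nonexistent
`/tmp/tmp/outdir/foo.dwo` seen in the trace, while the joining variant keeps
`/tmp/outdir/foo.dwo`.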

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D96678

(cherry picked from commit 6ffcb2937c96bd0d7a55b984b5eb8f381b68e322)

Added: 
llvm/test/tools/llvm-dwp/X86/absolute_paths.test

Modified: 
llvm/tools/llvm-dwp/llvm-dwp.cpp

Removed: 




diff  --git a/llvm/test/tools/llvm-dwp/X86/absolute_paths.test 
b/llvm/test/tools/llvm-dwp/X86/absolute_paths.test
new file mode 100644
index ..1e3d27e7323b
--- /dev/null
+++ b/llvm/test/tools/llvm-dwp/X86/absolute_paths.test
@@ -0,0 +1,37 @@
+; RUN: rm -rf %t
+; RUN: mkdir -p %t
+; RUN: llc %s -mtriple=x86_64-linux --split-dwarf-file=%t/test.dwo 
--split-dwarf-output=%t/test.dwo --filetype=obj -o %t/test.o
+; RUN: llvm-dwarfdump -v %t/test.dwo | FileCheck %s -DPATH=%t
+; RUN: llvm-dwp -e %t/test.o -o %t/test.dwp
+; RUN: llvm-dwarfdump -v %t/test.dwp | FileCheck %s -DPATH=%t
+
+; CHECK-LABEL: .debug_abbrev.dwo contents:
+; CHECK: DW_AT_name
+; CHECK: DW_AT_GNU_dwo_name
+; CHECK: DW_AT_name
+; CHECK-LABEL: .debug_str.dwo contents:
+; CHECK: "banana"
+; CHECK: "/tmp/test.c"
+; CHECK: "[[PATH]]/test.dwo"
+
+define void @banana() !dbg !8 {
+  ret void, !dbg !12
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!3, !4, !5, !6}
+!llvm.ident = !{!7}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang 
version 11.0.1", isOptimized: true, runtimeVersion: 0, splitDebugFilename: 
"test.dwo", emissionKind: FullDebug, enums: !2, splitDebugInlining: false, 
nameTableKind: GNU)
+!1 = !DIFile(filename: "/tmp/test.c", directory: "/tmp")
+!2 = !{}
+!3 = !{i32 7, !"Dwarf Version", i32 4}
+!4 = !{i32 2, !"Debug Info Version", i32 3}
+!5 = !{i32 1, !"wchar_size", i32 4}
+!6 = !{i32 7, !"PIC Level", i32 2}
+!7 = !{!"clang version 11.0.1"}
+!8 = distinct !DISubprogram(name: "banana", scope: !9, file: !9, line: 1, 
type: !10, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, 
spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2)
+!9 = !DIFile(filename: "test.c", directory: "/tmp")
+!10 = !DISubroutineType(types: !11)
+!11 = !{null}
+!12 = !DILocation(line: 1, column: 20, scope: !8)

diff  --git a/llvm/tools/llvm-dwp/llvm-dwp.cpp 
b/llvm/tools/llvm-dwp/llvm-dwp.cpp
index 9aed3526b0aa..d495bd3d4cab 100644
--- a/llvm/tools/llvm-dwp/llvm-dwp.cpp
+++ b/llvm/tools/llvm-dwp/llvm-dwp.cpp
@@ -526,8 +526,8 @@ getDWOFilenames(StringRef ExecFilename) {
 std::string DWOCompDir =
 dwarf::toString(Die.find(dwarf::DW_AT_comp_dir), "");
 if (!DWOCompDir.empty()) {
-  SmallString<16> DWOPath;
-  sys::path::append(DWOPath, DWOCompDir, DWOName);
+  SmallString<16> DWOPath(std::move(DWOName));
+  sys::fs::make_absolute(DWOCompDir, DWOPath);
   DWOPaths.emplace_back(DWOPath.data(), DWOPath.size());
 } else {
   DWOPaths.push_back(std::move(DWOName));





[llvm-branch-commits] [llvm] 0d4f8a3 - [llvm-symbolizer] - Fix the crash in GNU output style with --no-inlines and missing input file.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Georgii Rymar
Date: 2021-02-19T21:12:53-08:00
New Revision: 0d4f8a3f394f55b5fde7033bf009e5dacea1a775

URL: 
https://github.com/llvm/llvm-project/commit/0d4f8a3f394f55b5fde7033bf009e5dacea1a775
DIFF: 
https://github.com/llvm/llvm-project/commit/0d4f8a3f394f55b5fde7033bf009e5dacea1a775.diff

LOG: [llvm-symbolizer] - Fix the crash in GNU output style with --no-inlines 
and missing input file.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48882.

If the input file does not exist (or has a reading error), the
following code will crash if there are two or more input addresses.

```
auto ResOrErr = Symbolizer.symbolizeInlinedCode(
  ModuleName, {Offset, object::SectionedAddress::UndefSection});
Printer << (error(ResOrErr) ? DILineInfo() : ResOrErr.get().getFrame(0));
```

For the first address, `symbolizeInlinedCode` returns an error.
For the second address, `symbolizeInlinedCode` returns an empty result
(not an error) and `.getFrame(0)` will crash.
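
The guard this calls for can be sketched in isolation. The types below are
hypothetical stand-ins (a `std::optional` models the error case, a vector
models the list of frames); they are not the `DIInliningInfo` or `Expected`
types used by llvm-symbolizer.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbolized frame; "??" marks an unknown function.
struct FrameInfo {
  std::string FunctionName = "??";
};
using InliningInfo = std::vector<FrameInfo>;

// Returns the first frame, or a default-constructed placeholder when the
// lookup failed (empty optional) or succeeded with zero frames.
FrameInfo firstFrameOrDefault(const std::optional<InliningInfo> &Res) {
  if (!Res || Res->empty())
    return FrameInfo();
  return Res->front();
}
```

Checking only for the error case is not enough here: an empty but successful
result must also fall back to the placeholder before any frame is accessed.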

Differential revision: https://reviews.llvm.org/D95609

(cherry picked from commit d22140687500f90830fe416d9c1e317f7c4535d5)

Added: 


Modified: 
llvm/test/tools/llvm-symbolizer/output-style-inlined.test
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp

Removed: 




diff  --git a/llvm/test/tools/llvm-symbolizer/output-style-inlined.test 
b/llvm/test/tools/llvm-symbolizer/output-style-inlined.test
index 7e9f7e7ce180..1b8e3a2f22fb 100644
--- a/llvm/test/tools/llvm-symbolizer/output-style-inlined.test
+++ b/llvm/test/tools/llvm-symbolizer/output-style-inlined.test
@@ -28,3 +28,24 @@ RUN:   | FileCheck %s --check-prefix=LLVM 
--implicit-check-not=inctwo
 
 LLVM: main
 GNU: inctwo
+
+## Check that we are able to produce an output properly when the --no-inlines 
option
+## is specified, but a file doesn't exist. Check we report an error.
+
+RUN: llvm-symbolizer --output-style=GNU --obj=%p/Inputs/not.exist 0x1 0x2 
--no-inlines 2>&1 \
+RUN:   | FileCheck %s --check-prefix=NOT-EXIST-GNU -DMSG=%errc_ENOENT
+RUN: llvm-symbolizer --output-style=LLVM --obj=%p/Inputs/not.exist 0x1 0x2 
--no-inlines 2>&1 \
+RUN:   | FileCheck %s --check-prefix=NOT-EXIST-LLVM -DMSG=%errc_ENOENT
+
+# NOT-EXIST-GNU:  LLVMSymbolizer: error reading file: [[MSG]]
+# NOT-EXIST-GNU-NEXT: ??
+# NOT-EXIST-GNU-NEXT: ??:0
+# NOT-EXIST-GNU-NEXT: ??
+# NOT-EXIST-GNU-NEXT: ??:0
+
+# NOT-EXIST-LLVM:   LLVMSymbolizer: error reading file: [[MSG]]
+# NOT-EXIST-LLVM-NEXT:  ??
+# NOT-EXIST-LLVM-NEXT:  ??:0:0
+# NOT-EXIST-LLVM-EMPTY:
+# NOT-EXIST-LLVM-NEXT:  ??
+# NOT-EXIST-LLVM-NEXT:  ??:0:0

diff  --git a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp 
b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
index 9c68acee0ae2..8734c2d74045 100644
--- a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
+++ b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
@@ -181,7 +181,12 @@ static void symbolizeInput(const opt::InputArgList &Args, 
uint64_t AdjustVMA,
 // the topmost function, which suits our needs better.
 auto ResOrErr = Symbolizer.symbolizeInlinedCode(
 ModuleName, {Offset, object::SectionedAddress::UndefSection});
-Printer << (error(ResOrErr) ? DILineInfo() : ResOrErr.get().getFrame(0));
+if (!ResOrErr || ResOrErr->getNumberOfFrames() == 0) {
+  error(ResOrErr);
+  Printer << DILineInfo();
+} else {
+  Printer << ResOrErr->getFrame(0);
+}
   } else {
 auto ResOrErr = Symbolizer.symbolizeCode(
 ModuleName, {Offset, object::SectionedAddress::UndefSection});





[llvm-branch-commits] [llvm] d3f9f51 - [SROA] Propagate correct TBAA/TBAA Struct offsets

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: William S. Moses
Date: 2021-02-19T21:14:48-08:00
New Revision: d3f9f512a47f10d27a9e32edaaa7513a64b0ec17

URL: 
https://github.com/llvm/llvm-project/commit/d3f9f512a47f10d27a9e32edaaa7513a64b0ec17
DIFF: 
https://github.com/llvm/llvm-project/commit/d3f9f512a47f10d27a9e32edaaa7513a64b0ec17.diff

LOG: [SROA] Propagate correct TBAA/TBAA Struct offsets

SROA does not correctly account for offsets in TBAA/TBAA struct metadata.
This patch creates functionality for generating new MD with the corresponding
offset and updates SROA to use this functionality.
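
The offset arithmetic can be sketched on a plain (offset, size) pair. This
mirrors the new-format branch of `ShiftTBAA` in the diff below, but
`AccessRange` and `shiftRange` are hypothetical names used only for
illustration, and metadata nodes are replaced by a simple struct.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the offset and size carried by a new-format TBAA tag.
struct AccessRange {
  std::uint64_t Offset;
  std::uint64_t Size;
};

// Re-express the range relative to a pointer advanced by Shift bytes.
// Returns std::nullopt when the whole range lies before the new start.
std::optional<AccessRange> shiftRange(AccessRange R, std::uint64_t Shift) {
  if (R.Offset + R.Size <= Shift)
    return std::nullopt;
  // The range straddles the new start: clamp the offset, shrink the size.
  if (R.Offset < Shift)
    return AccessRange{0, R.Size - (Shift - R.Offset)};
  return AccessRange{R.Offset - Shift, R.Size};
}
```

For example, an access covering bytes 8 to 11 becomes bytes 4 to 7 after a
shift of 4, while an access covering bytes 0 to 7 is clamped to the 4 bytes
that remain after the new start.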

Differential Revision: https://reviews.llvm.org/D95826

(cherry picked from commit 40862b1a7486a969ff044cd240aad24f4183cc10)

Added: 
llvm/test/Transforms/SROA/tbaa-struct2.ll

Modified: 
llvm/include/llvm/IR/Metadata.h
llvm/include/llvm/IR/Operator.h
llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
llvm/lib/IR/Operator.cpp
llvm/lib/Transforms/Scalar/SROA.cpp
llvm/test/Transforms/SROA/basictest.ll

Removed: 




diff  --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index 0b87416befe9..9a4480b75a30 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -667,6 +667,12 @@ struct AAMDNodes {
   /// The tag specifying the noalias scope.
   MDNode *NoAlias = nullptr;
 
+  // Shift tbaa Metadata node to start off bytes later
+  static MDNode *ShiftTBAA(MDNode *M, size_t off);
+
+  // Shift tbaa.struct Metadata node to start off bytes later
+  static MDNode *ShiftTBAAStruct(MDNode *M, size_t off);
+
   /// Given two sets of AAMDNodes that apply to the same pointer,
   /// give the best AAMDNodes that are compatible with both (i.e. a set of
   /// nodes whose allowable aliasing conclusions are a subset of those
@@ -680,6 +686,18 @@ struct AAMDNodes {
 Result.NoAlias = Other.NoAlias == NoAlias ? NoAlias : nullptr;
 return Result;
   }
+
+  /// Create a new AAMDNode that describes this AAMDNode after applying a
+  /// constant offset to the start of the pointer
+  AAMDNodes shift(size_t Offset) {
+AAMDNodes Result;
+Result.TBAA = TBAA ? ShiftTBAA(TBAA, Offset) : nullptr;
+Result.TBAAStruct =
+TBAAStruct ? ShiftTBAAStruct(TBAAStruct, Offset) : nullptr;
+Result.Scope = Scope;
+Result.NoAlias = NoAlias;
+return Result;
+  }
 };
 
 // Specialize DenseMapInfo for AAMDNodes.

diff  --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index acfacbd6c74e..945f7e46e142 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -568,6 +568,11 @@ class GEPOperator
   bool accumulateConstantOffset(
   const DataLayout &DL, APInt &Offset,
   function_ref ExternalAnalysis = nullptr) const;
+
+  static bool accumulateConstantOffset(
+  Type *SourceType, ArrayRef Index, const DataLayout &DL,
+  APInt &Offset,
+  function_ref ExternalAnalysis = nullptr);
 };
 
 class PtrToIntOperator

diff  --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index 7d97fc5da9b0..268acb682cf1 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -737,3 +737,84 @@ bool TypeBasedAAWrapperPass::doFinalization(Module &M) {
 void TypeBasedAAWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
   AU.setPreservesAll();
 }
+
+MDNode *AAMDNodes::ShiftTBAA(MDNode *MD, size_t Offset) {
+  // Fast path if there's no offset
+  if (Offset == 0)
+return MD;
+  // Fast path if there's no path tbaa node (and thus scalar)
+  if (!isStructPathTBAA(MD))
+return MD;
+
+  TBAAStructTagNode Tag(MD);
+  SmallVector Sub;
+  Sub.push_back(MD->getOperand(0));
+  Sub.push_back(MD->getOperand(1));
+  ConstantInt *InnerOffset = mdconst::extract<ConstantInt>(MD->getOperand(2));
+
+  if (Tag.isNewFormat()) {
+ConstantInt *InnerSize = mdconst::extract<ConstantInt>(MD->getOperand(3));
+
+if (InnerOffset->getZExtValue() + InnerSize->getZExtValue() <= Offset) {
+  return nullptr;
+}
+
+uint64_t NewSize = InnerSize->getZExtValue();
+uint64_t NewOffset = InnerOffset->getZExtValue() - Offset;
+if (InnerOffset->getZExtValue() < Offset) {
+  NewOffset = 0;
+  NewSize -= Offset - InnerOffset->getZExtValue();
+}
+
+Sub.push_back(ConstantAsMetadata::get(
+ConstantInt::get(InnerOffset->getType(), NewOffset)));
+
+Sub.push_back(ConstantAsMetadata::get(
+ConstantInt::get(InnerSize->getType(), NewSize)));
+
+// immutable type
+if (MD->getNumOperands() >= 5)
+  Sub.push_back(MD->getOperand(4));
+  } else {
+if (InnerOffset->getZExtValue() < Offset)
+  return nullptr;
+
+Sub.push_back(ConstantAsMetadata::get(ConstantInt::get(
+InnerOffset->getType(), InnerOffset->getZExtValue() - Offset)));
+
+// immutable type
+if (MD->getNumOperands() >= 4)
+  Sub.push_back(MD->getOperand(3));
+  }

[llvm-branch-commits] [llvm] a7629a2 - [CSSPGO] Fix MSVC initializing truncation warning (NFC)

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Yang Fan
Date: 2021-02-19T21:21:11-08:00
New Revision: a7629a2244a325b908ddbd4336aef25a7049bda9

URL: 
https://github.com/llvm/llvm-project/commit/a7629a2244a325b908ddbd4336aef25a7049bda9
DIFF: 
https://github.com/llvm/llvm-project/commit/a7629a2244a325b908ddbd4336aef25a7049bda9.diff

LOG: [CSSPGO] Fix MSVC initializing truncation warning (NFC)

MSVC warning:
```
\llvm-project\llvm\include\llvm\Transforms\IPO\SampleProfileProbe.h(65): 
warning C4305: 'initializing': truncation from 'double' to 'const float'
```

Added: 


Modified: 
llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h

Removed: 




diff  --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index cab893b50d19..0fd79d8ff7f3 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -62,7 +62,7 @@ class PseudoProbeVerifier {
 
 private:
   // Allow a little bias due the rounding to integral factors.
-  constexpr static float DistributionFactorVariance = 0.02;
+  constexpr static float DistributionFactorVariance = 0.02f;
   // Distribution factors from last pass.
   FuncProbeFactorMap FunctionProbeFactors;
 





[llvm-branch-commits] [llvm] 78b35e2 - [CSSPGO][llvm-profgen] Pseudo probe based CS profile generation

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:11-08:00
New Revision: 78b35e278a9f62c2a6cfe3c974155a7e9bb60361

URL: 
https://github.com/llvm/llvm-project/commit/78b35e278a9f62c2a6cfe3c974155a7e9bb60361
DIFF: 
https://github.com/llvm/llvm-project/commit/78b35e278a9f62c2a6cfe3c974155a7e9bb60361.diff

LOG: [CSSPGO][llvm-profgen] Pseudo probe based CS profile generation

This change implements the profile generation infrastructure for pseudo probes
in llvm-profgen. During virtual unwinding, the raw profile is extracted into a
range counter and a branch counter and aggregated into a sample counter map
indexed by the call stack context. This change introduces the last step and
produces the eventual profile. Specifically, the body of a function sample is
recorded by going through each probe in the range, and the callsite target
sample is recorded by extracting the callsite probe from the branch's source.

Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and
https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**Implementation**

- Extended `PseudoProbeProfileGenerator` for pseudo probe based profile
generation.
- `populateBodySamplesWithProbes` reads the range counter and is responsible
for recording function body samples and inferring the caller's body samples.
- `populateBoundarySamplesWithProbes` reads the branch counter and is
responsible for recording call site target samples.
- Each sample is recorded with its calling context (named `ContextId`). Recall
that the probe based context key doesn't include the leaf frame probe info, so
the `ContextId` string is created from two parts: one from the concatenation of
the probe stack strings and the other from the leaf frame probe.
- Added a regression test.

Test Plan:

ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92998

Added: 


Modified: 
llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfileGenerator.h
llvm/tools/llvm-profgen/ProfiledBinary.h
llvm/tools/llvm-profgen/PseudoProbe.cpp
llvm/tools/llvm-profgen/PseudoProbe.h

Removed: 




diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test 
b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
index 109f2f63e86d..19928322a66d 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
@@ -1,4 +1,21 @@
 ; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript 
--binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t 
--show-unwinder-output | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: FileCheck %s --input-file %t
+
+; CHECK: [main:2 @ foo]:74:0
+; CHECK-NEXT: 2: 15
+; CHECK-NEXT: 3: 15
+; CHECK-NEXT: 4: 14
+; CHECK-NEXT: 5: 1
+; CHECK-NEXT: 6: 15
+; CHECK-NEXT: 8: 14 bar:14
+; CHECK-NEXT: !CFGChecksum: 138950591924
+; CHECK-NEXT:[main:2 @ foo:8 @ bar]:56:14
+; CHECK-NEXT: 1: 14
+; CHECK-NEXT: 2: 14
+; CHECK-NEXT: 3: 14
+; CHECK-NEXT: 4: 14
+; CHECK-NEXT: !CFGChecksum: 72617220756
+
 
 ; CHECK-UNWINDER:  Binary(inline-cs-pseudoprobe.perfbin)'s Range Counter:
 ; CHECK-UNWINDER-EMPTY:

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test 
b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
index 2ac3f06587d9..0491a62ff69b 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
@@ -1,4 +1,20 @@
 ; RUN: llvm-profgen --perfscript=%S/Inputs/noinline-cs-pseudoprobe.perfscript 
--binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t 
--show-unwinder-output | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: FileCheck %s --input-file %t
+
+; CHECK: [main:2 @ foo]:75:0
+; CHECK-NEXT: 2: 15
+; CHECK-NEXT: 3: 15
+; CHECK-NEXT: 4: 15
+; CHECK-NEXT: 6: 15
+; CHECK-NEXT: 8: 15 bar:15
+; CHECK-NEXT: !CFGChecksum: 138950591924
+; CHECK-NEXT:[main:2 @ foo:8 @ bar]:60:15
+; CHECK-NEXT: 1: 15
+; CHECK-NEXT: 2: 15
+; CHECK-NEXT: 3: 15
+; CHECK-NEXT: 4: 15
+; CHECK-NEXT: !CFGChecksum: 72617220756
+
 
 ; CHECK-UNWINDER:  Binary(noinline-cs-pseudoprobe.perfbin)'s Range Counter:
 ; CHECK-UNWINDER-NEXT: main:2

diff  --git a/llvm/tools/llvm-profgen/PerfReader.cpp 
b/llvm/tools/llvm-profgen/PerfReader.cpp
index d08c15808cf4..64a502be59a9 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -567,11 +567,7 @@ void PerfReader::checkAndSetPerfType(
   }
 
   if (HasHybridPerf) {
-// Set up ProfileIsCS to enable context-sensitive functionalities
-// in SampleProf
-FunctionSamples::ProfileIsCS = true;
 PerfType = PERF_LBR_STACK;
-
   } else {
 // TODO: Support other type of perf script
 PerfType = PERF_INVILID;

diff  --git

[llvm-branch-commits] [llvm] 6209b07 - [CSSPGO][llvm-profgen] Compress recursive cycles in calling context

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:11-08:00
New Revision: 6209b0756d5df805f6279d3dadc8d2ba8648c3eb

URL: 
https://github.com/llvm/llvm-project/commit/6209b0756d5df805f6279d3dadc8d2ba8648c3eb
DIFF: 
https://github.com/llvm/llvm-project/commit/6209b0756d5df805f6279d3dadc8d2ba8648c3eb.diff

LOG: [CSSPGO][llvm-profgen] Compress recursive cycles in calling context

This change compresses the context string by removing cycles caused by
recursive functions during CS profile generation. Removing recursion cycles
normalizes the calling context, which improves sample aggregation and makes
context promotion deterministic.
In the implementation, adjacent repeated frames are recognized as cycles and
deduplicated over multiple rounds of iteration.
For example, consider the input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The first iteration removes all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The second iteration removes all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
A third iteration removes the adjacent repeated frames of size 3, so in the
end we get the compressed output:
[“a”, “b”, “c”, “d”]

Compression is called in two places: once for the sample's context key right
after unwinding, and once for the eventual context string ID in the
ProfileGenerator.
Added a switch `compress-recursion` to control the maximum size of duplicated
frames; the default of -1 means no size limit.
Added unit tests and regression test for this.
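
The rounds described above can be sketched as a small standalone function.
This is a minimal sketch that reproduces the worked example, not the
llvm-profgen implementation: `compressRecursion` and its `MaxCycleSize`
parameter are hypothetical names, with `MaxCycleSize` modelled on the
`compress-recursion` switch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper. For each cycle length Len = 1, 2, ..., drop the second
// copy of any two adjacent identical runs of Len frames.
// MaxCycleSize < 0 means "no size limit", mirroring the default of -1.
std::vector<std::string> compressRecursion(std::vector<std::string> Frames,
                                           int MaxCycleSize = -1) {
  for (std::size_t Len = 1; 2 * Len <= Frames.size(); ++Len) {
    if (MaxCycleSize >= 0 && Len > static_cast<std::size_t>(MaxCycleSize))
      break;
    const auto Step = static_cast<std::ptrdiff_t>(Len);
    std::vector<std::string> Out;
    for (std::string &Frame : Frames) {
      Out.push_back(std::move(Frame));
      // If the last 2 * Len frames read "X X", drop the second copy of X.
      if (Out.size() >= 2 * Len &&
          std::equal(Out.end() - Step, Out.end(), Out.end() - 2 * Step))
        Out.resize(Out.size() - Len);
    }
    Frames = std::move(Out);
  }
  return Frames;
}
```

Processing one cycle length per pass keeps each round simple and makes the
result independent of where in the stack a cycle starts.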

Differential Revision: https://reviews.llvm.org/D93556

Added: 
llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfbin
llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfscript

llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfbin

llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfscript
llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
llvm/unittests/tools/llvm-profgen/CMakeLists.txt
llvm/unittests/tools/llvm-profgen/ContextCompressionTest.cpp

Modified: 
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfileGenerator.h
llvm/tools/llvm-profgen/ProfiledBinary.cpp
llvm/tools/llvm-profgen/ProfiledBinary.h
llvm/tools/llvm-profgen/PseudoProbe.cpp
llvm/tools/llvm-profgen/PseudoProbe.h
llvm/unittests/tools/CMakeLists.txt

Removed: 




diff  --git 
a/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfbin 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfbin
new file mode 100755
index ..e4e698e91099
Binary files /dev/null and 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfbin 
differ

diff  --git 
a/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfscript 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfscript
new file mode 100644
index ..3ec8f44cfef0
--- /dev/null
+++ 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-noprobe.perfscript
@@ -0,0 +1,4 @@
+PERF_RECORD_MMAP2 3019402/3019402: [0x40(0x1000) @ 0 00:1d 265650677 
1451231]: r-xp recursion-compression-noprobe.perfbin
+
+ 4007e1
+ 0x4007d6/0x4007e1/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0  
0x4007c7/0x4007c0/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0 
 0x4007c7/0x4007c0/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0  
0x4007c7/0x4007c0/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0  0x4007c7/0x4007c0/P/-/-/0 
 0x4007c7/0x4007c0/P/-/-/0  0x400795/0x4007b0/P/-/-/0  
0x40079c/0x400790/P/-/-/0  0x400801/0x400770/P/-/-/0  0x400698/0x400801/P/-/-/0 
 0x400673/0x400696/P/-/-/0

diff  --git 
a/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfbin 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfbin
new file mode 100755
index ..a3dbda2f0b3e
Binary files /dev/null and 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfbin 
differ

diff  --git 
a/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfscript
 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfscript
new file mode 100644
index ..91a69e2c9dd0
--- /dev/null
+++ 
b/llvm/test/tools/llvm-profgen/Inputs/recursion-compression-pseudoprobe.perfscript
@@ -0,0 +1,23 @@
+PERF_RECORD_MMAP2 3367317/3367317: [0x201000(0x1000) @ 0 00:1d 238458915 
1121070]: r-xp recursion-compression-pseudoprobe.perfbin
+
+ 2017db
+ 2017ba
+ 2017e5
+ 2017ba
+ 2017e5
+ 2017d9
+ 2017ba
+ 2017b0
+ 2017b0
+ 2017b0
+ 20

[llvm-branch-commits] [llvm] e562ff0 - [CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:11-08:00
New Revision: e562ff08f634d814c1cd1e65e3428ca5308d3022

URL: 
https://github.com/llvm/llvm-project/commit/e562ff08f634d814c1cd1e65e3428ca5308d3022
DIFF: 
https://github.com/llvm/llvm-project/commit/e562ff08f634d814c1cd1e65e3428ca5308d3022.diff

LOG: [CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up 
profile generation

For CS profile generation, call stack unwinding is time-consuming because each
LBR entry needs linear time to generate the context (hashing, compression,
string concatenation). This change speeds this up by grouping all the call
frames within one LBR sample into a trie and aggregating the result (the sample
counter) on it, deferring the context compression and string generation to the
end of unwinding.

Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates
it dynamically (popping or pushing a trie node) during virtual unwinding, so
that the raw sample can simply be recorded on the leaf node; the path from root
to leaf represents its calling context. In the end, it traverses the trie and
generates the context on the fly.
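
The idea can be sketched with a small frame trie. This is a minimal sketch
under stated assumptions, not the data structure from the patch: `FrameNode`,
`getOrCreateChild`, and `collectContexts` are hypothetical names, and the
context string is simply the decimal frame addresses joined with " @ ".

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical trie node: one node per call frame, samples counted in place.
struct FrameNode {
  std::uint64_t Address = 0;
  FrameNode *Parent = nullptr;
  std::uint64_t Count = 0;
  std::map<std::uint64_t, std::unique_ptr<FrameNode>> Children;

  // Push a frame: reuse the child for Addr if it exists, else create it.
  FrameNode *getOrCreateChild(std::uint64_t Addr) {
    std::unique_ptr<FrameNode> &Slot = Children[Addr];
    if (!Slot) {
      Slot = std::make_unique<FrameNode>();
      Slot->Address = Addr;
      Slot->Parent = this;
    }
    return Slot.get();
  }
};

// Walk the trie once at the end and build a context string only for nodes
// that actually received samples.
void collectContexts(const FrameNode &Node, const std::string &Prefix,
                     std::map<std::string, std::uint64_t> &Out) {
  for (const auto &Entry : Node.Children) {
    const FrameNode &Child = *Entry.second;
    std::string Context =
        Prefix.empty() ? std::to_string(Child.Address)
                       : Prefix + " @ " + std::to_string(Child.Address);
    if (Child.Count > 0)
      Out[Context] += Child.Count;
    collectContexts(Child, Context, Out);
  }
}
```

Recording a sample is then a counter increment on the current leaf, and popping
a frame is a move to `Parent`; string work happens once per distinct context
rather than once per LBR entry.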

Results:
Our internal branch shows about a 5X speed-up on some large workloads in the
SPEC06 benchmark.

Differential Revision: https://reviews.llvm.org/D94110

Added: 


Modified: 
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/PerfReader.h
llvm/tools/llvm-profgen/ProfiledBinary.cpp
llvm/tools/llvm-profgen/ProfiledBinary.h

Removed: 




diff  --git a/llvm/tools/llvm-profgen/PerfReader.cpp 
b/llvm/tools/llvm-profgen/PerfReader.cpp
index d05c665f8583..787bde28400f 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -28,11 +28,12 @@ void VirtualUnwinder::unwindCall(UnwindState &State) {
   // 2nd frame is in prolog/epilog. In the future, we will switch to
   // pro/epi tracker(Dwarf CFI) for the precise check.
   uint64_t Source = State.getCurrentLBRSource();
-  auto Iter = State.CallStack.begin();
-  if (State.CallStack.size() == 1 || *(++Iter) != Source) {
-State.CallStack.front() = Source;
+  auto *ParentFrame = State.getParentFrame();
+  if (ParentFrame == State.getDummyRootPtr() ||
+  ParentFrame->Address != Source) {
+State.switchToFrame(Source);
   } else {
-State.CallStack.pop_front();
+State.popFrame();
   }
   State.InstPtr.update(Source);
 }
@@ -41,26 +42,29 @@ void VirtualUnwinder::unwindLinear(UnwindState &State, 
uint64_t Repeat) {
   InstructionPointer &IP = State.InstPtr;
   uint64_t Target = State.getCurrentLBRTarget();
   uint64_t End = IP.Address;
-  if (State.getBinary()->usePseudoProbes()) {
+  if (Binary->usePseudoProbes()) {
+// We don't need to top frame probe since it should be extracted
+// from the range.
 // The outcome of the virtual unwinding with pseudo probes is a
 // map from a context key to the address range being unwound.
 // This means basically linear unwinding is not needed for pseudo
 // probes. The range will be simply recorded here and will be
 // converted to a list of pseudo probes to report in ProfileGenerator.
-recordRangeCount(Target, End, State, Repeat);
+State.getParentFrame()->recordRangeCount(Target, End, Repeat);
   } else {
 // Unwind linear execution part
+uint64_t LeafAddr = State.CurrentLeafFrame->Address;
 while (IP.Address >= Target) {
   uint64_t PrevIP = IP.Address;
   IP.backward();
   // Break into segments for implicit call/return due to inlining
-  bool SameInlinee =
-  State.getBinary()->inlineContextEqual(PrevIP, IP.Address);
+  bool SameInlinee = Binary->inlineContextEqual(PrevIP, IP.Address);
   if (!SameInlinee || PrevIP == Target) {
-recordRangeCount(PrevIP, End, State, Repeat);
+State.switchToFrame(LeafAddr);
+State.CurrentLeafFrame->recordRangeCount(PrevIP, End, Repeat);
 End = IP.Address;
   }
-  State.CallStack.front() = IP.Address;
+  LeafAddr = IP.Address;
 }
   }
 }
@@ -68,9 +72,9 @@ void VirtualUnwinder::unwindLinear(UnwindState &State, 
uint64_t Repeat) {
 void VirtualUnwinder::unwindReturn(UnwindState &State) {
   // Add extra frame as we unwind through the return
   const LBREntry &LBR = State.getCurrentLBR();
-  uint64_t CallAddr = State.getBinary()->getCallAddrFromFrameAddr(LBR.Target);
-  State.CallStack.front() = CallAddr;
-  State.CallStack.push_front(LBR.Source);
+  uint64_t CallAddr = Binary->getCallAddrFromFrameAddr(LBR.Target);
+  State.switchToFrame(CallAddr);
+  State.pushFrame(LBR.Source);
   State.InstPtr.update(LBR.Source);
 }
 
@@ -78,79 +82,100 @@ void VirtualUnwinder::unwindBranchWithinFrame(UnwindState 
&State) {
   // TODO: Tolerate tail call for now, as we may see tail call from libraries.
   // This is only for intra function branches, excluding tail calls.
   uint64_t So

[llvm-branch-commits] [llvm] 87c2702 - [CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:11-08:00
New Revision: 87c27020cc6466ae33550f1f1f55d5989afaca2e

URL: 
https://github.com/llvm/llvm-project/commit/87c27020cc6466ae33550f1f1f55d5989afaca2e
DIFF: 
https://github.com/llvm/llvm-project/commit/87c27020cc6466ae33550f1f1f55d5989afaca2e.diff

LOG: [CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce 
profile size

This change allows merging and trimming cold context profiles in llvm-profgen
to solve the profile size bloat problem. Currently, when a profile's total
sample count is below a threshold (set by a switch), it is considered cold and
merged into a base context-less profile, which keeps the profile quality at
least as good as the baseline (non-CS).

For example, two input profiles:
 [main @ foo @ bar]:60
 [main @ bar]:50
With threshold = 100, the two profiles are merged into one with the base
context, giving the result:
 [bar]:110
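
The merge rule in this example can be sketched as follows. It is a simplified
model, not the llvm-profgen implementation: a profile is reduced to a context
string plus a total sample count, and `leafFunction` and `mergeColdContexts`
are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: "main @ foo @ bar" -> "bar" (the base context).
std::string leafFunction(const std::string &Context) {
  assert(!Context.empty() && "a context names at least one frame");
  const std::string Sep = " @ ";
  std::string::size_type Pos = Context.rfind(Sep);
  return Pos == std::string::npos ? Context
                                  : Context.substr(Pos + Sep.size());
}

// Contexts whose total samples fall below ColdThreshold are folded into the
// context-less profile of their leaf function; hot contexts are kept as-is.
std::map<std::string, uint64_t>
mergeColdContexts(const std::map<std::string, uint64_t> &Profiles,
                  uint64_t ColdThreshold) {
  std::map<std::string, uint64_t> Result;
  for (const auto &P : Profiles) {
    if (P.second < ColdThreshold)
      Result[leafFunction(P.first)] += P.second;
    else
      Result[P.first] += P.second;
  }
  return Result;
}
```

Feeding the two profiles above into `mergeColdContexts` with a threshold of 100
leaves a single entry, `bar` with 110 samples. With a threshold of 0 nothing
counts as cold and both contexts survive.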

Added two switches:
`--csprof-cold-thres=`: Specifies the total samples threshold for a
context profile to be considered cold, with 100 being the default. Any cold
context profiles will be merged into the context-less base profile by default.
`--csprof-keep-cold`: Forces profile generation to keep cold context profiles
instead of dropping them. By default, cold contexts are not written to the
output profile.

Results:
Although this has not yet been evaluated with the latest CSSPGO, our internal
branch shows neutral performance and a significantly reduced profile size. A
detailed evaluation of llvm-profgen with CSSPGO will come later.

Differential Revision: https://reviews.llvm.org/D94111

Added: 
llvm/test/tools/llvm-profgen/merge-cold-profile.test

Modified: 
llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfileGenerator.h

Removed: 




diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test 
b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
index 98767a9b29b7..943832ebef10 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
@@ -1,4 +1,4 @@
-; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-noprobe.perfscript 
--binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --show-unwinder-output 
| FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-noprobe.perfscript 
--binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --show-unwinder-output 
--csprof-cold-thres=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:[main:1 @ foo]:44:0

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test 
b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
index 19928322a66d..c7aa1dea21bb 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
@@ -1,4 +1,4 @@
-; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript 
--binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t 
--show-unwinder-output | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript 
--binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t 
--show-unwinder-output --csprof-cold-thres=0 | FileCheck %s 
--check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK: [main:2 @ foo]:74:0

diff  --git a/llvm/test/tools/llvm-profgen/merge-cold-profile.test 
b/llvm/test/tools/llvm-profgen/merge-cold-profile.test
new file mode 100644
index ..e0c65ac44e2b
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/merge-cold-profile.test
@@ -0,0 +1,70 @@
+; Used the data from recursion-compression.test, refer it for the unmerged 
output
+; RUN: llvm-profgen 
--perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript 
--binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t 
--compress-recursion=-1 --csprof-cold-thres=8
+; RUN: FileCheck %s --input-file %t
+
+; Test --csprof-keep-cold
+; RUN: llvm-profgen 
--perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript 
--binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t 
--compress-recursion=-1 --csprof-cold-thres=100 --csprof-keep-cold
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-KEEP-COLD
+
+; CHECK: [fa]:14:4
+; CHECK-NEXT: 1: 4
+; CHECK-NEXT: 3: 4
+; CHECK-NEXT: 4: 2
+; CHECK-NEXT: 5: 1
+; CHECK-NEXT: 7: 2 fb:2
+; CHECK-NEXT: 8: 1 fa:1
+; CHECK-NEXT: !CFGChecksum: 120515930909
+; C

[llvm-branch-commits] [llvm] db88d92 - [CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:12-08:00
New Revision: db88d92217f185d9ab5b8f0a0eddc5dc9ad30659

URL: 
https://github.com/llvm/llvm-project/commit/db88d92217f185d9ab5b8f0a0eddc5dc9ad30659
DIFF: 
https://github.com/llvm/llvm-project/commit/db88d92217f185d9ab5b8f0a0eddc5dc9ad30659.diff

LOG: [CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line

When we skip a call stack starting with an external address, we should also
skip the bottom LBR entry; otherwise it causes a truncated context issue.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D95480

Added: 


Modified: 
llvm/test/tools/llvm-profgen/Inputs/inline-cs-noprobe.perfscript
llvm/tools/llvm-profgen/PerfReader.cpp

Removed: 




diff  --git a/llvm/test/tools/llvm-profgen/Inputs/inline-cs-noprobe.perfscript 
b/llvm/test/tools/llvm-profgen/Inputs/inline-cs-noprobe.perfscript
index 7ef76dcd3884..116bd0a2c4c1 100644
--- a/llvm/test/tools/llvm-profgen/Inputs/inline-cs-noprobe.perfscript
+++ b/llvm/test/tools/llvm-profgen/Inputs/inline-cs-noprobe.perfscript
@@ -1,5 +1,11 @@
 PERF_RECORD_MMAP2 2854748/2854748: [0x40(0x1000) @ 0 00:1d 123291722 
526021]: r-xp /home/inline-cs-noprobe.perfbin
 
+; test for an external or invalid top address, should skip the whole sample
+
+   
+ 40067e
+   5541f689495641d7
+ 0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0  
0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0 
 0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0  
0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0 
 0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0  
0x40069b/0x400670/M/-/-/0  0x4006c8/0x40067e/P/-/-/0  0x4006c8/0x40067e/P/-/-/0 
 0x4006c8/0x40067e/P/-/-/0
 
  40067e
5541f689495641d7

diff  --git a/llvm/tools/llvm-profgen/PerfReader.cpp 
b/llvm/tools/llvm-profgen/PerfReader.cpp
index 787bde28400f..e59d8d93381b 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -437,11 +437,12 @@ bool PerfReader::extractCallstack(TraceStream &TraceIt,
   ProfiledBinary *Binary = nullptr;
   while (!TraceIt.isAtEoF() && !TraceIt.getCurrentLine().startswith(" 0x")) {
 StringRef FrameStr = TraceIt.getCurrentLine().ltrim();
-// We might get an empty line at the beginning or comments, skip it
 uint64_t FrameAddr = 0;
 if (FrameStr.getAsInteger(16, FrameAddr)) {
+  // We might parse a non-perf sample line like empty line and comments,
+  // skip it
   TraceIt.advance();
-  break;
+  return false;
 }
 TraceIt.advance();
 if (!Binary) {
@@ -468,9 +469,9 @@ bool PerfReader::extractCallstack(TraceStream &TraceIt,
 CallStack.emplace_back(FrameAddr);
   }
 
-  if (CallStack.empty())
-return false;
   // Skip other unrelated line, find the next valid LBR line
+  // Note that even for empty call stack, we should skip the address at the
+  // bottom, otherwise the following pass may generate a truncated callstack
   while (!TraceIt.isAtEoF() && !TraceIt.getCurrentLine().startswith(" 0x")) {
 TraceIt.advance();
   }
@@ -482,7 +483,8 @@ bool PerfReader::extractCallstack(TraceStream &TraceIt,
   // of such case - when sample landed in prolog/epilog, somehow stack
   // walking will be broken in an unexpected way that higher frames will be
   // missing.
-  return !Binary->addressInPrologEpilog(CallStack.front());
+  return !CallStack.empty() &&
+ !Binary->addressInPrologEpilog(CallStack.front());
 }
 
 void PerfReader::parseHybridSample(TraceStream &TraceIt) {



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 1071279 - [CSSPGO] Use merged base profile for hot threshold calculation

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Wenlei He
Date: 2021-02-19T21:21:12-08:00
New Revision: 10712791a9affbea8e6fa474d8a857ea6dfbb955

URL: 
https://github.com/llvm/llvm-project/commit/10712791a9affbea8e6fa474d8a857ea6dfbb955
DIFF: 
https://github.com/llvm/llvm-project/commit/10712791a9affbea8e6fa474d8a857ea6dfbb955.diff

LOG: [CSSPGO] Use merged base profile for hot threshold calculation

A context-sensitive profile effectively splits a function profile into many
copies, each representing the CFG profile of a particular calling context. That
makes the count distribution look flatter, as we now have more function
profiles each with lower counts, which in turn leads to lower hot thresholds.
Now we tell threshold computation to merge context profiles first before
calculating percentile-based cutoffs, to compensate for the seemingly flat
context profile. This can be controlled by the switch
`sample-profile-contextless-threshold`.
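
The effect on the hot threshold can be illustrated with a small sketch. It
simplifies heavily and is not the ProfileSummaryBuilder logic: each profile is
reduced to a single count, and `hotCutoff` and `mergeByFunction` are
hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Smallest count among the hottest entries that together cover Percent of
// the total count. Counts at or above the result are treated as hot.
uint64_t hotCutoff(std::vector<uint64_t> Counts, unsigned Percent) {
  assert(Percent <= 100 && "Percent is a percentage");
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  uint64_t Total = std::accumulate(Counts.begin(), Counts.end(), uint64_t{0});
  uint64_t Covered = 0;
  for (uint64_t C : Counts) {
    Covered += C;
    if (Covered * 100 >= Total * Percent)
      return C;
  }
  return 0;
}

// Merge per-context counts into one count per function name.
std::vector<uint64_t> mergeByFunction(
    const std::vector<std::pair<std::string, uint64_t>> &ContextProfiles) {
  std::map<std::string, uint64_t> Merged;
  for (const auto &P : ContextProfiles)
    Merged[P.first] += P.second;
  std::vector<uint64_t> Counts;
  for (const auto &M : Merged)
    Counts.push_back(M.second);
  return Counts;
}
```

With three contexts of one function at 30 samples each plus two other functions
at 40 and 10, the 50% cutoff over the raw context counts is 30, while the
cutoff over the merged counts is 90. That is the flattening the change
compensates for.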

An earlier measurement showed a ~0.4% perf boost with this tuning on spec2k6
for CSSPGO (with pseudo-probe and new inliner).

Differential Revision: https://reviews.llvm.org/D95980

Added: 
llvm/test/Transforms/SampleProfile/csspgo-summary.ll

Modified: 
llvm/include/llvm/ProfileData/ProfileCommon.h
llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
llvm/lib/ProfileData/SampleProfReader.cpp
llvm/lib/ProfileData/SampleProfWriter.cpp
llvm/test/Transforms/SampleProfile/csspgo-inline.ll

Removed: 




diff  --git a/llvm/include/llvm/ProfileData/ProfileCommon.h 
b/llvm/include/llvm/ProfileData/ProfileCommon.h
index 6bb5825339ae..55b94b2e690d 100644
--- a/llvm/include/llvm/ProfileData/ProfileCommon.h
+++ b/llvm/include/llvm/ProfileData/ProfileCommon.h
@@ -17,6 +17,7 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/IR/ProfileSummary.h"
 #include "llvm/ProfileData/InstrProf.h"
+#include "llvm/ProfileData/SampleProf.h"
 #include "llvm/Support/Error.h"
 #include 
 #include 
@@ -89,6 +90,8 @@ class SampleProfileSummaryBuilder final : public 
ProfileSummaryBuilder {
 
   void addRecord(const sampleprof::FunctionSamples &FS,
  bool isCallsiteSample = false);
+  std::unique_ptr computeSummaryForProfiles(
+  const StringMap &Profiles);
   std::unique_ptr getSummary();
 };
 

diff  --git a/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp 
b/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
index d2603097c550..0e03aa50173d 100644
--- a/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
+++ b/llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
@@ -18,9 +18,14 @@
 #include "llvm/ProfileData/ProfileCommon.h"
 #include "llvm/ProfileData/SampleProf.h"
 #include "llvm/Support/Casting.h"
+#include "llvm/Support/CommandLine.h"
 
 using namespace llvm;
 
+cl::opt UseContextLessSummary(
+"profile-summary-contextless", cl::Hidden, cl::init(false), cl::ZeroOrMore,
+cl::desc("Merge context profiles before calculating thresholds."));
+
 // A set of cutoff values. Each value, when divided by ProfileSummary::Scale
 // (which is 100) is a desired percentile of total counts.
 static const uint32_t DefaultCutoffsData[] = {
@@ -111,6 +116,35 @@ std::unique_ptr 
SampleProfileSummaryBuilder::getSummary() {
   MaxFunctionCount, NumCounts, NumFunctions);
 }
 
+std::unique_ptr
+SampleProfileSummaryBuilder::computeSummaryForProfiles(
+const StringMap &Profiles) {
+  assert(NumFunctions == 0 &&
+ "This can only be called on an empty summary builder");
+  StringMap ContextLessProfiles;
+  const StringMap *ProfilesToUse = &Profiles;
+  // For CSSPGO, context-sensitive profile effectively split a function profile
+  // into many copies each representing the CFG profile of a particular calling
+  // context. That makes the count distribution looks more flat as we now have
+  // more function profiles each with lower counts, which in turn leads to 
lower
+  // hot thresholds. To compensate for that, by defauly we merge context
+  // profiles before coumputing profile summary.
+  if (UseContextLessSummary || (sampleprof::FunctionSamples::ProfileIsCS &&
+!UseContextLessSummary.getNumOccurrences())) {
+for (const auto &I : Profiles) {
+  ContextLessProfiles[I.second.getName()].merge(I.second);
+}
+ProfilesToUse = &ContextLessProfiles;
+  }
+
+  for (const auto &I : *ProfilesToUse) {
+const sampleprof::FunctionSamples &Profile = I.second;
+addRecord(Profile);
+  }
+
+  return getSummary();
+}
+
 std::unique_ptr InstrProfSummaryBuilder::getSummary() {
   computeDetailedSummary();
   return std::make_unique(

diff  --git a/llvm/lib/ProfileData/SampleProfReader.cpp 
b/llvm/lib/ProfileData/SampleProfReader.cpp
index 370ffc8e2885..38cbca844c87 100644
--- a/llvm/lib/ProfileData/SampleProfReader.cpp
+++ b/llvm/lib/ProfileData/SampleProfReader.cpp
@@ -1610,9 +1610,5 @@ SampleProfileReader::create(std::unique_ptr 
&B, LLVMContext &C,
 // profile. Binary format has the profile summary in 

[llvm-branch-commits] [llvm] e8e45f5 - [CSSPGO] Unblock optimizations with pseudo probe instrumentation.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Hongtao Yu
Date: 2021-02-19T21:21:12-08:00
New Revision: e8e45f52d0a8268fe3ee2a3a2afc80bc10a47280

URL: 
https://github.com/llvm/llvm-project/commit/e8e45f52d0a8268fe3ee2a3a2afc80bc10a47280
DIFF: 
https://github.com/llvm/llvm-project/commit/e8e45f52d0a8268fe3ee2a3a2afc80bc10a47280.diff

LOG: [CSSPGO] Unblock optimizations with pseudo probe instrumentation.

The IR/MIR pseudo probe intrinsics don't get materialized into real machine
instructions and therefore they don't incur runtime cost directly. However,
they come with an indirect cost by blocking certain optimizations. Some of the
blocking is intentional (such as blocking code merge) for better count quality,
while the rest is accidental. This change unblocks perf-critical optimizations
that do not affect count quality. They include:

1. IR InstCombine, sinking load operation to shorten lifetimes.
2. MIR LiveRangeShrink, similar to #1
3. MIR TwoAddressInstructionPass, i.e., opeq transform
4. MIR function argument copy elision
5. IR stack protection (not perf-critical, but nice to have).
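
The common pattern behind these changes can be sketched with a toy model. It
is an illustration only: `ToyInstr`, `makePseudoProbe` and `canSinkLoad` are
hypothetical, and the sketch assumes a probe is reported as having unmodeled
side effects, which is why it acts as a barrier unless a pass looks through it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction; not the LLVM MachineInstr class.
struct ToyInstr {
  bool MayStore = false;
  bool IsCall = false;
  bool HasUnmodeledSideEffects = false;
  bool IsPseudoProbe = false;
};

// Assumption of this sketch: probes report unmodeled side effects.
ToyInstr makePseudoProbe() {
  ToyInstr I;
  I.HasUnmodeledSideEffects = true;
  I.IsPseudoProbe = true;
  return I;
}

// With ProbeAware set, a probe's side effects no longer make it a barrier.
bool isLoadFoldBarrier(const ToyInstr &MI, bool ProbeAware) {
  bool SideEffects =
      MI.HasUnmodeledSideEffects && !(ProbeAware && MI.IsPseudoProbe);
  return MI.MayStore || MI.IsCall || SideEffects;
}

// True if the load at index From may be moved down to just before index To.
bool canSinkLoad(const std::vector<ToyInstr> &Block, std::size_t From,
                 std::size_t To, bool ProbeAware) {
  assert(From < To && To <= Block.size());
  for (std::size_t I = From + 1; I < To; ++I)
    if (isLoadFoldBarrier(Block[I], ProbeAware))
      return false;
  return true;
}
```

In this model a load followed by a probe cannot be sunk past it when
`ProbeAware` is false, but can when it is true. Real stores and calls remain
barriers either way.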

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95982

Added: 
llvm/test/Transforms/SampleProfile/pseudo-probe-instcombine.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll

Modified: 
llvm/include/llvm/CodeGen/MachineInstr.h
llvm/include/llvm/IR/Instruction.h
llvm/lib/CodeGen/LiveRangeShrink.cpp
llvm/lib/CodeGen/MachineInstr.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/lib/CodeGen/StackProtector.cpp
llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
llvm/lib/IR/Instruction.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/MachineInstr.h 
b/llvm/include/llvm/CodeGen/MachineInstr.h
index 6bbe2d03f9e5..f8d97c2c07a6 100644
--- a/llvm/include/llvm/CodeGen/MachineInstr.h
+++ b/llvm/include/llvm/CodeGen/MachineInstr.h
@@ -1156,6 +1156,10 @@ class MachineInstr
 return getOpcode() == TargetOpcode::CFI_INSTRUCTION;
   }
 
+  bool isPseudoProbe() const {
+return getOpcode() == TargetOpcode::PSEUDO_PROBE;
+  }
+  
   // True if the instruction represents a position in the function.
   bool isPosition() const { return isLabel() || isCFIInstruction(); }
 
@@ -1165,6 +1169,9 @@ class MachineInstr
   bool isDebugInstr() const {
 return isDebugValue() || isDebugLabel() || isDebugRef();
   }
+  bool isDebugOrPseudoInstr() const {
+return isDebugInstr() || isPseudoProbe();
+  }
 
   bool isDebugOffsetImm() const { return getDebugOffset().isImm(); }
 

diff  --git a/llvm/include/llvm/IR/Instruction.h 
b/llvm/include/llvm/IR/Instruction.h
index 85afaed5225e..b99dc62bbb9d 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -654,6 +654,9 @@ class Instruction : public User,
   /// llvm.lifetime.end marker.
   bool isLifetimeStartOrEnd() const;
 
+  /// Return true if the instruction is a DbgInfoIntrinsic or PseudoProbeInst.
+  bool isDebugOrPseudoInst() const;
+
   /// Return a pointer to the next non-debug instruction in the same basic
   /// block as 'this', or nullptr if no such instruction exists. Skip any 
pseudo
   /// operations if \c SkipPseudoOp is true.

diff  --git a/llvm/lib/CodeGen/LiveRangeShrink.cpp 
b/llvm/lib/CodeGen/LiveRangeShrink.cpp
index 26439a656917..7fa14fd902ef 100644
--- a/llvm/lib/CodeGen/LiveRangeShrink.cpp
+++ b/llvm/lib/CodeGen/LiveRangeShrink.cpp
@@ -156,7 +156,8 @@ bool LiveRangeShrink::runOnMachineFunction(MachineFunction 
&MF) {
 // If MI has side effects, it should become a barrier for code motion.
 // IOM is rebuild from the next instruction to prevent later
 // instructions from being moved before this MI.
-if (MI.hasUnmodeledSideEffects() && Next != MBB.end()) {
+if (MI.hasUnmodeledSideEffects() && !MI.isPseudoProbe() &&
+Next != MBB.end()) {
   BuildInstOrderMap(Next, IOM);
   SawStore = false;
 }

diff  --git a/llvm/lib/CodeGen/MachineInstr.cpp 
b/llvm/lib/CodeGen/MachineInstr.cpp
index 59d98054e3a2..b6cfd7dcbfbc 100644
--- a/llvm/lib/CodeGen/MachineInstr.cpp
+++ b/llvm/lib/CodeGen/MachineInstr.cpp
@@ -1462,7 +1462,8 @@ bool MachineInstr::hasUnmodeledSideEffects() const {
 }
 
 bool MachineInstr::isLoadFoldBarrier() const {
-  return mayStore() || isCall() || hasUnmodeledSideEffects();
+  return mayStore() || isCall() ||
+ (hasUnmodeledSideEffects() && !isPseudoProbe());
 }
 
 /// allDefsAreDead - Return true if all the defs of this instruction are dead.

diff  --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp 
b/llvm/lib/Code

[llvm-branch-commits] [llvm] 1a5bb1e - [CSSPGO] Restrict pseudo probe tests to x86_64 only.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Hongtao Yu
Date: 2021-02-19T21:21:12-08:00
New Revision: 1a5bb1e4d540303554c0e891389f699956e5e03b

URL: 
https://github.com/llvm/llvm-project/commit/1a5bb1e4d540303554c0e891389f699956e5e03b
DIFF: 
https://github.com/llvm/llvm-project/commit/1a5bb1e4d540303554c0e891389f699956e5e03b.diff

LOG: [CSSPGO] Restrict pseudo probe tests to x86_64 only.

Added: 


Modified: 
llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll

Removed: 




diff  --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll 
b/llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
index 609af90db610..9d89cad43aa7 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
@@ -1,5 +1,5 @@
-; PR1075
-; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin 
-pseudo-probe-for-profiling -O3 | FileCheck %s
+; REQUIRES: x86_64-linux
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-- -pseudo-probe-for-profiling 
-O3 | FileCheck %s
 
 define float @foo(float %x) #0 {
   %tmp1 = fmul float %x, 3.00e+00

diff  --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll 
b/llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll
index a1fb25c95936..d94dac4de95d 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-peep.ll
@@ -1,3 +1,4 @@
+; REQUIRES: x86_64-linux
 ; RUN: llc -mtriple=x86_64-- -stop-after=peephole-opt -o - %s | FileCheck %s
 
 define internal i32 @arc_compare() {

diff  --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll 
b/llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll
index 81f72d3c5871..31b471ea08fd 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-twoaddr.ll
@@ -1,3 +1,4 @@
+; REQUIRES: x86_64-linux
 ; RUN: llc -stop-after=twoaddressinstruction -mtriple=x86_64-- -o - %s | 
FileCheck %s
 
 





[llvm-branch-commits] [llvm] 1f5e201 - [CSSPGO] Process functions in a top-down order on a dynamic call graph.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Hongtao Yu
Date: 2021-02-19T21:21:12-08:00
New Revision: 1f5e2016be9a01e4294dcdd10b3c7b03826b26a1

URL: 
https://github.com/llvm/llvm-project/commit/1f5e2016be9a01e4294dcdd10b3c7b03826b26a1
DIFF: 
https://github.com/llvm/llvm-project/commit/1f5e2016be9a01e4294dcdd10b3c7b03826b26a1.diff

LOG: [CSSPGO] Process functions in a top-down order on a dynamic call graph.

Functions are currently processed by the sample profiler loader in a top-down 
order defined by the static call graph. The order is being adjusted to be a 
top-down order based on the input context-sensitive profile. One benefit is 
that the processing order of caller and callee in one SCC would follow the 
context order in the profile to favor more inlining. Another benefit is that 
the processing order of caller and callee through an indirect call (which is 
not on the static call graph) can be honored which in turn allows for more 
inlining.
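
How profile edges change the processing order can be sketched as follows. This
is a simplified model rather than the SampleProfile.cpp logic: the call graph
is a plain adjacency map, the function names are hypothetical, and the order is
a reverse post-order DFS, which puts callers before callees on an acyclic
graph.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

using Graph = std::map<std::string, std::set<std::string>>;

// Add a caller -> callee edge and make sure both functions are graph nodes.
void addEdge(Graph &G, const std::string &Caller, const std::string &Callee) {
  G[Caller].insert(Callee);
  G.emplace(Callee, std::set<std::string>{});
}

// Reverse post-order of a DFS over the graph: callers come before callees.
std::vector<std::string> topDownOrder(const Graph &G) {
  std::set<std::string> Visited;
  std::vector<std::string> PostOrder;
  std::function<void(const std::string &)> Visit =
      [&](const std::string &F) {
        if (!Visited.insert(F).second)
          return;
        auto It = G.find(F);
        if (It != G.end())
          for (const std::string &Callee : It->second)
            Visit(Callee);
        PostOrder.push_back(F);
      };
  for (const auto &Entry : G)
    Visit(Entry.first);
  // Holds as long as every callee is also a key, which addEdge guarantees.
  assert(PostOrder.size() == G.size());
  std::reverse(PostOrder.begin(), PostOrder.end());
  return PostOrder;
}
```

Suppose `main` calls `funcA` directly and `funcA` reaches `leaf` only through
an indirect call. On the static graph nothing orders `funcA` relative to
`leaf`, and this sketch happens to visit `leaf` first. Adding the
`funcA -> leaf` edge observed in the profile forces `funcA` ahead of `leaf`, so
the caller is processed while its callee is still available for inlining.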

The profile top-down order for SCC is also extended to support non-CS profiles.

Two switches `-mllvm -use-profile-indirect-call-edges` and `-mllvm 
-use-profile-top-down-order` are being introduced.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95988

Added: 
llvm/test/Transforms/SampleProfile/Inputs/profile-context-order.prof
llvm/test/Transforms/SampleProfile/Inputs/profile-topdown-order.prof
llvm/test/Transforms/SampleProfile/profile-context-order.ll
llvm/test/Transforms/SampleProfile/profile-topdown-order.ll

Modified: 
llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
llvm/lib/Transforms/IPO/SampleContextTracker.cpp
llvm/lib/Transforms/IPO/SampleProfile.cpp

Removed: 




diff  --git a/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h 
b/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
index 526e141838c4..da0bdae0eaee 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
@@ -18,6 +18,7 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
+#include "llvm/Analysis/CallGraph.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/ProfileData/SampleProf.h"
@@ -90,6 +91,8 @@ class ContextTrieNode {
 // calling context and the context is identified by path from root to the node.
 class SampleContextTracker {
 public:
+  using ContextSamplesTy = SmallSet;
+
   SampleContextTracker(StringMap &Profiles);
   // Query context profile for a specific callee with given name at a given
   // call-site. The full context is identified by location of call instruction.
@@ -103,6 +106,9 @@ class SampleContextTracker {
   FunctionSamples *getContextSamplesFor(const DILocation *DIL);
   // Query context profile for a given sample contxt of a function.
   FunctionSamples *getContextSamplesFor(const SampleContext &Context);
+  // Get all context profile for given function.
+  ContextSamplesTy &getAllContextSamplesFor(const Function &Func);
+  ContextSamplesTy &getAllContextSamplesFor(StringRef Name);
   // Query base profile for a given function. A base profile is a merged view
   // of all context profiles for contexts that are not inlined.
   FunctionSamples *getBaseSamplesFor(const Function &Func,
@@ -113,6 +119,9 @@ class SampleContextTracker {
   // This makes sure that inlined context profile will be excluded in
   // function's base profile.
   void markContextSamplesInlined(const FunctionSamples *InlinedSamples);
+  void promoteMergeContextSamplesTree(const Instruction &Inst,
+  StringRef CalleeName);
+  void addCallGraphEdges(CallGraph &CG, StringMap &SymbolMap);
   // Dump the internal context profile trie.
   void dump();
 
@@ -126,8 +135,6 @@ class SampleContextTracker {
   ContextTrieNode *getTopLevelContextNode(StringRef FName);
   ContextTrieNode &addTopLevelContextNode(StringRef FName);
   ContextTrieNode &promoteMergeContextSamplesTree(ContextTrieNode 
&NodeToPromo);
-  void promoteMergeContextSamplesTree(const Instruction &Inst,
-  StringRef CalleeName);
   void mergeContextNode(ContextTrieNode &FromNode, ContextTrieNode &ToNode,
 StringRef ContextStrToRemove);
   ContextTrieNode &promoteMergeContextSamplesTree(ContextTrieNode &FromNode,
@@ -135,7 +142,7 @@ class SampleContextTracker {
   StringRef 
ContextStrToRemove);
 
   // Map from function name to context profiles (excluding base profile)
-  StringMap> FuncToCtxtProfileSet;
+  StringMap FuncToCtxtProfileSet;
 
   // Root node for context trie tree
   ContextTrieNode RootContext;

diff  --git a/llvm/lib/Transforms/IPO/SampleContextTracker.cpp 
b/llvm/lib/Transforms/IPO/SampleContextTracker.cpp
index 41d7f363e1a4..158fa0771c3b 100644
--- a/llvm/lib/Transforms/IPO/SampleContextTracker.cpp
+++ 

[llvm-branch-commits] [llvm] 989b5c9 - Remove test code that cause MSAN failure.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Hongtao Yu
Date: 2021-02-19T21:21:12-08:00
New Revision: 989b5c9571922ddfecae78a1351d0c801bfbf97b

URL: 
https://github.com/llvm/llvm-project/commit/989b5c9571922ddfecae78a1351d0c801bfbf97b
DIFF: 
https://github.com/llvm/llvm-project/commit/989b5c9571922ddfecae78a1351d0c801bfbf97b.diff

LOG: Remove test code that cause MSAN failure.

Summary:
The negative test (with the feature being added disabled) caused an MSAN
failure, which is exactly what the added feature is supposed to fix. Therefore
the negative test code is being removed.

Added: 


Modified: 
llvm/test/Transforms/SampleProfile/profile-context-order.ll

Removed: 




diff  --git a/llvm/test/Transforms/SampleProfile/profile-context-order.ll 
b/llvm/test/Transforms/SampleProfile/profile-context-order.ll
index a75dcc2179ca..c99cc15850b7 100644
--- a/llvm/test/Transforms/SampleProfile/profile-context-order.ll
+++ b/llvm/test/Transforms/SampleProfile/profile-context-order.ll
@@ -16,7 +16,6 @@
 ;; considered, thus the order becomes (_Z5funcAi, _Z3fibi) which leads to
 ;; _Z3fibi inlined into _Z5funcAi.
 ; RUN: opt < %s -passes=sample-profile -use-profile-indirect-call-edges=1 
-sample-profile-file=%S/Inputs/profile-context-order.prof -S | FileCheck %s 
-check-prefix=ICALL-INLINE
-; RUN: opt < %s -passes=sample-profile -use-profile-indirect-call-edges=0 
-sample-profile-file=%S/Inputs/profile-context-order.prof -S | FileCheck %s 
-check-prefix=ICALL-NOINLINE
 
 @factor = dso_local global i32 3, align 4, !dbg !0
 @fp = dso_local global i32 (i32)* null, align 8
@@ -48,9 +47,6 @@ for.body: ; preds = 
%for.body, %entry
 ; NOINLINE: call i32 @_Z8funcLeafi
 ; ICALL-INLINE: define dso_local i32 @_Z5funcAi
 ; ICALL-INLINE: call i32 @_Z3foo
-; ICALL-NOINLINE: define dso_local i32 @_Z5funcAi
-; ICALL-NOINLINE-NO: call i32 @_Z3foo
-; ICALL-NOINLINE-NO: call i32 @_Z3fibi
 define dso_local i32 @_Z5funcAi(i32 %x) local_unnamed_addr #0 !dbg !40 {
 entry:
   %add = add nsw i32 %x, 10, !dbg !44





[llvm-branch-commits] [llvm] beb80ff - [CSSPGO][llvm-profgen] Add brackets for context id to support extended binary format

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:12-08:00
New Revision: beb80ffee6a1a816cfeb4047926f412c1a2456d9

URL: 
https://github.com/llvm/llvm-project/commit/beb80ffee6a1a816cfeb4047926f412c1a2456d9
DIFF: 
https://github.com/llvm/llvm-project/commit/beb80ffee6a1a816cfeb4047926f412c1a2456d9.diff

LOG: [CSSPGO][llvm-profgen] Add brackets for context id to support extended 
binary format

To align with https://reviews.llvm.org/D95547, we need to add brackets for 
context id before initializing the `SampleContext`.

Also added test cases for extended binary format from llvm-profgen side.

Differential Revision: https://reviews.llvm.org/D95929

Added: 
llvm/test/tools/llvm-profgen/cs-extbinary.test

Modified: 
llvm/lib/ProfileData/SampleProfWriter.cpp
llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfileGenerator.h
llvm/tools/llvm-profgen/ProfiledBinary.cpp

Removed: 




diff  --git a/llvm/lib/ProfileData/SampleProfWriter.cpp 
b/llvm/lib/ProfileData/SampleProfWriter.cpp
index b388b78dfaca..8017f2a82804 100644
--- a/llvm/lib/ProfileData/SampleProfWriter.cpp
+++ b/llvm/lib/ProfileData/SampleProfWriter.cpp
@@ -360,10 +360,7 @@ std::error_code SampleProfileWriterCompactBinary::write(
 /// it needs to be parsed by the SampleProfileReaderText class.
 std::error_code SampleProfileWriterText::writeSample(const FunctionSamples &S) 
{
   auto &OS = *OutputStream;
-  if (FunctionSamples::ProfileIsCS)
-OS << "[" << S.getNameWithContext() << "]:" << S.getTotalSamples();
-  else
-OS << S.getName() << ":" << S.getTotalSamples();
+  OS << S.getNameWithContext(true) << ":" << S.getTotalSamples();
   if (Indent == 0)
 OS << ":" << S.getHeadSamples();
   OS << "\n";

diff  --git a/llvm/test/tools/llvm-profgen/cs-extbinary.test 
b/llvm/test/tools/llvm-profgen/cs-extbinary.test
new file mode 100644
index ..8acce173d405
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/cs-extbinary.test
@@ -0,0 +1,14 @@
+; test for dwarf-based cs profile
+; RUN: llvm-profgen --format=extbinary 
--perfscript=%S/Inputs/recursion-compression-noprobe.perfscript 
--binary=%S/Inputs/recursion-compression-noprobe.perfbin --output=%t1 
--csprof-cold-thres=0
+; RUN: llvm-profdata merge --sample --text --output=%t2 %t1
+; RUN: FileCheck %S/recursion-compression-noprobe.test --input-file %t2
+; RUN: llvm-profdata merge --sample --extbinary --output=%t3 %t2 && 
llvm-profdata merge --sample --text --output=%t4 %t3
+; RUN: diff -b %t2 %t4
+
+
+; test for probe-based cs profile
+; RUN: llvm-profgen --format=extbinary 
--perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript 
--binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t5 
--csprof-cold-thres=0
+; RUN: llvm-profdata merge --sample --text --output=%t6 %t5
+; RUN: FileCheck %S/recursion-compression-pseudoprobe.test --input-file %t6
+; RUN: llvm-profdata merge --sample --extbinary --output=%t7 %t6 && 
llvm-profdata merge --sample --text --output=%t8 %t7
+; RUN: diff -b %t6 %t8

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test 
b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
index 943832ebef10..d8cc1932f877 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
@@ -2,11 +2,11 @@
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:[main:1 @ foo]:44:0
-; CHECK: 2.2: 14
+; CHECK: 2.1: 14
 ; CHECK: 3: 15
-; CHECK: 3.2: 14 bar:14
-; CHECK: 3.4: 1
-; CHECK:[main:1 @ foo:3.2 @ bar]:14:0
+; CHECK: 3.1: 14 bar:14
+; CHECK: 3.2: 1
+; CHECK:[main:1 @ foo:3.1 @ bar]:14:0
 ; CHECK: 1: 14
 
 ; CHECK-UNWINDER: Binary(inline-cs-noprobe.perfbin)'s Range Counter:
@@ -15,10 +15,9 @@
 ; CHECK-UNWINDER:   (67e, 69b): 1
 ; CHECK-UNWINDER:   (67e, 6ad): 13
 ; CHECK-UNWINDER:   (6bd, 6c8): 14
-; CHECK-UNWINDER: main:1 @ foo:3.2 @ bar
+; CHECK-UNWINDER: main:1 @ foo:3.1 @ bar
 ; CHECK-UNWINDER:   (6af, 6bb): 14
 
-
 ; CHECK-UNWINDER: Binary(inline-cs-noprobe.perfbin)'s Branch Counter:
 ; CHECK-UNWINDER: main:1 @ foo
 ; CHECK-UNWINDER:   (69b, 670): 1

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test 
b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
index 2e60883afa62..9d5c787e7f92 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
@@ -36,6 +36,8 @@
 
 
 
+
+
 ; original code:
 ; clang -O0 -g test.c -o a.out
 #include 

diff  --git a/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test 
b/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
index 43f495398bb0..03bab8407435 100644
--- a/llvm/test/tools/llv

[llvm-branch-commits] [llvm] 66873fb - [CSSPGO][llvm-profgen] Renovate perfscript check and command line input validation

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:13-08:00
New Revision: 66873fb695370f5bd333e327ec77e4710c7891c2

URL: 
https://github.com/llvm/llvm-project/commit/66873fb695370f5bd333e327ec77e4710c7891c2
DIFF: 
https://github.com/llvm/llvm-project/commit/66873fb695370f5bd333e327ec77e4710c7891c2.diff

LOG: [CSSPGO][llvm-profgen] Renovate perfscript check and command line input 
validation

This includes some changes related to PerfReader's input check and command 
line handling:

1) There might be thousands of leading MMAP-event lines in the 
perfscript for a large workload. In this case, the 4k threshold is not enough 
to determine whether it's a hybrid sample. This change reworks 
`isHybridPerfScript` to go through the script without a threshold limit, 
checking whether there is a non-empty call stack immediately followed by an LBR 
sample. It stops once it finds a valid one.

2) Added several input validations for the command line switches in PerfReader.

3) Renamed the command line switch `show-disassembly` to `show-disassembly-only`; 
it prints to stdout and exits early, which leaves an empty output profile.
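
The scan described in item 1 can be sketched roughly as follows. This is a standalone illustration, not the PerfReader code: `looksLikeHybridScript`, `isFrameLine`, and `isLBRLine` are hypothetical names, and the line formats they recognize (a frame is a bare hexadecimal address, an LBR line starts with `0x` and contains `/`) are assumptions made for the example.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: strip leading and trailing whitespace.
static std::string trim(const std::string &S) {
  const char *WS = " \t\r\n";
  std::size_t B = S.find_first_not_of(WS);
  if (B == std::string::npos)
    return "";
  std::size_t E = S.find_last_not_of(WS);
  return S.substr(B, E - B + 1);
}

// Assumption: a call stack frame line is a single hexadecimal address.
static bool isFrameLine(const std::string &Line) {
  std::string T = trim(Line);
  if (T.empty())
    return false;
  for (unsigned char C : T)
    if (!std::isxdigit(C))
      return false;
  return true;
}

// Assumption: an LBR sample line starts with "0x" and uses '/' separators.
static bool isLBRLine(const std::string &Line) {
  std::string T = trim(Line);
  return T.rfind("0x", 0) == 0 && T.find('/') != std::string::npos;
}

// Walk the whole script with no line limit and stop at the first
// non-empty call stack that is immediately followed by an LBR line.
bool looksLikeHybridScript(const std::string &Script) {
  std::istringstream In(Script);
  std::string Line;
  bool InCallStack = false;
  while (std::getline(In, Line)) {
    if (isFrameLine(Line)) {
      InCallStack = true;
      continue;
    }
    if (InCallStack && isLBRLine(Line))
      return true;
    InCallStack = false; // MMAP events and other lines reset the state.
  }
  return false;
}
```

Because the loop returns on the first match, any number of leading MMAP lines only costs a linear pass up to the first real sample.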

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D96387

Added: 
llvm/test/tools/llvm-profgen/invalid-perfscript.test

Modified: 
llvm/test/tools/llvm-profgen/disassemble.s
llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
llvm/test/tools/llvm-profgen/symbolize.ll
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/PerfReader.h
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfiledBinary.cpp
llvm/tools/llvm-profgen/llvm-profgen.cpp

Removed: 




diff  --git a/llvm/test/tools/llvm-profgen/disassemble.s 
b/llvm/test/tools/llvm-profgen/disassemble.s
index fc85fbe967e0..be03b5a6892b 100644
--- a/llvm/test/tools/llvm-profgen/disassemble.s
+++ b/llvm/test/tools/llvm-profgen/disassemble.s
@@ -1,6 +1,6 @@
 # REQUIRES: x86-registered-target
 # RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t
-# RUN: llvm-profgen --binary=%t --perfscript=%s --output=%t1 -show-disassembly 
-x86-asm-syntax=intel | FileCheck %s --match-full-lines
+# RUN: llvm-profgen --binary=%t --perfscript=%s --output=%t1 
-show-disassembly-only -x86-asm-syntax=intel | FileCheck %s --match-full-lines
 
 # CHECK: Disassembly of section .text [0x0, 0x66]:
 # CHECK: :

diff  --git a/llvm/test/tools/llvm-profgen/invalid-perfscript.test 
b/llvm/test/tools/llvm-profgen/invalid-perfscript.test
new file mode 100644
index ..d795f85b1ea3
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/invalid-perfscript.test
@@ -0,0 +1,9 @@
+; RUN: llvm-profgen --perfscript=%s 
--binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t 2>%t1
+; RUN: FileCheck %s --input-file %t1
+
+ 4005dc
+ 400634
+ 400684
+   7f68c5788793
+
+; XFAIL: *

diff  --git a/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test 
b/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
index 5feaa97032ab..1d93a06d8e42 100644
--- a/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
+++ b/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
@@ -1,4 +1,4 @@
-; RUN: llvm-profgen --perfscript=%s  
--binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t 
--show-pseudo-probe --show-disassembly | FileCheck %s
+; RUN: llvm-profgen --perfscript=%s  
--binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t 
--show-pseudo-probe --show-disassembly-only | FileCheck %s
 
 PERF_RECORD_MMAP2 2854748/2854748: [0x40(0x1000) @ 0 00:1d 123291722 
526021]: r-xp /home/inline-cs-pseudoprobe.perfbin
 

diff  --git a/llvm/test/tools/llvm-profgen/symbolize.ll 
b/llvm/test/tools/llvm-profgen/symbolize.ll
index 2fbc59e3d00d..9a436dec4c20 100644
--- a/llvm/test/tools/llvm-profgen/symbolize.ll
+++ b/llvm/test/tools/llvm-profgen/symbolize.ll
@@ -1,6 +1,6 @@
 ; REQUIRES: x86-registered-target
 ; RUN: llc -filetype=obj %s -o %t
-; RUN: llvm-profgen --binary=%t --perfscript=%s --output=%t1 
--show-disassembly -x86-asm-syntax=intel --show-source-locations | FileCheck %s 
--match-full-lines
+; RUN: llvm-profgen --binary=%t --perfscript=%s --output=%t1 
--show-disassembly-only -x86-asm-syntax=intel --show-source-locations | 
FileCheck %s --match-full-lines
 
 ; CHECK: Disassembly of section .text [0x0, 0x4a]:
 ; CHECK: :

diff  --git a/llvm/tools/llvm-profgen/PerfReader.cpp 
b/llvm/tools/llvm-profgen/PerfReader.cpp
index e59d8d93381b..2e0b71f38e6d 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -17,6 +17,9 @@ static cl::opt 
ShowUnwinderOutput("show-unwinder-output",
 cl::ZeroOrMore,
 cl::desc("Print unwinder output"));
 
+extern cl::opt ShowDisassemblyOnly;
+extern cl::opt ShowSourceLocations;
+
 namespace llvm {
 namespace s

[llvm-branch-commits] [llvm] 610b51c - [CSSPGO][llvm-profgen] Filter out the instructions without location info for symbolizer

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: wlei
Date: 2021-02-19T21:21:13-08:00
New Revision: 610b51c04d3ca6b58555fa30ae52ad9762f9cf86

URL: 
https://github.com/llvm/llvm-project/commit/610b51c04d3ca6b58555fa30ae52ad9762f9cf86
DIFF: 
https://github.com/llvm/llvm-project/commit/610b51c04d3ca6b58555fa30ae52ad9762f9cf86.diff

LOG: [CSSPGO][llvm-profgen] Filter out the instructions without location info 
for symbolizer

It appears some instructions don't have debug location info, and the 
symbolizer returns an empty call stack for them, which causes a crash 
later in profile unwinding. We do not actually record sample info for them, 
so this change just filters out those instructions.

As those instructions appear at the beginning and end of the instruction 
list, without them we need to add a boundary check for IP `advance` and 
`backward`.

Also, for pseudo-probe-based profiles we don't actually need the symbolized 
location info, so this change uses an empty stack for them. This could 
save half of the binary loading time.
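
A minimal sketch of what such a boundary check might look like, assuming the filtered instructions are kept as a sorted list of addresses. `InstructionCursor` and its members are hypothetical names for illustration and are not taken from ProfiledBinary.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cursor over the addresses that survived filtering.
// It must not be dereferenced when the address list is empty.
struct InstructionCursor {
  const std::vector<std::uint64_t> &Addresses;
  std::size_t Index;

  // Step to the next instruction; report failure instead of running
  // past the end of the list.
  bool advance() {
    if (Index + 1 >= Addresses.size())
      return false;
    ++Index;
    return true;
  }

  // Step to the previous instruction; report failure at the start.
  bool backward() {
    if (Index == 0)
      return false;
    --Index;
    return true;
  }

  std::uint64_t address() const { return Addresses[Index]; }
};
```

Callers would check the returned flag before reading `address()`, since the first and last entries no longer have guaranteed neighbors once the instructions without location info are dropped.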

Differential Revision: https://reviews.llvm.org/D96434

Added: 


Modified: 
llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfiledBinary.cpp
llvm/tools/llvm-profgen/ProfiledBinary.h

Removed: 




diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test 
b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
index d8cc1932f877..cb562e347a3e 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
@@ -1,12 +1,12 @@
 ; RUN: llvm-profgen --perfscript=%S/Inputs/inline-cs-noprobe.perfscript 
--binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --show-unwinder-output 
--csprof-cold-thres=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t
 
-; CHECK:[main:1 @ foo]:44:0
+; CHECK:[main:1 @ foo]:309:0
 ; CHECK: 2.1: 14
 ; CHECK: 3: 15
 ; CHECK: 3.1: 14 bar:14
 ; CHECK: 3.2: 1
-; CHECK:[main:1 @ foo:3.1 @ bar]:14:0
+; CHECK:[main:1 @ foo:3.1 @ bar]:84:0
 ; CHECK: 1: 14
 
 ; CHECK-UNWINDER: Binary(inline-cs-noprobe.perfbin)'s Range Counter:

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test 
b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
index 9d5c787e7f92..c5e6dcca 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
@@ -1,15 +1,15 @@
 ; RUN: llvm-profgen --perfscript=%S/Inputs/noinline-cs-noprobe.perfscript 
--binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t 
--show-unwinder-output --csprof-cold-thres=0 | FileCheck %s 
--check-prefix=CHECK-UNWINDER
 ; RUN: FileCheck %s --input-file %t
 
-; CHECK:[main:1 @ foo:3 @ bar]:12:3
+; CHECK:[main:1 @ foo]:54:0
+; CHECK: 2: 3
+; CHECK: 3: 3 bar:3
+; CHECK:[main:1 @ foo:3 @ bar]:50:3
 ; CHECK: 0: 3
 ; CHECK: 1: 3
 ; CHECK: 2: 2
 ; CHECK: 4: 1
 ; CHECK: 5: 3
-; CHECK:[main:1 @ foo]:6:0
-; CHECK: 2: 3
-; CHECK: 3: 3 bar:3
 
 ; CHECK-UNWINDER: Binary(noinline-cs-noprobe.perfbin)'s Range Counter:
 ; CHECK-UNWINDER: main:1 @ foo

diff  --git a/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test 
b/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
index 03bab8407435..15bdd870879e 100644
--- a/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
@@ -4,38 +4,38 @@
 ; RUN: llvm-profgen 
--perfscript=%S/Inputs/recursion-compression-noprobe.perfscript 
--binary=%S/Inputs/recursion-compression-noprobe.perfbin --output=%t 
--csprof-cold-thres=0
 ; RUN: FileCheck %s --input-file %t
 
-; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa]:14:0
-; CHECK-UNCOMPRESS: 1: 1
-; CHECK-UNCOMPRESS: 2: 13 fb:11
-; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb]:12:0
+; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb]:48:0
 ; CHECK-UNCOMPRESS: 1: 11
 ; CHECK-UNCOMPRESS: 2: 1 fa:1
-; CHECK-UNCOMPRESS:[main:1 @ foo]:3:0
+; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa]:24:0
+; CHECK-UNCOMPRESS: 1: 1
+; CHECK-UNCOMPRESS: 2: 13 fb:11
+; CHECK-UNCOMPRESS:[main:1 @ foo]:7:0
 ; CHECK-UNCOMPRESS: 2: 1
 ; CHECK-UNCOMPRESS: 3: 2 fa:1
-; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa]:3:0
+; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa]:7:0
 ; CHECK-UNCOMPRESS: 1: 1
 ; CHECK-UNCOMPRESS: 2: 2 fb:1
-; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa:2 @ fb]:1:0
+; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa:2 @ fb]:2:0
 ; CHECK-UNCOMPRESS: 2: 1 fa:1
-; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa:2 @ fb:2 @ fa]:1:0
+; CHECK-UNCOMPRESS:[main:1 @ foo:3 @ fa:2 @ fb:2 @ fa:2 @ fb:2 @ fa]:2:0
 ; CHECK-UNCOMPRESS: 4: 1
 
 
-; CHECK: [main:1

[llvm-branch-commits] [clang] b5b3111 - [clang] Add -ffinite-loops & -fno-finite-loops options.

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Florian Hahn
Date: 2021-02-19T21:48:47-08:00
New Revision: b5b31112bf63debaa905e42785317a947e696252

URL: 
https://github.com/llvm/llvm-project/commit/b5b31112bf63debaa905e42785317a947e696252
DIFF: 
https://github.com/llvm/llvm-project/commit/b5b31112bf63debaa905e42785317a947e696252.diff

LOG: [clang] Add -ffinite-loops & -fno-finite-loops options.

This cherry-picks the following patches on the release branch:

6280bb4cd80e [clang] Remove redundant condition (NFC).
51bf4c0e6d4c [clang] Add -ffinite-loops & -fno-finite-loops options.
fb4d8fe80701 [clang] Update mustprogress tests

This patch adds 2 new options to control when Clang adds `mustprogress`:

  1. -ffinite-loops: assume all loops are finite; mustprogress is added
 to all loops, regardless of the selected language standard.
  2. -fno-finite-loops: assume no loop is finite; mustprogress is not
 added to any loop or function. We could add mustprogress to
 functions without loops, but we would have to detect that in Clang,
 which is probably not worth it.
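
The effect of the two options on loops can be modeled as a three-state decision. The following is a simplified standalone sketch rather than the Clang implementation: `loopMustProgress` and its `LanguageRequiresProgress` parameter are hypothetical, and the real checks in CodeGenFunction.h consult more state.

```cpp
// Simplified model of the option state described above.
enum class FiniteLoopsKind {
  Language, // Not specified, defer to the language standard.
  Always,   // -ffinite-loops
  Never,    // -fno-finite-loops
};

// Hypothetical helper: would a loop be marked `mustprogress`?
bool loopMustProgress(FiniteLoopsKind Kind, bool LanguageRequiresProgress) {
  switch (Kind) {
  case FiniteLoopsKind::Always:
    return true; // All loops, regardless of the language standard.
  case FiniteLoopsKind::Never:
    return false; // No loop is assumed to be finite.
  case FiniteLoopsKind::Language:
    return LanguageRequiresProgress;
  }
  return LanguageRequiresProgress; // Unreachable for valid enum values.
}
```

In this model only the `Language` state looks at the selected standard; either explicit flag overrides it.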

Differential Revision: https://reviews.llvm.org/D96850

Added: 
clang/test/CodeGen/attr-mustprogress.c
clang/test/CodeGenCXX/attr-mustprogress.cpp

Modified: 
clang/include/clang/Basic/CodeGenOptions.def
clang/include/clang/Basic/CodeGenOptions.h
clang/include/clang/Driver/Options.td
clang/lib/CodeGen/CodeGenFunction.h
clang/lib/Driver/ToolChains/Clang.cpp
clang/lib/Frontend/CompilerInvocation.cpp

Removed: 
clang/test/CodeGen/attr-mustprogress-0.c
clang/test/CodeGen/attr-mustprogress-0.cpp
clang/test/CodeGen/attr-mustprogress-1.c
clang/test/CodeGen/attr-mustprogress-1.cpp



diff  --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 5c8af65326ed..9d53b5b923bb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -266,6 +266,9 @@ CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop 
vectorizer.
 CODEGENOPT(VectorizeSLP  , 1, 0) ///< Run SLP vectorizer.
 CODEGENOPT(ProfileSampleAccurate, 1, 0) ///< Sample profile is accurate.
 
+/// Treat loops as finite: language, always, never.
+ENUM_CODEGENOPT(FiniteLoops, FiniteLoopsKind, 2, FiniteLoopsKind::Language)
+
   /// Attempt to use register sized accesses to bit-fields in structures, when
   /// possible.
 CODEGENOPT(UseRegisterSizedBitfieldAccess , 1, 0)

diff  --git a/clang/include/clang/Basic/CodeGenOptions.h 
b/clang/include/clang/Basic/CodeGenOptions.h
index 73d41e3293c6..c550817f0f69 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -140,6 +140,12 @@ class CodeGenOptions : public CodeGenOptionsBase {
 All, // Keep all frame pointers.
   };
 
+  enum FiniteLoopsKind {
+Language, // Not specified, use language standard.
+Always,   // All loops are assumed to be finite.
+Never,// No loop is assumed to be finite.
+  };
+
   /// The code model to use (-mcmodel).
   std::string CodeModel;
 

diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1f6c13d5cc96..817798926650 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2410,6 +2410,11 @@ def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, 
Group,
 defm reroll_loops : BoolFOption<"reroll-loops",
   CodeGenOpts<"RerollLoops">, DefaultFalse,
   PosFlag, NegFlag>;
+def ffinite_loops: Flag<["-"],  "ffinite-loops">, Group,
+  HelpText<"Assume all loops are finite.">, Flags<[CC1Option]>;
+def fno_finite_loops: Flag<["-"], "fno-finite-loops">, Group,
+  HelpText<"Do not assume that any loop is finite.">, Flags<[CC1Option]>;
+
 def ftrigraphs : Flag<["-"], "ftrigraphs">, Group,
   HelpText<"Process trigraph sequences">, Flags<[CC1Option]>;
 def fno_trigraphs : Flag<["-"], "fno-trigraphs">, Group,

diff  --git a/clang/lib/CodeGen/CodeGenFunction.h 
b/clang/lib/CodeGen/CodeGenFunction.h
index 8eb7adbc8fcb..95c0b7b4d7c0 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -507,12 +507,23 @@ class CodeGenFunction : public CodeGenTypeCache {
 
   /// True if the C++ Standard Requires Progress.
   bool CPlusPlusWithProgress() {
+if (CGM.getCodeGenOpts().getFiniteLoops() ==
+CodeGenOptions::FiniteLoopsKind::Never)
+  return false;
+
 return getLangOpts().CPlusPlus11 || getLangOpts().CPlusPlus14 ||
getLangOpts().CPlusPlus17 || getLangOpts().CPlusPlus20;
   }
 
   /// True if the C Standard Requires Progress.
   bool CWithProgress() {
+if (CGM.getCodeGenOpts().getFiniteLoops() ==
+CodeGenOptions::FiniteLoopsKind::Always)
+  return true;
+if (CGM.getCodeGenOpts().getFiniteLoops() ==
+CodeGenOptions::FiniteLoopsKind::Never)
+  return false;
+
 return getLangOpts().C11 ||

[llvm-branch-commits] [llvm] bdafd28 - [SROA] Amend failing test from D95826

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: William S. Moses
Date: 2021-02-19T21:57:18-08:00
New Revision: bdafd284b291436d3fa4644f585efe3b06363554

URL: 
https://github.com/llvm/llvm-project/commit/bdafd284b291436d3fa4644f585efe3b06363554
DIFF: 
https://github.com/llvm/llvm-project/commit/bdafd284b291436d3fa4644f585efe3b06363554.diff

LOG: [SROA] Amend failing test from D95826

(cherry picked from commit 892d2822b62ebcaa7aa0b006b5ea4f26593c1618)

Added: 


Modified: 
llvm/test/Transforms/SROA/tbaa-struct2.ll

Removed: 




diff  --git a/llvm/test/Transforms/SROA/tbaa-struct2.ll 
b/llvm/test/Transforms/SROA/tbaa-struct2.ll
index 75f72f4e9963..13075dd84326 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct2.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct2.ll
@@ -35,8 +35,8 @@ define double @bar(%struct.Wishart* %wishart) {
 ; CHECK-NEXT:   %tmp.sroa.2.0.copyload = load i32, i32* 
%tmp.sroa.2.0.waddr.sroa_idx1, align 8, !tbaa.struct !7
 ; CHECK-NEXT:   %tmp.sroa.3.0.waddr.sroa_raw_cast = bitcast %struct.Wishart* 
%wishart to i8*
 ; CHECK-NEXT:   %tmp.sroa.3.0.waddr.sroa_raw_idx = getelementptr inbounds i8, 
i8* %tmp.sroa.3.0.waddr.sroa_raw_cast, i64 12
-; CHECK-NEXT:   %tmp.sroa.3.0.tmpaddr.sroa_idx = getelementptr inbounds [4 x 
i8], [4 x i8]* %tmp.sroa.3, i64 0, i64 0
-; CHECK-NEXT:   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 
%tmp.sroa.3.0.tmpaddr.sroa_idx, i8* align 4 %tmp.sroa.3.0.waddr.sroa_raw_idx, 
i64 4, i1 false), !tbaa.struct !8
+; CHECK-NEXT:   %[[sroa_idx:.+]] = getelementptr inbounds [4 x i8], [4 x i8]* 
%tmp.sroa.3, i64 0, i64 0
+; CHECK-NEXT:   call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 
%[[sroa_idx]], i8* align 4 %tmp.sroa.3.0.waddr.sroa_raw_idx, i64 4, i1 false), 
!tbaa.struct !8
 ; CHECK-NEXT:   %call = call double @subcall(double %tmp.sroa.0.0.copyload, 
i32 %tmp.sroa.2.0.copyload)
 ; CHECK-NEXT:   ret double %call
 ; CHECK-NEXT: }
@@ -48,4 +48,4 @@ define double @bar(%struct.Wishart* %wishart) {
 ; CHECK: !5 = !{!6, !6, i64 0}
 ; CHECK: !6 = !{!"int", !{{[0-9]+}}, i64 0}
 ; CHECK: !7 = !{i64 0, i64 4, !5}
-; CHECK: !8 = !{}
\ No newline at end of file
+; CHECK: !8 = !{}





[llvm-branch-commits] [llvm] ee7eaf8 - [llvm-objdump] --source: drop the warning when there is no debug info

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Fangrui Song
Date: 2021-02-19T22:09:42-08:00
New Revision: ee7eaf860cde24a91a5b17390b8ad5eddd05f7f9

URL: 
https://github.com/llvm/llvm-project/commit/ee7eaf860cde24a91a5b17390b8ad5eddd05f7f9
DIFF: 
https://github.com/llvm/llvm-project/commit/ee7eaf860cde24a91a5b17390b8ad5eddd05f7f9.diff

LOG: [llvm-objdump] --source: drop the warning when there is no debug info

Warnings have been added for three cases (PR41905): (1) missing debug info, (2)
the source file cannot be found, (3) the debug info points at a line beyond the
end of the file.

(1) is probably less useful. This was brought up once on
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two
internal users mentioned to me that it was annoying. (I personally
find the warning confusing, too.)

Users specify --source to get additional information if sources happen to be
available.  If sources are not available, it should be obvious as the output
will have no interleaved source lines. The warning can be especially annoying
when using llvm-objdump -S on a bunch of files.

This patch drops the warning when there is no debug info.
(If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be
an error. There is currently no test for an `Error` return value.
The only code path is probably a broken symbol table, but we probably already
emit a warning in that case.)

`source-interleave-prefix.test` has an inappropriate "malformed" test: the
test simply has no .debug_* sections because new llc does not produce debug
info when the filename is empty (invalid).
I have tried tampering with the header of .debug_info/.debug_line, but
llvm-symbolizer does not warn.
This patch does not intend to add the missing test coverage.

Differential Revision: https://reviews.llvm.org/D88715

(cherry picked from commit eecbb1c77655d38c06e47cf32e2dcc72da45c517)

Added: 


Modified: 
llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
llvm/tools/llvm-objdump/llvm-objdump.cpp

Removed: 




diff  --git 
a/llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test 
b/llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
index 25deaa00243c..89b03d429bfa 100644
--- a/llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
+++ b/llvm/test/tools/llvm-objdump/X86/source-interleave-no-debug-info.test
@@ -1,15 +1,13 @@
 ## Test that if an object has no debug information, only the disassembly is
-## printed when --source is specified, and that we emit a warning.
+## printed when --source is specified, and that we do not emit a warning.
 
 # RUN: sed -e "s,SRC_COMPDIR,%/p/Inputs,g" %p/Inputs/source-interleave.ll > 
%t.ll
 # RUN: llc -o %t.o -filetype=obj -mtriple=x86_64-pc-linux %t.ll
 # RUN: llvm-objcopy --strip-debug %t.o %t2.o
 
 # RUN: llvm-objdump --source %t.o | FileCheck %s --check-prefixes=CHECK,SOURCE
-# RUN: llvm-objdump --source %t2.o 2> %t2.e | FileCheck %s 
--check-prefixes=CHECK --implicit-check-not='main()'
-# RUN: FileCheck %s --input-file %t2.e --check-prefixes=WARN
+# RUN: llvm-objdump --source %t2.o 2>&1 | FileCheck %s --check-prefixes=CHECK 
--implicit-check-not='main()' --implicit-check-not=warning:
 
-# WARN:warning: '{{.*}}2.o': failed to parse debug information
 # CHECK:   0010 :
 # SOURCE-NEXT: ; int main() {
 # CHECK-NEXT:   10:   55  pushq   %rbp

diff  --git a/llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test 
b/llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
index b384c49b350e..23ce55a329ac 100644
--- a/llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
+++ b/llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
@@ -24,15 +24,6 @@
 ; RUN: llvm-objdump --prefix myprefix --source %t-correct-prefix.o 2>&1 | \
 ; RUN:   FileCheck %s --check-prefix=CHECK-BROKEN-PREFIX 
-DFILE=%t-correct-prefix.o -DPREFIX=myprefix%/p
 
-;; Test malformed input.
-
-; RUN: sed -e "s,SRC_COMPDIR,,g" -e "s,filename: 
\"source-interleave-x86_64.c\",filename: \"\",g" \
-; RUN:   %p/Inputs/source-interleave.ll > %t-malformed.ll
-; RUN: llc -o %t-malformed.o -filetype=obj -mtriple=x86_64-pc-linux 
%t-malformed.ll
-; RUN: llvm-objdump --prefix myprefix --source %t-malformed.o 2>&1 | \
-; RUN:   FileCheck %s --check-prefix=CHECK-MALFORMED -DFILE=%t-malformed.o
-; CHECK-MALFORMED: warning: '[[FILE]]': failed to parse debug information for 
[[FILE]]
-
 ;; Using only a prefix separator is the same as not using the `--prefix` 
option.
 
 ; RUN: llvm-objdump --prefix / --source %t-missing-prefix.o 2>&1 | \

diff  --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp 
b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index 3134f989603a..17128e95727f 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -947,8 +947,8 @@ class So

[llvm-branch-commits] [openmp] 76d5d54 - Avoid use of stack allocations in asynchronous calls

2021-02-19 Thread Tom Stellard via llvm-branch-commits

Author: Johannes Doerfert
Date: 2021-02-19T22:22:50-08:00
New Revision: 76d5d54f62599d249e0bf2d1b0998451a584c3f3

URL: 
https://github.com/llvm/llvm-project/commit/76d5d54f62599d249e0bf2d1b0998451a584c3f3
DIFF: 
https://github.com/llvm/llvm-project/commit/76d5d54f62599d249e0bf2d1b0998451a584c3f3.diff

LOG: Avoid use of stack allocations in asynchronous calls

NOTE: This is an adaption of the original patch to be applicable to the
  LLVM 12 release branch. Logic is the same though.

As reported by Guilherme Valarini [0], we used to pass stack allocations
to calls that can nowadays be asynchronous. This is arguably a problem
and it will inevitably result in UB. To remedy the situation we allocate
the locations as part of the AsyncInfoTy object. The lifetime of that
object matches what we need for now. If the synchronization is not tied
to the AsyncInfoTy object anymore we might need to have a different
buffer construct in global space.
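
The reason a deque works for this purpose can be shown in isolation: `std::deque::push_back` does not invalidate references to existing elements, so each location handed out stays valid for as long as the owning object lives. The sketch below is a standalone stand-in; `AsyncInfoSketch` is a hypothetical name and the `void *` element type is an assumption.

```cpp
#include <deque>

// Hypothetical stand-in for the object that owns the buffer locations.
struct AsyncInfoSketch {
  // Assumption: each location holds a single void* value.
  std::deque<void *> BufferLocations;

  // Hand out a fresh location whose lifetime matches this object.
  // push_back on a deque keeps references to earlier elements valid,
  // which would not be guaranteed for std::vector.
  void *&getVoidPtrLocation() {
    BufferLocations.push_back(nullptr);
    return BufferLocations.back();
  }
};
```

A caller stores the value in the returned reference and passes its address to the asynchronous call, instead of taking the address of a local variable that may go out of scope before the call completes.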

This should be back-ported to LLVM 12 but needs slight modifications as
it is based on refactoring patches we do not need to backport.

[0] https://lists.llvm.org/pipermail/openmp-dev/2021-February/003867.html

Differential Revision: https://reviews.llvm.org/D96667

Added: 


Modified: 
openmp/libomptarget/include/omptarget.h
openmp/libomptarget/src/omptarget.cpp

Removed: 




diff  --git a/openmp/libomptarget/include/omptarget.h 
b/openmp/libomptarget/include/omptarget.h
index 9c533944d135..46bb8206efa1 100644
--- a/openmp/libomptarget/include/omptarget.h
+++ b/openmp/libomptarget/include/omptarget.h
@@ -14,6 +14,8 @@
 #ifndef _OMPTARGET_H_
 #define _OMPTARGET_H_
 
+#include 
+#include 
 #include 
 #include 
 
@@ -119,10 +121,18 @@ struct __tgt_target_table {
 /// This struct contains information exchanged between different asynchronous
 /// operations for device-dependent optimization and potential synchronization
 struct __tgt_async_info {
+  /// Locations we used in (potentially) asynchronous calls which should live
+  /// as long as this AsyncInfoTy object.
+  std::deque BufferLocations;
+
   // A pointer to a queue-like structure where offloading operations are 
issued.
   // We assume to use this structure to do synchronization. In CUDA backend, it
   // is CUstream.
   void *Queue = nullptr;
+
+  /// Return a void* reference with a lifetime that is at least as long as this
+  /// AsyncInfoTy object. The location can be used as intermediate buffer.
+  void *&getVoidPtrLocation();
 };
 
 /// This struct is a record of non-contiguous information

diff  --git a/openmp/libomptarget/src/omptarget.cpp 
b/openmp/libomptarget/src/omptarget.cpp
index e4b7b18bc70b..37150aae2fe6 100644
--- a/openmp/libomptarget/src/omptarget.cpp
+++ b/openmp/libomptarget/src/omptarget.cpp
@@ -18,6 +18,13 @@
 #include 
 #include 
 
+/// Return a void* reference with a lifetime that is at least as long as this
+/// AsyncInfoTy object. The location can be used as intermediate buffer.
+void *&__tgt_async_info::getVoidPtrLocation() {
+  BufferLocations.push_back(nullptr);
+  return BufferLocations.back();
+}
+
 /* All begin addresses for partially mapped structs must be 8-aligned in order
  * to ensure proper alignment of members. E.g.
  *
@@ -415,7 +422,8 @@ int targetDataBegin(ident_t *loc, DeviceTy &Device, int32_t 
arg_num,
   DP("Update pointer (" DPxMOD ") -> [" DPxMOD "]\n",
  DPxPTR(PointerTgtPtrBegin), DPxPTR(TgtPtrBegin));
   uint64_t Delta = (uint64_t)HstPtrBegin - (uint64_t)HstPtrBase;
-  void *TgtPtrBase = (void *)((uint64_t)TgtPtrBegin - Delta);
+  void *&TgtPtrBase = async_info_ptr->getVoidPtrLocation();
+  TgtPtrBase = (void *)((uint64_t)TgtPtrBegin - Delta);
   int rt = Device.submitData(PointerTgtPtrBegin, &TgtPtrBase,
  sizeof(void *), async_info_ptr);
   if (rt != OFFLOAD_SUCCESS) {
@@ -1122,8 +1130,9 @@ static int processDataBefore(ident_t *loc, int64_t 
DeviceId, void *HostPtr,
 DP("Parent lambda base " DPxMOD "\n", DPxPTR(TgtPtrBase));
 uint64_t Delta = (uint64_t)HstPtrBegin - (uint64_t)HstPtrBase;
 void *TgtPtrBegin = (void *)((uintptr_t)TgtPtrBase + Delta);
-void *PointerTgtPtrBegin = Device.getTgtPtrBegin(
-HstPtrVal, ArgSizes[I], IsLast, false, IsHostPtr);
+void *&PointerTgtPtrBegin = AsyncInfo->getVoidPtrLocation();
+PointerTgtPtrBegin = Device.getTgtPtrBegin(HstPtrVal, ArgSizes[I],
+   IsLast, false, IsHostPtr);
 if (!PointerTgtPtrBegin) {
   DP("No lambda captured variable mapped (" DPxMOD ") - ignored\n",
  DPxPTR(HstPtrVal));


