[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-14 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin created this revision.
alexey.lapshin added reviewers: bcahoon, kparzysz, aaron.ballman, arphaman, 
rsmith, marksl, yakush.
Herald added a subscriber: zzheng.

  [PIPELINER] add two pragmas to control Software Pipelining optimisation.
  
#pragma clang loop pipeline(disable)
  
Disable SWP optimization for the next loop.
“disable” is the only possible value.
  
#pragma clang loop pipeline_ii_count(number)
  
Set value of initiation interval for SWP
optimization to specified number value for
the next loop. Number is the positive value
greater than 0.
  
These pragmas could be used for debugging or reducing
compile time purposes. It is possible to disable SWP for
concrete loops to save compilation time or to find bugs
by not doing SWP to certain loops. It is possible to set
value of initiation interval to concrete number to save
compilation time by not doing extra pipeliner passes or
to check created schedule for specific initiation interval.
  
That is clang part of the fix


https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_ii_count or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable) 
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+#pragma clang loop pipeline_ii_count(10) 
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_ii_count()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_ii_count(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_ii_count 1 2
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_ii_count(4) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_ii_count(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_ii_count(4)'}} */
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipelin

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-14 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 178264.
alexey.lapshin added a comment.

deleted small typo in CGLoopInfo.cpp


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_ii_count or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable) 
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+#pragma clang loop pipeline_ii_count(10) 
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_ii_count()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_ii_count(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_ii_count 1 2
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_ii_count(4) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_ii_count(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_ii_count(4)'}} */
+  for (int i = 0; i < Length; i++) {
+  List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_ii_count or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword(enable)
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop vectorize(enable) badkeyword(4)
Index: test/CodeGenCXX/pragma-pipeline.cpp
===
--- /dev/null
+++ test/CodeGenCXX/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -triple hexagon -std=c++11 -emit-llvm -o - %s | FileCheck %s
+
+void pipeline_disabled(int *List, int Length, int Value) {
+  // CHECK-LABEL: define {{.*}} @_Z17pipeline_disabled 
+#pragma clang loop pipeline(disable) 
+  for (int i = 0; i < Length; i++) {
+  // CHECK: br label {{.*}}, !llvm.loop ![[LOOP_1:.*]]
+  List[i] = Value;
+  }
+}
+
+void pipeline_not_disabled(

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-17 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked 2 inline comments as done.
alexey.lapshin added a comment.

Ok, I will address all issues.




Comment at: include/clang/Basic/Attr.td:2871-2872
   /// distribute: attempt to distribute loop if State == Enable
+  /// pipeline: disable pipelining loop if State == Disable
+  /// pipeline_ii_count: try to pipeline loop for only 'Value' value
 

aaron.ballman wrote:
> Missing full stops at the end of the comments. Also, having read "for only 
> 'Value' value", I'm still not certain what's happening. I think this is a 
> count of some kind, so perhaps "Tries to pipeline 'Values' times." but I 
> don't know what the verb "pipeline" means in this context.
> 
> Are users going to understand what the `ii` means in the user-facing name?
As to ii - yes that should be understood by users, because it is important 
property of software pipelining optimization. Software Pipelining optimization 
tries to reorder instructions of original loop(from different iterations) so 
that resulting loop body takes less cycles. It started from some minimal value 
of ii and stopped with some maximal value.  i.e. it tries to built schedule for 
min_ii, then min_ii+1, ... until schedule is found or until max_ii reached.  If 
resulting value of ii already known then instead of searching in range min_ii, 
max_ii - it is possible to create schedule for already known ii. 

probably following spelling would be better : 

pipeline_ii_count: create loop schedule with initiation interval equals 'Value'



Comment at: include/clang/Basic/DiagnosticParseKinds.td:1192
+def err_pragma_pipeline_invalid_keyword : Error<
+"invalid argument; expected 'disable'">;
 

aaron.ballman wrote:
> Can you roll this into the above diagnostic with another `%select`, or does 
> that make it too unreadable?
yes, it makes things too complicated. Though I could do it if necessary.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-17 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked an inline comment as done.
alexey.lapshin added inline comments.



Comment at: include/clang/Basic/Attr.td:2871-2872
   /// distribute: attempt to distribute loop if State == Enable
+  /// pipeline: disable pipelining loop if State == Disable
+  /// pipeline_ii_count: try to pipeline loop for only 'Value' value
 

aaron.ballman wrote:
> alexey.lapshin wrote:
> > aaron.ballman wrote:
> > > Missing full stops at the end of the comments. Also, having read "for 
> > > only 'Value' value", I'm still not certain what's happening. I think this 
> > > is a count of some kind, so perhaps "Tries to pipeline 'Values' times." 
> > > but I don't know what the verb "pipeline" means in this context.
> > > 
> > > Are users going to understand what the `ii` means in the user-facing name?
> > As to ii - yes that should be understood by users, because it is important 
> > property of software pipelining optimization. Software Pipelining 
> > optimization tries to reorder instructions of original loop(from different 
> > iterations) so that resulting loop body takes less cycles. It started from 
> > some minimal value of ii and stopped with some maximal value.  i.e. it 
> > tries to built schedule for min_ii, then min_ii+1, ... until schedule is 
> > found or until max_ii reached.  If resulting value of ii already known then 
> > instead of searching in range min_ii, max_ii - it is possible to create 
> > schedule for already known ii. 
> > 
> > probably following spelling would be better : 
> > 
> > pipeline_ii_count: create loop schedule with initiation interval equals 
> > 'Value'
> > because it is important property of software pipelining optimization. 
> 
> My point is that I have no idea what "ii" means and I doubt I'll be alone -- 
> does the "ii" differentiate this from other kinds of pipeline loop pragmas we 
> plan to support, or is expected that "pipeline_ii_count" be the only pipeline 
> count option? Can we drop the "ii" and not lose anything?
> 
> > pipeline_ii_count: create loop schedule with initiation interval equals 
> > 'Value'
> 
> equals 'Value' -> equal to 'Value'
Do you think spelling out  ii would help ? 

pipeline_initiation_interval(number)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-17 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin added a comment.

Ok, So thank you for the suggestions. I will implement that way.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-19 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 178956.
alexey.lapshin edited the summary of this revision.
alexey.lapshin added a comment.

Please consider this change:

1. addressed style issues and formatted patch with clang-format
2. renamed pragma pipeline_ii_count to pipeline_initiation_interval
3. left LoopAttributes inclass initializers refactoring for some another fix :-)
4. added more tests


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkey

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-20 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked an inline comment as done.
alexey.lapshin added inline comments.



Comment at: include/clang/Basic/AttrDocs.td:2673
+
+Check `Software pipelining`, `Modulo scheduling` to get more details on 
optimisation.
+

aaron.ballman wrote:
> "Check" -- check where? Perhaps "Search for" but I still don't think that's 
> very satisfying for our documentation as there's no telling what users will 
> find or how it relates (if at all) to this feature.
> 
> on optimisation -> on this optimization
> 
> I find this sentence doesn't really direct me to much of value (research 
> presentations and class lectures, mostly), so I'd rather see the important 
> details spelled out in our docs so that users don't have to guess as much.
I am just unsure which references to include here ... links to wikipedia or 
some concrete resources do not look good. 

So you are suggesting to spell things out here... Ok, I will prepare short 
explanation...


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-20 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked an inline comment as done.
alexey.lapshin added inline comments.



Comment at: include/clang/Basic/AttrDocs.td:2677-2678
+`language extensions
+`_
+for further details, including limitations of the pipeline hints.
+  }];

aaron.ballman wrote:
> There is nothing in the linked documentation that describes pipeline 
> initiation intervals, so I'm still left wondering how to pick a value for the 
> attribute argument. I think the docs need a bit more detail for what that 
> value means and how it impacts the user's code.
How about this variant :

Software Pipelining optimization is a technique used to optimize loops by
utilizing instructions level parallelism. It reorders loop instructions to
overlap iterations. As the result new iteration started before previous
have finished. The Modulo Scheduling technique creates schedule for one
iteration such that when repeating at regular interval no inter-iteration
dependence violated. This constant interval(in cycles) between the start
of iterations called initiation interval. Cycles number of one iteration
of newly generated loop matches with Initiation Interval. For further
details see https://en.wikipedia.org/wiki/Software_pipelining and
"Swing Modulo Scheduling: A Lifetime-Sensitive Approach", by J. Llosa,
A. Gonzalez, E. Ayguade, and M. Valero.

``#pragma clang loop pipeline`` and `#pragma loop pipeline_initiation_interval``
could be used as hints for Software Pipelining optimization.

Using ``#pragma clang loop pipeline(disable)`` avoids software pipelining 
optimization. The disable state can only be specified:

.. code-block:: c++

  #pragma clang loop pipeline(disable)
  for (...) {
...
  }


Using ``#pragma loop pipeline_initiation_interval``
instructs the software pipeliner to check the specified initiation interval.
If schedule found the result loop iteration would have specified
cycle count:

.. code-block:: c++

 #pragma loop pipeline_initiation_interval(10)
 for (...) {
   ...
 }



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-20 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 179145.
alexey.lapshin added a comment.

update documentation section.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword(enable)
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop vectorize(enable) badkeyword(4)
Index: test/CodeGenCXX/pragma-pipeline.cpp

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-21 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 179248.
alexey.lapshin added a comment.

put a better link to doc


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword(enable)
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop vectorize(enable) badkeyword(4)
Index: test/CodeGenCXX/pragma-pipeline.cpp
=

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-21 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked 2 inline comments as done.
alexey.lapshin added a comment.

will correct all mistakes. please check explanations for the questions.




Comment at: include/clang/Basic/AttrDocs.td:2655
+  dependence violated. This constant interval(in cycles) between the start
+  of iterations called initiation interval. Cycles number of one iteration
+  of newly generated loop matches with Initiation Interval. For further

aaron.ballman wrote:
> is called the initiation interval.
> 
> I can't quite parse the second sentence, though. Is it trying to say that the 
> initiation interval is the number of cycles for a single iteration of the 
> optimized loop?
Not quite right : initiation interval is not a number of cycles for a single 
iteration of the optimized loop. But initiation interval matches with number of 
cycles for a single iteration of the optimized loop. Initiation interval is the 
number of cycles between next and previous iteration of original loop. New loop 
created so that single iteration of the optimized loop has the same number 
cycles as initiation interval( thus every new iteration of original loop 
started each time when new iteration of optimized loop started - difference 
between iterations is initiation interval). 

trying to rephrase : number of cycles for a single iteration of the optimized 
loop matches with number of cycles of initiation interval.



Comment at: include/clang/Basic/AttrDocs.td:2676
+  the software pipeliner to try the specified initiation interval.
+  If schedule found the resulting loop iteration would have specified
+  cycle count. The positive number greater than zero can be specified:

aaron.ballman wrote:
> I'm can't quite parse this sentence either. Is it trying to say that the 
> scheduler will try to find a loop iteration cycle count that most-closely 
> matches the specified initiation interval?
I wanted to say that pipeliner will try to find a schedule that exactly matches 
the specified initiation interval. And if schedule would be found then single 
iteration of the optimized loop would have exactly the same amount of cycles as 
initiation interval. trying to rephrase :

If schedule would be found then single iteration of the optimized loop would 
have exactly the same amount of cycles as initiation interval has. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-24 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked 2 inline comments as done.
alexey.lapshin added inline comments.



Comment at: include/clang/Basic/AttrDocs.td:2655
+  dependence violated. This constant interval(in cycles) between the start
+  of iterations called initiation interval. Cycles number of one iteration
+  of newly generated loop matches with Initiation Interval. For further

aaron.ballman wrote:
> alexey.lapshin wrote:
> > aaron.ballman wrote:
> > > is called the initiation interval.
> > > 
> > > I can't quite parse the second sentence, though. Is it trying to say that 
> > > the initiation interval is the number of cycles for a single iteration of 
> > > the optimized loop?
> > Not quite right : initiation interval is not a number of cycles for a 
> > single iteration of the optimized loop. But initiation interval matches 
> > with number of cycles for a single iteration of the optimized loop. 
> > Initiation interval is the number of cycles between next and previous 
> > iteration of original loop. New loop created so that single iteration of 
> > the optimized loop has the same number cycles as initiation interval( thus 
> > every new iteration of original loop started each time when new iteration 
> > of optimized loop started - difference between iterations is initiation 
> > interval). 
> > 
> > trying to rephrase : number of cycles for a single iteration of the 
> > optimized loop matches with number of cycles of initiation interval.
> I am still a bit fuzzy on the details (optimizations are not my area of 
> expertise). Is this an accurate what to rephrase it?
> 
> "The initiation interval is the number of cycles between two iterations of an 
> unoptimized loop. These pragmas cause a new. optimized loop to be created 
> such that a single iteration of the loop executes in the same number of 
> cycles as the initiation interval."
I will take it with small modifications.



Comment at: include/clang/Basic/AttrDocs.td:2676
+  the software pipeliner to try the specified initiation interval.
+  If schedule found the resulting loop iteration would have specified
+  cycle count. The positive number greater than zero can be specified:

aaron.ballman wrote:
> alexey.lapshin wrote:
> > aaron.ballman wrote:
> > > I'm can't quite parse this sentence either. Is it trying to say that the 
> > > scheduler will try to find a loop iteration cycle count that most-closely 
> > > matches the specified initiation interval?
> > I wanted to say that pipeliner will try to find a schedule that exactly 
> > matches the specified initiation interval. And if schedule would be found 
> > then single iteration of the optimized loop would have exactly the same 
> > amount of cycles as initiation interval. trying to rephrase :
> > 
> > If schedule would be found then single iteration of the optimized loop 
> > would have exactly the same amount of cycles as initiation interval has. 
> Ah, okay! So what happens if an exact match cannot be found? Does it behave 
> as though the pragma was not specified at all (as in, the pragma is ignored)?
yes. if schedule was not found then the loop would stay unchanged.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2018-12-24 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 179459.
alexey.lapshin added a comment.

please consider changes in AttrDocs.td


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword(enable)
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop vectorize(enable) badkeyword(4)
Index: test/CodeGenCXX/pragma-pipeline.cpp
===

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2019-01-02 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked an inline comment as done.
alexey.lapshin added inline comments.



Comment at: include/clang/Basic/AttrDocs.td:2582
 specified for the subsequent loop. The directive allows vectorization,
-interleaving, and unrolling to be enabled or disabled. Vector width as well
-as interleave and unrolling count can be manually specified. See
-`language extensions
+interleaving, and unrolling to be enabled or disabled. Pipelining could be 
disabled.
+Vector width, interleave count, unrolling count, and the initiation interval 
for pipelining

aaron.ballman wrote:
> Can you combine the new sentence with the previous one? `The directive allows 
> vectorization, interleaving, pipelining, and unrolling to be enabled or 
> disabled.`
In that case it would change a meaning and become incorrect. Pipelining could 
only be disabled. It could not be enabled.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2019-01-02 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 179843.
alexey.lapshin added a comment.

Thank you. addressed all grammar things. please check it once more.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp

Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,47 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' or 'disable'}} */ #pragma clang loop distribute()
 
-/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop
+/* expected-error {{missing option; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval or distribute}} */ #pragma clang loop
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop badkeyword(enable)
 /* expected-error {{invalid option 'badkeyword'}} */ #pragma clang loop vectorize(enable) badkeyword(4)
Index: test/CodeGenCXX/pragma-pipeline.cpp
==

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2019-01-02 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin updated this revision to Diff 179933.
alexey.lapshin added a comment.

Thank you. I updated doc, splitted Parser test, added Sema tests. Please check 
it once more.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710

Files:
  include/clang/Basic/Attr.td
  include/clang/Basic/AttrDocs.td
  include/clang/Basic/DiagnosticParseKinds.td
  lib/CodeGen/CGLoopInfo.cpp
  lib/CodeGen/CGLoopInfo.h
  lib/Parse/ParsePragma.cpp
  lib/Sema/SemaStmtAttr.cpp
  test/CodeGenCXX/pragma-pipeline.cpp
  test/Parser/pragma-loop.cpp
  test/Parser/pragma-pipeline.cpp
  test/Parser/pragma-unroll-and-jam.cpp
  test/Sema/pragma-pipeline.cpp

Index: test/Sema/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Sema/pragma-pipeline.cpp
@@ -0,0 +1,34 @@
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+
+#pragma clang loop pipeline(disable) /* expected-error {{expected unqualified-id}} */
+int main() {
+  for (int i = 0; i < 10; ++i)
+;
+}
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+/* expected-error {{invalid argument of type 'double'; expected an integer type}} */ #pragma clang loop pipeline_initiation_interval(1.0)
+/* expected-error {{invalid value '0'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(0)
+/* expected-error {{invalid value '-1'; must be positive}} */ #pragma clang loop pipeline_initiation_interval(-1)
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+#pragma clang loop pipeline(disable) 
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int j = Length;
+#pragma clang loop pipeline_initiation_interval(4)
+/* expected-error {{expected a for, while, or do-while loop to follow '#pragma clang loop'}} */ int k = Length;
+
+#pragma clang loop pipeline(disable)
+#pragma clang loop pipeline_initiation_interval(4) /* expected-error {{incompatible directives 'pipeline(disable)' and 'pipeline_initiation_interval(4)'}} */
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline(disable)
+/* expected-error {{expected statement}} */ }
+
Index: test/Parser/pragma-unroll-and-jam.cpp
===
--- test/Parser/pragma-unroll-and-jam.cpp
+++ test/Parser/pragma-unroll-and-jam.cpp
@@ -67,7 +67,7 @@
   }
 
 // pragma clang unroll_and_jam is disabled for the moment
-/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, or distribute}} */ #pragma clang loop unroll_and_jam(4)
+/* expected-error {{invalid option 'unroll_and_jam'; expected vectorize, vectorize_width, interleave, interleave_count, unroll, unroll_count, pipeline, pipeline_initiation_interval, or distribute}} */ #pragma clang loop unroll_and_jam(4)
   for (int i = 0; i < Length; i++) {
 for (int j = 0; j < Length; j++) {
   List[i * Length + j] = Value;
Index: test/Parser/pragma-pipeline.cpp
===
--- /dev/null
+++ test/Parser/pragma-pipeline.cpp
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -std=c++11 -verify %s
+
+// Note that this puts the expected lines before the directives to work around
+// limitations in the -verify mode.
+
+void test(int *List, int Length, int Value) {
+  int i = 0;
+
+#pragma clang loop pipeline(disable)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+#pragma clang loop pipeline_initiation_interval(10)
+  for (int i = 0; i < Length; i++) {
+List[i] = Value;
+  }
+
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline(disable
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(enable)
+/* expected-error {{invalid argument; expected 'disable'}} */ #pragma clang loop pipeline(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline disable
+/* expected-error {{missing argument; expected an integer value}} */ #pragma clang loop pipeline_initiation_interval()
+/* expected-error {{use of undeclared identifier 'error'}} */ #pragma clang loop pipeline_initiation_interval(error)
+/* expected-error {{expected '('}} */ #pragma clang loop pipeline_initiation_interval 1 2
+/* expected-error {{expected ')'}} */ #pragma clang loop pipeline_initiation_interval(1
+  for (int i = 0; i < Length; i++) {
+for (int j = 0; j < Length; j++) {
+  List[i * Length + j] = Value;
+}
+  }
+
+}
Index: test/Parser/pragma-loop.cpp
===
--- test/Parser/pragma-loop.cpp
+++ test/Parser/pragma-loop.cpp
@@ -147,7 +147,7 @@
 /* expected-error {{missing argument; expected 'enable', 'full' or 'disable'}} */ #pragma clang loop unroll()
 /* expected-error {{missing argument; expected 'enable' o

[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2019-01-03 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin marked an inline comment as done.
alexey.lapshin added inline comments.



Comment at: test/Sema/pragma-pipeline.cpp:3
+
+#pragma clang loop pipeline(disable) /* expected-error {{expected 
unqualified-id}} */
+int main() {

aaron.ballman wrote:
> I think this error is pretty unfortunate -- it doesn't really help the user 
> to understand what's wrong with their code. Can it be improved?
Agreed, this does not look very informative. It surely can be improved. Though 
it looks like not related to my fix. My fix is to add two additional loop 
hints. That kind of diagnostic related to all loop hints "clang loop". I.e. all 
pragmas "clang loop" can be checked for various cases and for better 
diagnostic. It looks like separate task. But OK, I will check that case.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D55710: add pragmas to control Software Pipelining optimisation

2019-01-03 Thread Alexey Lapshin via Phabricator via cfe-commits
alexey.lapshin added a comment.

Thank you!  Aaron, Could you integrate this patch, please? I do not have commit 
access yet.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55710/new/

https://reviews.llvm.org/D55710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-02 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 275123.
avl added a comment.

rebased.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -185,11 +185,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +270,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +308,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call candidate: " << *CI << "\n");
   CI->setTailCall();
   Modified = true;
-} else {
-  AllCallsAreTailCalls = false;
 }
   }
 
@@ -326,6 +319,14 @@
 /// instructions between the call and this instruction are movable.
 ///
 static bool canMoveAboveCall(Instruction *I, CallInst *CI, AliasAnalysis *AA) {
+  if (isa(I))
+return true;
+
+  if (const IntrinsicInst *II = dyn_cast(I))
+if (II->getIntrinsicID() == Intrinsic::lifetime_end ||
+II->getIntrinsic

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-02 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

@efriedma What do you think about current state of this patch? Is it OK?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 3 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function &F) {
   // Because of PR962, we don't TRE dynamic allocas.

efriedma wrote:
> If we're not going to try to do TRE at all on calls not marked "tail", we can 
> probably drop this check.
It looks to me that original idea(PR962) was to avoid inefficient code which is 
generated for dynamic alloca. 

Currently there would still be generated inefficient code:

Doing TRE for dynamic alloca requires correct stack adjustment to avoid extra 
stack usage. 
i.e. dynamic stack reservation done for alloca should be restored 
in the end of the current iteration. Current TRE implementation does not do 
this.

Please, consider the test case:


```
#include 

int count;
__attribute__((noinline)) void globalIncrement(const int* param) { 
count += *param; 
}

void test(int recurseCount)
{
if (recurseCount == 0) return;
{
int *temp = (int*)alloca(100);
globalIncrement(temp);
}
test(recurseCount - 1);
}


```
Following is the x86 asm generated for the above test case in assumption that 
dynamic allocas are possible:

```

.LBB1_2:
movq%rsp, %rdi
addq$-112, %rdi   << dynamic stack reservation, need to 
be restored before "jne .LBB1_2"
movq%rdi, %rsp
callq   _Z15globalIncrementPKi
addl$-1, %ebx
jne .LBB1_2
```

So, it looks like we still have inefficient code here and it was a reason for 
avoiding TRE.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:808
   // Until this is resolved, disable this transformation if that would ever
   // happen.  This bug is PR962.
   for (Function::iterator BBI = F.begin(), E = F.end(); BBI != E; /*in loop*/) 
{

efriedma wrote:
> Can you move this FIXME into a more appropriate spot?
OK.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > What is the new handling for lifetime.end/assume doing?
> > They are just skipped. In following test case:
> > 
> > 
> > ```
> >   call void @_Z5test5i(i32 %sub)
> >   call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
> >   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
> >   br label %return
> > 
> > ```
> > 
> > they are generated in between call and ret. It is safe to ignore them while 
> > checking whether transformation is possible.
> It makes sense we can ignore lifetime.end on an alloca: we know the call 
> doesn't refer to the alloca.  (Maybe we should check that the pointer 
> argument is pointing at an alloca?  That should usually be true anyway, but 
> better to be on the safe side, I guess.)
> 
> I don't think it's safe to hoist assume without additional checks; I think 
> we'd need to check that the call is marked "willreturn"?
> 
> Since this is sort of tricky, I'd prefer to split this off into a followup.
>It makes sense we can ignore lifetime.end on an alloca: we know the call 
>doesn't refer to the alloca. (Maybe we should check that the pointer argument 
>is pointing at an alloca? That should usually be true anyway, but better to be 
>on the safe side, I guess.)

OK, I would add checking that the pointer argument of lifetime.end is pointing 
to an alloca.

>I don't think it's safe to hoist assume without additional checks; I think 
>we'd need to check that the call is marked "willreturn"?

>Since this is sort of tricky, I'd prefer to split this off into a followup.

Ok, I would split Intrinsic::assume into another review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 276591.
avl added a comment.

addressed comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -81,6 +81,7 @@
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Scalar.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include "llvm/Transforms/Utils/Local.h"
 using namespace llvm;
 
 #define DEBUG_TYPE "tailcallelim"
@@ -92,7 +93,10 @@
 /// Scan the specified function for alloca instructions.
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function &F) {
-  // Because of PR962, we don't TRE dynamic allocas.
+  // TODO: We don't do TRE if dynamic allocas are used.
+  // Dynamic allocas allocate stack space which should be
+  // deallocated before new iteration started. That is
+  // currently not implemented.
   return llvm::all_of(instructions(F), [](Instruction &I) {
 auto *AI = dyn_cast(&I);
 return !AI || AI->isStaticAlloca();
@@ -185,11 +189,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +274,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:474
+  // Operand Bundles or not marked as TailCall.
+  if (CI->isNoTailCall() || CI->hasOperandBundles() || !CI->isTailCall())
 return nullptr;

efriedma wrote:
> The hasOperandBundles() check looks completely new; is there some test for it?
> 
> The `isNoTailCall()` check is currently redundant; it isn't legal to write 
> "tail notail".  I guess it makes sense to guard against that, though.
>The hasOperandBundles() check looks completely new; is there some test for it?

it is not new. it is copied from 245 line. Now, when patch changed from its 
original state all above conditions could be changed just to :

if (!CI->isTailCall())

the test is Transforms/TailCallElim/deopt-bundle.ll

>The isNoTailCall() check is currently redundant; it isn't legal to write "tail 
>notail". I guess it makes sense to guard against that, though.

would add checking for that.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-09 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 276799.
avl added a comment.

addressed comments:

added test for multiple recursive calls, 
removed duplicated check for operand bundles, 
simplified and commented tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,74 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+declare void @_Z15globalIncrementPKi(i32* nocapture readonly %param) #0
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack.
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+; CHECK-LABEL: @_Z4testi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TEMP:%.*]] = alloca i32, align 4
+; CHECK-NEXT:br label [[TAILRECURSE:%.*]]
+; CHECK:   tailrecurse:
+; CHECK-NEXT:[[RECURSECOUNT_TR:%.*]] = phi i32 [ [[RECURSECOUNT:%.*]], [[ENTRY:%.*]] ], [ [[SUB:%.*]], [[IF_END:%.*]] ]
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[RECURSECOUNT_TR]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[RETURN:%.*]], label [[IF_END]]
+; CHECK:   if.end:
+; CHECK-NEXT:[[TMP0:%.*]] = bitcast i32* [[TEMP]] to i8*
+; CHECK-NEXT:call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:store i32 10, i32* [[TEMP]], align 4
+; CHECK-NEXT:call void @_Z15globalIncrementPKi(i32* nonnull [[TEMP]])
+; CHECK-NEXT:[[SUB]] = add nsw i32 [[RECURSECOUNT_TR]], -1
+; CHECK-NEXT:call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:br label [[TAILRECURSE]]
+; CHECK:   return:
+; CHECK-NEXT:ret void
+;
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
@@ -0,0 +1,125 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; This test checks that TRE would be done for only one recursive call.
+; The test_multiple_exits function has three recursive calls.
+; First recursive call could not be eliminated because there is
+; escaped pointer to local variable. Second recursive call could
+; be eliminated. Thrid recursive call could not be eliminated since
+; this is not last call. Thus, test checks that TRE would be done
+; for only second recursive call.
+
+; IR for that test was generated from the following C++ source:
+;
+; void capture_arg (int*);
+; void test_multiple_exits (int param);
+;   if (param >= 0 && param < 10) {
+; int temp;
+; capture_arg(&temp);
+; // TRE could not be done because pointer to local
+; // variable "temp" is escaped.
+; test_multiple_exits(param + 1);
+;   } else if (param >=10 && param < 20) {
+; // TRE should be done.
+; t

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-09 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

Thank you, for the review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-11 Thread Alexey Lapshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf7907e9d223d: [TRE] allow TRE for non-capturing calls. 
(authored by avl).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,74 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+declare void @_Z15globalIncrementPKi(i32* nocapture readonly %param) #0
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack.
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+; CHECK-LABEL: @_Z4testi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TEMP:%.*]] = alloca i32, align 4
+; CHECK-NEXT:br label [[TAILRECURSE:%.*]]
+; CHECK:   tailrecurse:
+; CHECK-NEXT:[[RECURSECOUNT_TR:%.*]] = phi i32 [ [[RECURSECOUNT:%.*]], [[ENTRY:%.*]] ], [ [[SUB:%.*]], [[IF_END:%.*]] ]
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[RECURSECOUNT_TR]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[RETURN:%.*]], label [[IF_END]]
+; CHECK:   if.end:
+; CHECK-NEXT:[[TMP0:%.*]] = bitcast i32* [[TEMP]] to i8*
+; CHECK-NEXT:call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:store i32 10, i32* [[TEMP]], align 4
+; CHECK-NEXT:call void @_Z15globalIncrementPKi(i32* nonnull [[TEMP]])
+; CHECK-NEXT:[[SUB]] = add nsw i32 [[RECURSECOUNT_TR]], -1
+; CHECK-NEXT:call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:br label [[TAILRECURSE]]
+; CHECK:   return:
+; CHECK-NEXT:ret void
+;
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
@@ -0,0 +1,125 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; This test checks that TRE would be done for only one recursive call.
+; The test_multiple_exits function has three recursive calls.
+; First recursive call could not be eliminated because there is
+; escaped pointer to local variable. Second recursive call could
+; be eliminated. Thrid recursive call could not be eliminated since
+; this is not last call. Thus, test checks that TRE would be done
+; for only second recursive call.
+
+; IR for that test was generated from the following C++ source:
+;
+; void capture_arg (int*);
+; void test_multiple_exits (int param);
+;   if (param >= 0 && param < 10) {
+; int temp;
+; capture_arg(&temp);
+; // TRE could not be done because pointer to local
+; // variable "temp" is escaped.
+; test_multiple_exits(param + 1);
+;   } else if (param >=10 && param < 20) {
+; // TRE should be done.
+; test_multiple_exits(param + 1);
+;   } e

[PATCH] D82085: [TRE] markTails marks call sites as tailcalls though some of them are not.

2020-06-18 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

> It's OK to set "tail" on any call that satisfies these requirements (from 
> https://llvm.org/docs/LangRef.html#call-instruction): "Both markers [tail and 
> musttail] imply that the callee does not access allocas from the caller. The 
> tail marker additionally implies that the callee does not access varargs from 
> the caller."



> "tail" does not mean that the call *must* be generated as a tail call. It 
> just means that it's safe to generate it as a tail call if it turns out to be 
> possible (e.g. if the compiler can prove that @noarg doesn't return, or if it 
> can prove that all the code after the call to @noarg has no effect, or so on).

Yes, that is understood: ""tail" does not mean that the call *must* be 
generated as a tail call".
I intended to make picture consistent: when markTails sees function body, it is 
evident that the first call is not a tail call. I agree that a compiler "can 
prove that @noarg doesn't return, or if it can prove that all the code after 
the call to @noarg has no effect" and this first call could become tail-call. 
Probably the suitable strategy, in this case, would be to recalculate marking 
when it is broken(instead of creating false positive marks). There are other 
cases(compiler could add function calls), which could break existing marking 
and then would be necessary to recalculate marking.

Having many false positive "tail" marks could create a confusing picture. 
I would describe the full problem which I am trying to solve.
(I assumed this patch would be the first one for that problem):

cat test.cpp

  int count;
  __attribute__((noinline)) void globalIncrement(const int* param) { count += 
*param; }
  
  void test(int recurseCount)
  {
  if (recurseCount == 0) return;
  {
int temp = 10;
globalIncrement(&temp);
  }
  test(recurseCount - 1);
  }

TRE is not done for that test case. There are two reasons for that: 
First is that AllocaDerivedValueTracker does not use the PointerMayBeCaptured 
interface, and it does not see that &temp is not escaped.
Second is that AllCallsAreTailCalls is used as a pre-requisite for making TRE.
So it requires that all calls would be marked as "tail". This looks too 
restricted.
Instead, it should check that "&temp" is not escaped in globalIncrement() and 
that "test" 
is a tail recursive call not using allocas. I think the confusion happened 
exactly because "tail" marking was done for all calls(not for the real 
tailcalls).

Thus I planned to do the following fixes:

1. cleanup "tail" marking.(this patch)
2. do not use "AllCallsAreTailCalls" as a pre-requisite for TRE.(this patch).
3. use PointerMayBeCaptured inside AllocaDerivedValueTracker.

What do you think about all of this?

It seems to me that it would be good to have consistent "tail" marking. 
But if it does not look like a good idea, I could continue to point 2. and 3.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] markTails marks call sites as tailcalls though some of them are not.

2020-06-18 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

> It makes sense to teach tail recursion elimination not to depend so heavily 
> on tail markings. But I don't think that implies we want to mess with the 
> markings themselves.

Ok. Thus I need to drop part of this patch related to more strict tail marking, 
and continue with part which changes pre-requisite for TRE.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 272378.
avl added a comment.

1. deleted code doing more strict tailcall marking.
2. left removal of "AllCallsAreTailCalls".
3. added check for non-capturing calls while tracking alloca.
4. re-titled the patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,356 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+;struct Counter
+;{
+;int count = 0;
+;
+;__attribute__((noinline)) void increment(const int param) { count += param;
+;}
+;virtual __attribute__((noinline)) void increment(const int* param) { count += *param; }
+;
+;__attribute__((noinline)) int getCount () const { return count; }
+;};
+;
+;void test2(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;Counter counter;
+;counter.increment(0);
+;test2(recurseCount - 1);
+;}
+;
+;void test3(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;Counter counter;
+;counter.increment(counter.getCount());
+;test3(recurseCount - 1);
+;}
+;
+;void test4(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;Counter counter;
+;counter.increment(&temp);
+;test4(recurseCount - 1);
+;}
+;
+;struct Counter2 : public Counter
+;{
+;virtual __attribute__((noinline)) void increment(const int* param) {
+;ptr = param;
+;count += *param;
+;}
+;
+;const int* ptr;
+;};
+;
+;void test5(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;Counter2 counter;
+;counter.increment(&temp);
+;test5(recurseCount - 1);
+;}
+;
+
+%struct.Counter = type <{ i32 (...)**, i32, [4 x i8] }>
+%struct.Counter2 = type { %struct.Counter.base, i32* }
+%struct.Counter.base = type <{ i32 (...)**, i32 }>
+
+$_ZN7Counter9incrementEi = comdat any
+
+$_ZNK7Counter8getCountEv = comdat any
+
+$_ZN7Counter9incrementEPKi = comdat any
+
+$_ZN8Counter29incrementEPKi = comdat any
+
+$_ZTV7Counter = comdat any
+
+$_ZTS7Counter = comdat any
+
+$_ZTI7Counter = comdat any
+
+$_ZTV8Counter2 = comdat any
+
+$_ZTS8Counter2 = comdat any
+
+$_ZTI8Counter2 = comdat any
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+@_ZTV7Counter = linkonce_odr dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI7Counter to i8*), i8* bitcast (void (%struct.Counter*, i32*)* @_ZN7Counter9incrementEPKi to i8*)] }, comdat, align 8
+@_ZTVN10__cxxabiv117__class_type_infoE = external dso_local global i8*
+@_ZTS7Counter = linkonce_odr dso_local constant [9 x i8] c"7Counter\00", comdat, align 1
+@_ZTI7Counter = linkonce_odr dso_local constant { i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([9 x i8], [9 x i8]* @_ZTS7Counter, i32 0, i32 0) }, comdat, align 8
+@_ZTV8Counter2 = linkonce_odr dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8*, i8* }* @_ZTI8Counter2 to i8*), i8* bitcast (void (%struct.Counter2*, i32*)* @_ZN8Counter29incrementEPKi to i8*)] }, comdat, align 8
+@_ZTVN10__cxxabiv120__si_class_type_infoE = external dso_local global i8*
+@_ZTS8Counter2 = linkonce_odr dso_local constant [10 x i8] c"8Counter2\00", comdat, align 1
+@_ZTI8Counter2 = linkonce_odr dso_local constant { i8*, i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([10 x i8], [10 x i8]* @_ZTS8Counter2, i32 0, i32 0), i8* bitcast ({ i8*, i8* }* @_ZTI7Counter to i8*) }, comdat, align 8
+
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+;

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 4 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:130
+IsNocapture = true;
+  else if (Function *CalledFunction = CB.getCalledFunction()) {
+if (CalledFunction->getBasicBlockList().size() > 0 &&

efriedma wrote:
> Please don't add code to examine the callee; if we're not deducing nocapture 
> appropriately, we should fix that elsewhere.
Ok. 



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

efriedma wrote:
> What is the new handling for lifetime.end/assume doing?
They are just skipped. In following test case:


```
  call void @_Z5test5i(i32 %sub)
  call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
  br label %return

```

they are generated in between call and ret. It is safe to ignore them while 
checking whether transformation is possible.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not enough 
> that the call you're trying to TRE is marked "tail"?
>Do you have to redo the AllocaDerivedValueTracker analysis?

AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
But marking, done in markTails(), looks like separate tasks. i.e. it is better 
to make TRE not depending on markTails(). There is a review for this - 
https://reviews.llvm.org/D60031
Thus such separation looks useful(To not reuse result of markTails but have it 
computed inplace).

> Is it not enough that the call you're trying to TRE is marked "tail"?

It is not enough that call which is subject to TRE is marked "Tail".
It also should be checked that other calls does not capture pointer to local 
stack: 

```
// do not do TRE if any pointer to local stack has escaped.
if (!Tracker.EscapePoints.empty())
   return false;

```




Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

efriedma wrote:
> I thought we had some tests where we TRE in the presence of recursive calls, 
> like a simple recursive fibonacci.  Am I misunderstanding this?
right, there is a testcase for fibonacchi:

llvm/test/Transforms/TailCallElim/accum_recursion.ll:@test3_fib

areAllLastFuncCallsRecursive() checking works well for fibonacci testcase:


```
return fib(x-1)+fib(x-2);

```

Since, Last funcs call chain is : fib()->fib()->ret. 
That check should prevent from such cases:


```
return fib(x-1)+another_call()+fib(x-2);
```




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 3 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:823
+
+bool TailRecursionEliminator::canTRE(Function &F) {
+  // The local stack holds all alloca instructions and all byval arguments.

laytonio wrote:
> There is no need to pass the function here since its a member variable.
Ok.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not 
> > > enough that the call you're trying to TRE is marked "tail"?
> > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > 
> > AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
> > But marking, done in markTails(), looks like separate tasks. i.e. it is 
> > better 
> > to make TRE not depending on markTails(). There is a review for this - 
> > https://reviews.llvm.org/D60031
> > Thus such separation looks useful(To not reuse result of markTails but have 
> > it computed inplace).
> > 
> > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > 
> > It is not enough that call which is subject to TRE is marked "Tail".
> > It also should be checked that other calls does not capture pointer to 
> > local stack: 
> > 
> > ```
> > // do not do TRE if any pointer to local stack has escaped.
> > if (!Tracker.EscapePoints.empty())
> >return false;
> > 
> > ```
> > 
> > It is not enough that call which is subject to TRE is marked "Tail". It 
> > also should be checked that other calls does not capture pointer to local 
> > stack:
> 
> If there's an escaped pointer to the local stack, we wouldn't infer "tail" in 
> the first place, would we?
If function receives pointer to alloca then it would not be marked with "Tail". 
Then we do not have a possibility to understand whether this function receives 
pointer to alloca but does not capture it:

```
void test(int recurseCount)
{
if (recurseCount == 0) return;
int temp = 10;
globalIncrement(&temp);
test(recurseCount - 1);
}
```

test - marked with Tail.
globalIncrement - not marked with Tail. But TRE could be done since it does not 
capture pointer. But if it will capture the pointer then we could not do TRE. 
So we need to check !Tracker.EscapePoints.empty().






Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > I thought we had some tests where we TRE in the presence of recursive 
> > > calls, like a simple recursive fibonacci.  Am I misunderstanding this?
> > right, there is a testcase for fibonacchi:
> > 
> > llvm/test/Transforms/TailCallElim/accum_recursion.ll:@test3_fib
> > 
> > areAllLastFuncCallsRecursive() checking works well for fibonacci testcase:
> > 
> > 
> > ```
> > return fib(x-1)+fib(x-2);
> > 
> > ```
> > 
> > Since, Last funcs call chain is : fib()->fib()->ret. 
> > That check should prevent from such cases:
> > 
> > 
> > ```
> > return fib(x-1)+another_call()+fib(x-2);
> > ```
> > 
> > 
> > That check should prevent from such cases: return 
> > fib(x-1)+another_call()+fib(x-2);
> 
> Why do we need to prevent this?
We do not. I misunderstood the canTransformAccumulatorRecursion(). That check 
could be removed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it 
> > > > > not enough that the call you're trying to TRE is marked "tail"?
> > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > 
> > > > AllocaDerivedValueTracker analysis(done in markTails) could be reused 
> > > > here. 
> > > > But marking, done in markTails(), looks like separate tasks. i.e. it is 
> > > > better 
> > > > to make TRE not depending on markTails(). There is a review for this - 
> > > > https://reviews.llvm.org/D60031
> > > > Thus such separation looks useful(To not reuse result of markTails but 
> > > > have it computed inplace).
> > > > 
> > > > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > > > 
> > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > It also should be checked that other calls does not capture pointer to 
> > > > local stack: 
> > > > 
> > > > ```
> > > > // do not do TRE if any pointer to local stack has escaped.
> > > > if (!Tracker.EscapePoints.empty())
> > > >return false;
> > > > 
> > > > ```
> > > > 
> > > > It is not enough that call which is subject to TRE is marked "Tail". It 
> > > > also should be checked that other calls does not capture pointer to 
> > > > local stack:
> > > 
> > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > "tail" in the first place, would we?
> > If function receives pointer to alloca then it would not be marked with 
> > "Tail". Then we do not have a possibility to understand whether this 
> > function receives pointer to alloca but does not capture it:
> > 
> > ```
> > void test(int recurseCount)
> > {
> > if (recurseCount == 0) return;
> > int temp = 10;
> > globalIncrement(&temp);
> > test(recurseCount - 1);
> > }
> > ```
> > 
> > test - marked with Tail.
> > globalIncrement - not marked with Tail. But TRE could be done since it does 
> > not capture pointer. But if it will capture the pointer then we could not 
> > do TRE. So we need to check !Tracker.EscapePoints.empty().
> > 
> > 
> > 
> > test - marked with Tail.
> 
> For the given code, TRE won't mark the recursive call "tail".  That transform 
> isn't legal: the recursive call could access the caller's version of "temp".
>For the given code, TRE won't mark the recursive call "tail". That transform 
>isn't legal: the recursive call could access the caller's version of "temp".

it looks like recursive call could NOT access the caller's version of "temp":

```
test(recurseCount - 1);
```

Caller`s version of temp is accessed by non-recursive call:

```
globalIncrement(&temp);
```

If globalIncrement does not capture the "&temp" then TRE looks to be legal for 
that case. 

globalIncrement() would not be marked with "Tail". test() would be marked with 
Tail.

Thus the pre-requisite for TRE would be: tail-recursive call must not receive 
pointer to local stack(Tail) and non-recursive calls must not capture the 
pointer to local stack.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is 
> > > > > > > it not enough that the call you're trying to TRE is marked "tail"?
> > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > 
> > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > reused here. 
> > > > > > But marking, done in markTails(), looks like separate tasks. i.e. 
> > > > > > it is better 
> > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > this - https://reviews.llvm.org/D60031
> > > > > > Thus such separation looks useful(To not reuse result of markTails 
> > > > > > but have it computed inplace).
> > > > > > 
> > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > "tail"?
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > > > It also should be checked that other calls does not capture pointer 
> > > > > > to local stack: 
> > > > > > 
> > > > > > ```
> > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > if (!Tracker.EscapePoints.empty())
> > > > > >return false;
> > > > > > 
> > > > > > ```
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > "Tail". It also should be checked that other calls does not capture 
> > > > > > pointer to local stack:
> > > > > 
> > > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > > "tail" in the first place, would we?
> > > > If function receives pointer to alloca then it would not be marked with 
> > > > "Tail". Then we do not have a possibility to understand whether this 
> > > > function receives pointer to alloca but does not capture it:
> > > > 
> > > > ```
> > > > void test(int recurseCount)
> > > > {
> > > > if (recurseCount == 0) return;
> > > > int temp = 10;
> > > > globalIncrement(&temp);
> > > > test(recurseCount - 1);
> > > > }
> > > > ```
> > > > 
> > > > test - marked with Tail.
> > > > globalIncrement - not marked with Tail. But TRE could be done since it 
> > > > does not capture pointer. But if it will capture the pointer then we 
> > > > could not do TRE. So we need to check !Tracker.EscapePoints.empty().
> > > > 
> > > > 
> > > > 
> > > > test - marked with Tail.
> > > 
> > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > transform isn't legal: the recursive call could access the caller's 
> > > version of "temp".
> > >For the given code, TRE won't mark the recursive call "tail". That 
> > >transform isn't legal: the recursive call could access the caller's 
> > >version of "temp".
> > 
> > it looks like recursive call could NOT access the caller's version of 
> > "temp":
> > 
> > ```
> > test(recurseCount - 1);
> > ```
> > 
> > Caller`s version of temp is accessed by non-recursive call:
> > 
> > ```
> > globalIncrement(&temp);
> > ```
> > 
> > If globalIncrement does not capture the "&temp" then TRE looks to be legal 
> > for that case. 
> > 
> > globalIncrement() would not be marked with "Tail". test() would be marked 
> > with Tail.
> > 
> > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > receive pointer to local stack(Tail) and non-recursive calls must not 
> > capture the pointer to local stack.
> Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> Can you give a complete IR example, we we don't infer "tail", but we still do 
> the TRE transform here?
>Can you give a complete IR example where we infer "tail", but TRE is illegal?

there is no such example. Currently all cases where we  infer "tail" would be 
valid for TRE.

>Can you give a complete IR example, we we don't infer "tail", but we still do 
>the TRE transform here?

For the following example current code base would not infer "tail" for 
_Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
This patch changes this behavior: so that if _Z15globalIncrementPKi is not 
marked with "tail" and does not capture its pointer argument - TRE would be 
allowed for _Z4testi.


```
@count = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: nofree noinline norecurse nounwind uwtable
define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) 
local_unnamed_addr #0 {
entry:
  %0 = load i32, i32* %param, align 4
  %1 = load i32, i32* @count, align 4
  %add = add nsw i32 %1, %0
  store i32 %add, i32* @count,

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > 
> > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > > > reused here. 
> > > > > > > > But marking, done in markTails(), looks like separate tasks. 
> > > > > > > > i.e. it is better 
> > > > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > > > this - https://reviews.llvm.org/D60031
> > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > markTails but have it computed inplace).
> > > > > > > > 
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail".
> > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > pointer to local stack: 
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > >return false;
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > > capture pointer to local stack:
> > > > > > > 
> > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > infer "tail" in the first place, would we?
> > > > > > If function receives pointer to alloca then it would not be marked 
> > > > > > with "Tail". Then we do not have a possibility to understand 
> > > > > > whether this function receives pointer to alloca but does not 
> > > > > > capture it:
> > > > > > 
> > > > > > ```
> > > > > > void test(int recurseCount)
> > > > > > {
> > > > > > if (recurseCount == 0) return;
> > > > > > int temp = 10;
> > > > > > globalIncrement(&temp);
> > > > > > test(recurseCount - 1);
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > > globalIncrement - not marked with Tail. But TRE could be done since 
> > > > > > it does not capture pointer. But if it will capture the pointer 
> > > > > > then we could not do TRE. So we need to check 
> > > > > > !Tracker.EscapePoints.empty().
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > 
> > > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > > transform isn't legal: the recursive call could access the caller's 
> > > > > version of "temp".
> > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > >transform isn't legal: the recursive call could access the caller's 
> > > > >version of "temp".
> > > > 
> > > > it looks like recursive call could NOT access the caller's version of 
> > > > "temp":
> > > > 
> > > > ```
> > > > test(recurseCount - 1);
> > > > ```
> > > > 
> > > > Caller`s version of temp is accessed by non-recursive call:
> > > > 
> > > > ```
> > > > globalIncrement(&temp);
> > > > ```
> > > > 
> > > > If globalIncrement does not capture the "&temp" then TRE looks to be 
> > > > legal for that case. 
> > > > 
> > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > marked with Tail.
> > > > 
> > > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > > capture the pointer to local stack.
> > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > illegal?
> > > 
> > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > still do the TRE transform here?
> > >Can you give a complete IR example where we infer "tail", but TRE is 
> > >illegal?
> > 
> > there is no such example. Currently all cases where we  infer "tail" would 
> > be valid for TRE.
> > 
> > >Can you give a complete IR example, we we don't infer "tail", but we still 
> > >do the TRE transform here?
> > 
> > For the following example current code base would not infer "tail" for 
> > _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> > This patch chan

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 272829.
avl edited the summary of this revision.
avl added a comment.

addressed comments:

1. removed PointerMayBeCaptured() used for CalledFunction.
2. rewrote CanTRE() to visiting instructions only once.
3. replaced areAllLastFuncCallsRecursive() with isInTREPosition().

I did not address request for not using AllocaDerivedValueTracker yet.
Since there is open question on it. I would address it as soon as the question 
would be resolved.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function &F) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction &I) {
-auto *AI = dyn_cast(&I);
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +298,6 @@
   LLVM_DEBUG(dbgs(

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > > 
> > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > > be reused here. 
> > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > 
> > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > marked "tail"?
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail".
> > > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > > pointer to local stack: 
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > >return false;
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > 
> > > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > > infer "tail" in the first place, would we?
> > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > > does not capture it:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > void test(int recurseCount)
> > > > > > > > {
> > > > > > > > if (recurseCount == 0) return;
> > > > > > > > int temp = 10;
> > > > > > > > globalIncrement(&temp);
> > > > > > > > test(recurseCount - 1);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > 
> > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > caller's version of "temp".
> > > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > > >transform isn't legal: the recursive call could access the 
> > > > > > >caller's version of "temp".
> > > > > > 
> > > > > > it looks like recursive call could NOT access the caller's version 
> > > > > > of "temp":
> > > > > > 
> > > > > > ```
> > > > > > test(recurseCount - 1);
> > > > > > ```
> > > > > > 
> > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > 
> > > > > > ```
> > > > > > globalIncrement(&temp);
> > > > > > ```
> > > > > > 
> > > > > > If globalIncrement does not capture the "&temp" then TRE looks to 
> > > > > > be legal for that case. 
> > > > > > 
> > > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > > marked with Tail.
> > > > > > 
> > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > must not capture the pointer to local stack.
> > > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > > illegal?
> > > > > 
> > > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > > still do the TRE transform here?
> > > > >Can you give a complete IR example where

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > avl wrote:
> > > > > > > > > > > > efriedma wrote:
> > > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > > analysis?  Is it not enough that the call you're 
> > > > > > > > > > > > > trying to TRE is marked "tail"?
> > > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > >analysis?
> > > > > > > > > > > > 
> > > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) 
> > > > > > > > > > > > could be reused here. 
> > > > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > > > to make TRE not depending on markTails(). There is a 
> > > > > > > > > > > > review for this - https://reviews.llvm.org/D60031
> > > > > > > > > > > > Thus such separation looks useful(To not reuse result 
> > > > > > > > > > > > of markTails but have it computed inplace).
> > > > > > > > > > > > 
> > > > > > > > > > > > > Is it not enough that the call you're trying to TRE 
> > > > > > > > > > > > > is marked "tail"?
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail".
> > > > > > > > > > > > It also should be checked that other calls does not 
> > > > > > > > > > > > capture pointer to local stack: 
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > // do not do TRE if any pointer to local stack has 
> > > > > > > > > > > > escaped.
> > > > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > > > >return false;
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail". It also should be checked that other 
> > > > > > > > > > > > calls does not capture pointer to local stack:
> > > > > > > > > > > 
> > > > > > > > > > > If there's an escaped pointer to the local stack, we 
> > > > > > > > > > > wouldn't infer "tail" in the first place, would we?
> > > > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > > > understand whether this function receives pointer to alloca 
> > > > > > > > > > but does not capture it:
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > void test(int recurseCount)
> > > > > > > > > > {
> > > > > > > > > > if (recurseCount == 0) return;
> > > > > > > > > > int temp = 10;
> > > > > > > > > > globalIncrement(&temp);
> > > > > > > > > > test(recurseCount - 1);
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > > globalIncrement - not marked with Tail. But TRE could be 
> > > > > > > > > > done since it does not capture pointer. But if it will 
> > > > > > > > > > capture the pointer then we could not do TRE. So we need to 
> > > > > > > > > > check !Tracker.EscapePoints.empty().
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > 
> > > > > > > > > For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >  That transform isn't legal: the recursive call could access 
> > > > > > > > > the caller's version of "temp".
> > > > > > > > >For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >That transform isn't legal: the recursive call could access 
> > > > > > > > >the caller's version of "temp".
> > > > > > > > 
> > > > > > > > it looks like recursive call could NOT access the caller's 
> > > > > > > > version of "temp":
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > test(recurseCount - 1);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > globalIncrement(&temp);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > If globalIncrement does not capture the "&temp" then TRE looks 
> > > > > > > > to be legal for that case. 
> > > > > > > > 
> > > > > > > > globalIncrement() would not be marked with "Tail". test() would 
> > > > > > > > be marked with Tail.
> > > > > > > > 
> > > > > > > > Thus the pre

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273029.
avl added a comment.

removed usages of AllocaDerivedValueTracker from canTRE().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function &F) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction &I) {
-auto *AI = dyn_cast(&I);
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-Al

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:830
+  !CI->isTailCall())
+return false;
+}

laytonio wrote:
> Is this correct? I think we want to check these per TRE candidate in 
> findTRECandidate, not just disable TRE in general if one is found.
I tried to minimize changes and keep old logic here - but yes, it is better to 
move that check into findTRECandidate(). Will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273174.
avl added a comment.

check valid TRE candidate into findTRECandidate()().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function &F) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction &I) {
-auto *AI = dyn_cast(&I);
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCalls

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

laytonio wrote:
> Is there any reason to find and validate candidates now only to have to redo 
> it when we actually perform the eliminations? If so, is there any reason this 
> needs to follow a different code path than findTRECandidate? findTRECandidate 
> is doing the same checks, except for canMoveAboveCall and 
> canTransformAccumulatorRecursion, which should probably be refactored into 
> findTRECandidate from eliminateCall anyway. If not then all of this code goes 
> away and we're left with the same canTRE as in trunk.
We are enumerating all instructions here, so we could understand if there are 
not TRE candidates and stop earlier. That is the reason for doing it here.

I agree that findTRECandidate should be refactored to have the same checks as 
here. 

What do you think is better to do:

- leave early check for TRE candidates in canTRE or remove it
- refactor findTRECandidate or leave it as is

?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

laytonio wrote:
> avl wrote:
> > laytonio wrote:
> > > Is there any reason to find and validate candidates now only to have to 
> > > redo it when we actually perform the eliminations? If so, is there any 
> > > reason this needs to follow a different code path than findTRECandidate? 
> > > findTRECandidate is doing the same checks, except for canMoveAboveCall 
> > > and canTransformAccumulatorRecursion, which should probably be refactored 
> > > into findTRECandidate from eliminateCall anyway. If not then all of this 
> > > code goes away and we're left with the same canTRE as in trunk.
> > We are enumerating all instructions here, so we could understand if there 
> > are not TRE candidates and stop earlier. That is the reason for doing it 
> > here.
> > 
> > I agree that findTRECandidate should be refactored to have the same checks 
> > as here. 
> > 
> > What do you think is better to do:
> > 
> > - leave early check for TRE candidates in canTRE or remove it
> > - refactor findTRECandidate or leave it as is
> > 
> > ?
> Yes we are iterating all the instructions here but, unless I am missing 
> something, we would literally just be doing the checks twice for no reason. 
> Look at it this way, best case scenario we have to check all possible 
> candidates once, find none and we're done. Worst case, we check all possible 
> candidates once, find one and have to check all possible candidates a second 
> time. Where as if we remove the early checks we only ever have to check the 
> candidates once. So we wouldn't really be stopping any earlier.
> 
> As for refactoring findTRECandidate, I do think that should be done and we 
> should strive to move all the failure conditions out of eliminateCall in 
> order to avoid having to fold a return only to find out we didn't need to. 
> But, I think that is out of the scope of this change, and if we do decide to 
> keep the early checks here then we should say that findTRECandidate does a 
> good enough job to consider this function as having valid candidates.
>Yes we are iterating all the instructions here but, unless I am missing 
>something, we would literally just be doing the checks twice for no reason. 
>Look at it this way, best case scenario we have to check all possible 
>candidates once, find none and we're done. Worst case, we check all possible 
>candidates once, find one and have to check all possible candidates a second 
>time. Where as if we remove the early checks we only ever have to check the 
>candidates once. So we wouldn't really be stopping any earlier.

yes. we would do check twice if there are TRE candidates. my idea was that 
number of cases when TRE is applicable less then when TRE is not applicable. 
Thus we would do double instruction navigation more often than double check for 
candidates. But, I did not measure real impact. Thus, let`s return old logic 
here as you suggested.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273457.
avl added a comment.

removed early check for TRE candidates from canTRE().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement(&temp);
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -185,11 +185,9 @@
 };
 }
 
-static bool markTails(Function &F, bool &AllCallsAreTailCalls,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function &F, OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +270,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +308,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call candidate: " << *CI << "\n");
   CI->setTailCall();
   Modified = true;
-} else {
-  AllCallsAreTailCalls = false;
 }
   }
 
@@ -326,6 +319,14 @@
 /// instructions between the call and this instruction are movable.
 ///
 static bool canMoveAboveCall(Instruction *I, CallInst *CI, AliasAnalysis *AA) {
+  if (isa(I))
+return true;
+
+  if (const IntrinsicInst *II = dyn_cast(I))
+if (II->getIntrinsicID() == Intrins

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-29 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D91844: [llvm][clang] Add checks for the smart pointers with the possibility to be null

2020-11-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

The change for DWARFLinker is fine. I also think it is better to split the 
patch up into separate patches.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91844/new/

https://reviews.llvm.org/D91844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153741: [Tooling][Rewriter] Remove the redundant AtomicallyMovedFile Implementation.

2023-07-01 Thread Alexey Lapshin via Phabricator via cfe-commits
avl accepted this revision.
avl added a comment.
This revision is now accepted and ready to land.

LGTM. please pay attention to the clang format issues reported during pre-merge 
build.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153741/new/

https://reviews.llvm.org/D153741

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154329: [lldb] Replace llvm::writeFileAtomically with llvm::writeToOutput API.

2023-07-03 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added inline comments.



Comment at: lldb/tools/lldb-server/lldb-platform.cpp:112
 return Status("Failed to atomically write file %s",
   file_spec.GetPath().c_str());
   return status;

probably, it would be better to add error text here?

```
return Status("Failed to atomically write file %s: %s",
  file_spec.GetPath().c_str(), 
toString(std::move(Err)).c_str());
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154329/new/

https://reviews.llvm.org/D154329

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154329: [lldb] Replace llvm::writeFileAtomically with llvm::writeToOutput API.

2023-07-03 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

this LGTM. thanks! Please, wait for approve from Jonas. I think someone from 
lldb needs to check whether new error reporting is OK.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154329/new/

https://reviews.llvm.org/D154329

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153740: [llvm][Support] Deprecate llvm::writeFileAtomically API

2023-07-03 Thread Alexey Lapshin via Phabricator via cfe-commits
avl accepted this revision.
avl added a comment.
This revision is now accepted and ready to land.

this LGTM, assuming D154329  is landed. Thank 
you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153740/new/

https://reviews.llvm.org/D153740

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71304: [DWARF5][SplitDwarf] Set default state for -fsplit-dwarf-inlining to be false.

2019-12-11 Thread Alexey Lapshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG21bc8958668a: [DWARF5][SplitDwarf] Set default state for 
-fsplit-dwarf-inlining to be false. (authored by avl).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71304/new/

https://reviews.llvm.org/D71304

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/split-debug.c


Index: clang/test/Driver/split-debug.c
===
--- clang/test/Driver/split-debug.c
+++ clang/test/Driver/split-debug.c
@@ -61,6 +61,9 @@
 // CHECK-GMLT-WITH-SPLIT: "-split-dwarf-file"
 // CHECK-GMLT-WITH-SPLIT: "-split-dwarf-output"
 
+// RUN: %clang -target x86_64-unknown-linux-gnu -g -S -### %s 2> %t
+// RUN: FileCheck -check-prefix=CHECK-NOINLINE-WITHOUT-SPLIT < %t %s
+//
 // RUN: %clang -target x86_64-unknown-linux-gnu -g -fno-split-dwarf-inlining 
-S -### %s 2> %t
 // RUN: FileCheck -check-prefix=CHECK-NOINLINE-WITHOUT-SPLIT < %t %s
 //
@@ -79,7 +82,7 @@
 // CHECK-SPLIT-WITH-NOINL: "-debug-info-kind=limited"
 // CHECK-SPLIT-WITH-NOINL: "-split-dwarf-output"
 
-// RUN: %clang -target x86_64-unknown-linux-gnu -gsplit-dwarf -gmlt -S -### %s 
2> %t
+// RUN: %clang -target x86_64-unknown-linux-gnu -gsplit-dwarf -gmlt 
-fsplit-dwarf-inlining -S -### %s 2> %t
 // RUN: FileCheck -check-prefix=CHECK-GMLT-OVER-SPLIT < %t %s
 //
 // CHECK-GMLT-OVER-SPLIT: "-debug-info-kind=line-tables-only"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -3539,7 +3539,7 @@
 
   bool SplitDWARFInlining =
   Args.hasFlag(options::OPT_fsplit_dwarf_inlining,
-   options::OPT_fno_split_dwarf_inlining, true);
+   options::OPT_fno_split_dwarf_inlining, false);
 
   Args.ClaimAllArgs(options::OPT_g_Group);
 


Index: clang/test/Driver/split-debug.c
===
--- clang/test/Driver/split-debug.c
+++ clang/test/Driver/split-debug.c
@@ -61,6 +61,9 @@
 // CHECK-GMLT-WITH-SPLIT: "-split-dwarf-file"
 // CHECK-GMLT-WITH-SPLIT: "-split-dwarf-output"
 
+// RUN: %clang -target x86_64-unknown-linux-gnu -g -S -### %s 2> %t
+// RUN: FileCheck -check-prefix=CHECK-NOINLINE-WITHOUT-SPLIT < %t %s
+//
 // RUN: %clang -target x86_64-unknown-linux-gnu -g -fno-split-dwarf-inlining -S -### %s 2> %t
 // RUN: FileCheck -check-prefix=CHECK-NOINLINE-WITHOUT-SPLIT < %t %s
 //
@@ -79,7 +82,7 @@
 // CHECK-SPLIT-WITH-NOINL: "-debug-info-kind=limited"
 // CHECK-SPLIT-WITH-NOINL: "-split-dwarf-output"
 
-// RUN: %clang -target x86_64-unknown-linux-gnu -gsplit-dwarf -gmlt -S -### %s 2> %t
+// RUN: %clang -target x86_64-unknown-linux-gnu -gsplit-dwarf -gmlt -fsplit-dwarf-inlining -S -### %s 2> %t
 // RUN: FileCheck -check-prefix=CHECK-GMLT-OVER-SPLIT < %t %s
 //
 // CHECK-GMLT-OVER-SPLIT: "-debug-info-kind=line-tables-only"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -3539,7 +3539,7 @@
 
   bool SplitDWARFInlining =
   Args.hasFlag(options::OPT_fsplit_dwarf_inlining,
-   options::OPT_fno_split_dwarf_inlining, true);
+   options::OPT_fno_split_dwarf_inlining, false);
 
   Args.ClaimAllArgs(options::OPT_g_Group);
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153652: [llvm][Support] Don'tt set "all_exe" mode by default for file written by llvm::writeToOutput.

2023-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a reviewer: jhenderson.
avl added a comment.

Agree that it is safe to remove all_exec from the default permissions. There 
are currently three usages of writeToOutput API: clang include cleaner, 
llvm-objcopy, llvm-dwarfutil. Both llvm-objcopy and llvm-dwarfutil use 
FilePermissionsApplier which should apply all_exec permission if necessary. 
Using all_exec by default looks redundant. It was introduced by 
https://reviews.llvm.org/D98426

please add the test for that case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153652/new/

https://reviews.llvm.org/D153652

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153652: [Support] Don't set "all_exe" mode by default for file written by llvm::writeToOutput

2023-06-26 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

In D153652#4448157 , @hokein wrote:

> Thanks for the comments.
>
> In D153652#4447976 , @jhenderson 
> wrote:
>
>> Is there anything that can be done to use gtest unit tests for this? The two 
>> lit tests are useful, but the problem with them is that if they switch to 
>> using a different approach to emitting their data, the lit tests won't cover 
>> the Support code you're actually changing.
>
> The usages of `llvm::writeOutput` are in the ToolMain.cpp files, I don't see 
> a way to unittest them.

unit test which checks `llvm::writeOutput` may be added to 
raw_ostream_test.cpp. Let`s this test check the case when "all_exec" is not 
set. i.e. that after default usage of `llvm::writeOutput` the resulting file 
does not have "all_exec", which is exact purpose of this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153652/new/

https://reviews.llvm.org/D153652

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153740: [llvm][Support] Deprecate llvm::writeFileAtomically API

2023-06-26 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a reviewer: jkorous.
avl added a subscriber: jkorous.
avl added a comment.

added @jkorous who originally added llvm::writeFileAtomically.

> Let me know what you think about it -- I considered keeping the 
> llvm::writeFileAtomically and migrating its underlying implementation to 
> llvm::writeToOutput, but it doesn't seem to worth, there are only 4 in-tree 
> usages of this API, I think it is probably better just remove it.

I think it is OK to leave only one API (f.e. llvm::writeToOutput) and remove 
another. It would probably be better to split this patch to  lldb, thinlto, 
clang and removal parts.




Comment at: lldb/tools/lldb-server/lldb-platform.cpp:107
   Status status;
-  if (auto Err =
-  handleErrors(llvm::writeFileAtomically(
-   temp_file_path, file_spec.GetPath(), socket_id),
-   [&status, &file_spec](const AtomicFileWriteError &E) {
- std::string ErrorMsgBuffer;
- llvm::raw_string_ostream S(ErrorMsgBuffer);
- E.log(S);
-
- switch (E.Error) {
- case atomic_write_error::failed_to_create_uniq_file:
-   status = Status("Failed to create temp file: %s",
-   ErrorMsgBuffer.c_str());
-   break;
- case atomic_write_error::output_stream_error:
-   status = Status("Failed to write to port file.");
-   break;
- case atomic_write_error::failed_to_rename_temp_file:
-   status = Status("Failed to rename file %s to %s: 
%s",
-   ErrorMsgBuffer.c_str(),
-   file_spec.GetPath().c_str(),
-   ErrorMsgBuffer.c_str());
-   break;
- }
-   })) {
+  if (auto Err = handleErrors(file_spec.GetPath(),
+  [&socket_id](llvm::raw_ostream &OS) {

the call to llvm::writeToOutput is probably missed here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153740/new/

https://reviews.llvm.org/D153740

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153652: [Support] Don't set "all_exe" mode by default for file written by llvm::writeToOutput

2023-06-26 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added inline comments.



Comment at: 
llvm/test/tools/llvm-dwarfutil/ELF/X86/mirror-permissions-unix.test:2
+## The Unix version of this test must use umask(1) because
+## llvm-darfutil respects the umask in setting output permissions.
+## Setting the umask to 0 ensures deterministic permissions across





Comment at: llvm/unittests/Support/raw_ostream_test.cpp:499
+  ErrorOr Perms = llvm::sys::fs::getPermissions(Path);
+  ASSERT_TRUE(!!Perms);
+  EXPECT_EQ(0, *Perms & llvm::sys::fs::all_exe);

!!  looks a bit unclear. Probably check it in more explicit way?

EXPECT_TRUE(Perms && !(*Perms & llvm::sys::fs::all_exe)); 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153652/new/

https://reviews.llvm.org/D153652

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153652: [Support] Don't set "all_exe" mode by default for file written by llvm::writeToOutput

2023-06-28 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

this LGTM. please, wait if James has any concern.




Comment at: llvm/unittests/Support/raw_ostream_test.cpp:525
+  ASSERT_TRUE(Perms) << "should be able to get permissions";
+  // Verify that writeToOutput doesn't set exe bit.
+  EXPECT_EQ(Perms.get(), llvm::sys::fs::all_read | llvm::sys::fs::all_write);

nit: comment looks a bit inconsistent as we check for read&write bits.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153652/new/

https://reviews.llvm.org/D153652

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits