Re: r270962 - [OPENMP] Fixed processing of '-fopenmp-version=' option and test.

2017-10-02 Thread Hal Finkel via cfe-commits

Hi, Alexey,

At what point can we switch, by default, to reporting a version for 
_OPENMP corresponding to 4.x? We're missing out on some OpenMP simd 
directives because the source code guards them with '#if _OPENMP >= 
201307' or similar.
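
For illustration, the guard pattern in question might look like the following. This is a minimal sketch rather than code from any particular project; the names `hasOpenMP40` and `sumSquares` are hypothetical. The simd directive is only visible to the compiler when `_OPENMP` reports 201307 (OpenMP 4.0) or later, so a compiler that reports 201107 by default skips it even if it could handle the directive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when the compiler advertises OpenMP 4.0 or later
// through the _OPENMP macro.
constexpr bool hasOpenMP40() {
#if defined(_OPENMP) && _OPENMP >= 201307
  return true;
#else
  return false;
#endif
}

// Sums the squares of the input. The simd directive is compiled in only when
// _OPENMP is at least 201307; otherwise the plain loop is used.
double sumSquares(const std::vector<double> &values) {
  const std::size_t n = values.size();
  double total = 0.0;
#if defined(_OPENMP) && _OPENMP >= 201307
#pragma omp simd reduction(+ : total)
#endif
  for (std::size_t i = 0; i < n; ++i)
    total += values[i] * values[i];
  return total;
}

// Small usage check: the result is the same with or without the directive.
void checkSumSquares() {
  assert(sumSquares({1.0, 2.0, 3.0}) == 14.0);
}
```

Both paths compute the same result; the only difference is whether the directive is seen at all.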


Thanks again,
Hal

On 05/26/2016 11:13 PM, Alexey Bataev via cfe-commits wrote:

Author: abataev
Date: Thu May 26 23:13:39 2016
New Revision: 270962

URL: http://llvm.org/viewvc/llvm-project?rev=270962&view=rev
Log:
[OPENMP] Fixed processing of '-fopenmp-version=' option and test.

Modified:
 cfe/trunk/lib/Driver/Tools.cpp
 cfe/trunk/lib/Frontend/CompilerInvocation.cpp
 cfe/trunk/lib/Frontend/InitPreprocessor.cpp
 cfe/trunk/test/OpenMP/driver.c

Modified: cfe/trunk/lib/Driver/Tools.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Tools.cpp?rev=270962&r1=270961&r2=270962&view=diff
==
--- cfe/trunk/lib/Driver/Tools.cpp (original)
+++ cfe/trunk/lib/Driver/Tools.cpp Thu May 26 23:13:39 2016
@@ -4864,7 +4864,6 @@ void Clang::ConstructJob(Compilation &C,
// Forward flags for OpenMP
if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
 options::OPT_fno_openmp, false)) {
-Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_version_EQ);
  switch (getOpenMPRuntime(getToolChain(), Args)) {
  case OMPRT_OMP:
  case OMPRT_IOMP5:
@@ -4877,6 +4876,7 @@ void Clang::ConstructJob(Compilation &C,
if (!Args.hasFlag(options::OPT_fopenmp_use_tls,
  options::OPT_fnoopenmp_use_tls, /*Default=*/true))
  CmdArgs.push_back("-fnoopenmp-use-tls");
+  Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_version_EQ);
break;
  default:
// By default, if Clang doesn't know how to generate useful OpenMP code

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=270962&r1=270961&r2=270962&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Thu May 26 23:13:39 2016
@@ -1954,15 +1954,16 @@ static void ParseLangArgs(LangOptions &O
}
  
// Check if -fopenmp is specified.

-  Opts.OpenMP = Args.hasArg(options::OPT_fopenmp);
+  Opts.OpenMP = Args.hasArg(options::OPT_fopenmp) ? 1 : 0;
Opts.OpenMPUseTLS =
Opts.OpenMP && !Args.hasArg(options::OPT_fnoopenmp_use_tls);
Opts.OpenMPIsDevice =
Opts.OpenMP && Args.hasArg(options::OPT_fopenmp_is_device);
  
if (Opts.OpenMP) {

-if (int Version = getLastArgIntValue(Args, OPT_fopenmp_version_EQ,
- Opts.OpenMP, Diags))
+int Version =
+getLastArgIntValue(Args, OPT_fopenmp_version_EQ, Opts.OpenMP, Diags);
+if (Version != 0)
Opts.OpenMP = Version;
  // Provide diagnostic when a given target is not expected to be an OpenMP
  // device or host.

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=270962&r1=270961&r2=270962&view=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Thu May 26 23:13:39 2016
@@ -922,24 +922,24 @@ static void InitializePredefinedMacros(c
}
  
// OpenMP definition

-  if (LangOpts.OpenMP) {
-// OpenMP 2.2:
-//   In implementations that support a preprocessor, the _OPENMP
-//   macro name is defined to have the decimal value yyyymm where
-//   yyyy and mm are the year and the month designations of the
-//   version of the OpenMP API that the implementation support.
-switch (LangOpts.OpenMP) {
-case 40:
-  Builder.defineMacro("_OPENMP", "201307");
-  break;
-case 45:
-  Builder.defineMacro("_OPENMP", "201511");
-  break;
-default:
-  // Default version is OpenMP 3.1
-  Builder.defineMacro("_OPENMP", "201107");
-  break;
-}
+  // OpenMP 2.2:
+  //   In implementations that support a preprocessor, the _OPENMP
+  //   macro name is defined to have the decimal value yyyymm where
+  //   yyyy and mm are the year and the month designations of the
+  //   version of the OpenMP API that the implementation support.
+  switch (LangOpts.OpenMP) {
+  case 0:
+break;
+  case 40:
+Builder.defineMacro("_OPENMP", "201307");
+break;
+  case 45:
+Builder.defineMacro("_OPENMP", "201511");
+break;
+  default:
+// Default version is OpenMP 3.1
+Builder.defineMacro("_OPENMP", "201107");
+break;
}
  
// CUDA device path compilaton


Modified: cfe/trunk/test/OpenMP/driver.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/driver.c?rev=270962&r1=270961&r2=270962

Re: r270962 - [OPENMP] Fixed processing of '-fopenmp-version=' option and test.

2017-10-02 Thread Hal Finkel via cfe-commits


On 10/02/2017 07:08 PM, Alexey Bataev wrote:

Hi Hal,
As soon as we have support for 4.5, including offloading. Otherwise 
there will always be some people blaming the compiler for not 
supporting 4.5 in full. I will try to support it ASAP.

Meanwhile, you can use the -fopenmp-version=45 option to force 4.5.
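
As a rough model of what the option does, the two hunks quoted from r270962 (ParseLangArgs and InitializePredefinedMacros) can be condensed into the standalone sketch below. The function names and the `hasVersionArg`/`versionArg` parameters are hypothetical stand-ins for the real argument parsing; the sketch assumes that an absent -fopenmp-version= falls back to the default value passed to getLastArgIntValue.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the ParseLangArgs logic: returns the value that
// would end up in Opts.OpenMP. `hasVersionArg` says whether
// -fopenmp-version=N was given, and `versionArg` is N.
int resolveOpenMPVersion(bool fopenmp, bool hasVersionArg, int versionArg) {
  int openMP = fopenmp ? 1 : 0;
  if (openMP != 0) {
    // Assumption: when the option is absent, the default (openMP) is used.
    int version = hasVersionArg ? versionArg : openMP;
    if (version != 0)
      openMP = version;
  }
  return openMP;
}

// Hypothetical stand-in for the InitializePredefinedMacros switch: returns
// the value of _OPENMP, or an empty string when the macro is not defined.
std::string openMPMacroValue(int openMP) {
  switch (openMP) {
  case 0:
    return ""; // OpenMP disabled: _OPENMP is not defined
  case 40:
    return "201307";
  case 45:
    return "201511";
  default:
    return "201107"; // default version is OpenMP 3.1
  }
}

// Usage check: -fopenmp alone reports 3.1, and adding -fopenmp-version=45
// reports 4.5.
void checkVersionMapping() {
  assert(openMPMacroValue(resolveOpenMPVersion(true, false, 0)) == "201107");
  assert(openMPMacroValue(resolveOpenMPVersion(true, true, 45)) == "201511");
}
```

Under this model the version option has no effect without -fopenmp, which matches the driver change that forwards the option only inside the OpenMP runtime cases.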


Thanks!

Do we have a status page anywhere that shows where we stand on the 
various features? Are there still features that we don't parse, or where 
we don't generate something that is conservatively correct (e.g., as 
with "declare simd")?


 -Hal



Best regards,
Alexey Bataev


Re: r270962 - [OPENMP] Fixed processing of '-fopenmp-version=' option and test.

2017-10-02 Thread Hal Finkel via cfe-commits


On 10/02/2017 07:38 PM, Alexey Bataev wrote:
No, there is no such page. We parse everything from 4.5, but have very 
limited support in codegen for target-specific directives, especially 
combined ones. Moreover, some of them are not implemented at all and we 
may produce incorrect code. I can try to revisit these "badly" 
supported constructs and provide some basic codegen to produce working 
code at least, though without actual offloading. But I am not sure that 
I will be able to do it quickly.


I think that it would be useful to have a status page for OpenMP. 
Something like https://clang.llvm.org/cxx_status.html. If you could make 
something like that (you're probably the best person to do it), that 
would be great.


We might get other people to help contribute to some of the missing 
areas - I suspect that this is easier if we can enumerate them.


Thanks again,
Hal



Best regards,
Alexey Bataev


Re: Attribute spelling policy

2017-10-23 Thread Hal Finkel via cfe-commits


On 10/21/2017 10:14 AM, Aaron Ballman via cfe-commits wrote:

Attributes come with multiple spelling flavors, but when it comes to
adding new attributes that are not present in other compiler tools
such as GCC or MSVC, we have done a poor job of being consistent with
which spelling flavors we adopt the attributes under. Some of our
attributes are specified with only an __attribute__ spelling (about
100), while others are specified with both __attribute__ and
[[clang::XXX]] (about 30), and still others are specified as only
[[clang::XXX]] attributes (only 1). This puts additional burden on
developers to remember which attributes are spelled with what syntax
and the various rules surrounding how to write attributes with that
spelling.

I am proposing that we take a more principled approach when adding new
attributes so that we provide a better user experience. Specifically,
when adding an attribute that other vendors do not support, the
attribute should be given an __attribute__ and [[clang::]] spelling
unless there's good reason not to. This is not a novel proposal -- GCC
supports all of their documented __attribute__ spellings under a
[[gnu::XXX]] spelling, and I am proposing we do the same with our
vendor namespace.
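
To make the two spelling flavors concrete, here is a small sketch that uses the existing GCC precedent mentioned above rather than any new Clang attribute. It assumes a compiler that accepts GNU attributes (GCC or Clang); the function names are hypothetical. The same `noinline` attribute is written once with the `__attribute__` spelling and once with the `[[gnu::XXX]]` spelling.

```cpp
#include <cassert>

// GNU-style spelling of the attribute.
__attribute__((noinline)) int addGnuSpelling(int a, int b) { return a + b; }

// C++11-style spelling of the same attribute in the gnu vendor namespace.
[[gnu::noinline]] int addCxx11Spelling(int a, int b) { return a + b; }

// Usage check: the spelling does not change what the function computes.
void checkSpellings() {
  assert(addGnuSpelling(2, 3) == 5);
  assert(addCxx11Spelling(2, 3) == 5);
}
```

The proposal would give Clang-specific attributes the same pairing, with `clang` as the vendor namespace.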


For attributes that both Clang and GCC support, where GCC provides a 
[[gnu::X]] syntax, do you propose that our policy will be to support the 
same?




Assuming this approach is reasonable to the community, I will add a
CLANG spelling that behaves similarly to the GCC spelling in that it
automatically provides both the GNU and CXX11 spellings as
appropriate. There are some attributes for which a [[clang::XXX]]
spelling is not appropriate:
   * attributes that appertain to function declarations but require
accessing the function parameters, such as disable_if or
requires_capability


Is this restriction related to the change that p0542 proposes to make to 
the interpretation of attributes that appear after functions as part of 
the contracts proposal?


Thanks again,
Hal


   * attributes with GNU spellings whose use is discouraged or
deprecated, such as no_sanitize_memory
   * attributes that are part of other vendor specifications, like CUDA or 
OpenCL
These deviations are reasonable, but should be documented in Attr.td
near the Spelling definition for the attribute so that it's explicitly
understood why the spelling differs.

Additionally, I intend for the proposed CLANG spelling to be extended
in the future to more easily expose [[clang::XXX]] spellings for
attributes intended to be used in C (with
-fdouble-square-bracket-attributes) as well as C++.

As always, feedback is welcome!

~Aaron
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-19 Thread Hal Finkel via cfe-commits

On 07/16/2018 01:19 PM, Jonas Hahnfeld wrote:
> [ Moving discussion from https://reviews.llvm.org/D49386 to the
> relevant comment on cfe-commits, CC'ing Hal who commented on the
> original issue ]
>
> Is this change really a good idea? It always requires libatomic for
> all OpenMP applications, even if there is no 'omp atomic' directive or
> all of them can be lowered to atomic instructions that don't require a
> runtime library. I'd argue that it's a larger restriction than the
> problem it solves.

Can you please elaborate on why you feel that this is problematic?

> Per https://clang.llvm.org/docs/Toolchain.html#libatomic-gnu the user
> is expected to manually link -latomic whenever Clang can't lower
> atomic instructions - including C11 atomics and C++ atomics. In my
> opinion OpenMP is just another abstraction that doesn't require a
> special treatment.

From my perspective, it does, because we tell our users that all they
need to do in order to enable OpenMP is pass the -fopenmp flag during
compiling and linking. The user should not need to know or care about
how atomics are implemented.

It's not clear to me that our behavior for C++ atomics is good either.
From the documentation, it looks like the rationale is to avoid choosing
between the GNU libatomic implementation and the compiler-rt
implementation? We should probably make a default choice and provide a
flag to override. That would seem more user-friendly to me.
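
As a concrete reference point for the discussion, the sketch below shows the kind of construct involved. It is only an illustration with hypothetical function names. An 'omp atomic' update on an `int` is normally lowered to an atomic instruction, and the standard macro `ATOMIC_INT_LOCK_FREE` equals 2 when `int` atomics are always lock-free on the target. A runtime library such as libatomic typically comes into play only for atomic operations the target cannot do lock-free.

```cpp
#include <atomic>
#include <cassert>

// True when int-sized atomics are always lock-free on this target
// (macro value 2), which is the case where no library fallback is expected.
constexpr bool intAtomicsAlwaysLockFree() { return ATOMIC_INT_LOCK_FREE == 2; }

// Counts to n with an OpenMP atomic update. Without -fopenmp the pragmas are
// ignored and the loop runs serially, producing the same result.
int countWithAtomicUpdate(int n) {
  assert(n >= 0);
  int counter = 0;
#pragma omp parallel for
  for (int i = 0; i < n; ++i) {
#pragma omp atomic
    counter += 1;
  }
  return counter;
}
```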

 -Hal

>
> Thoughts?
> Jonas
>
> On 2018-07-06 23:13, Alexey Bataev via cfe-commits wrote:
>> Author: abataev
>> Date: Fri Jul  6 14:13:41 2018
>> New Revision: 336467
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=336467&view=rev
>> Log:
>> [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.
>>
>> On Linux atomic constructs in OpenMP require libatomic library. Patch
>> links libatomic when -fopenmp is used.
>>
>> Modified:
>>     cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
>>     cfe/trunk/test/OpenMP/linking.c
>>
>> Modified: cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Gnu.cpp?rev=336467&r1=336466&r2=336467&view=diff
>>
>> ==
>>
>> --- cfe/trunk/lib/Driver/ToolChains/Gnu.cpp (original)
>> +++ cfe/trunk/lib/Driver/ToolChains/Gnu.cpp Fri Jul  6 14:13:41 2018
>> @@ -479,6 +479,7 @@ void tools::gnutools::Linker::ConstructJ
>>
>>    bool WantPthread = Args.hasArg(options::OPT_pthread) ||
>>   Args.hasArg(options::OPT_pthreads);
>> +  bool WantAtomic = false;
>>
>>    // FIXME: Only pass GompNeedsRT = true for platforms with
>> libgomp that
>>    // require librt. Most modern Linux platforms do, but some may
>> not.
>> @@ -487,13 +488,16 @@ void tools::gnutools::Linker::ConstructJ
>>     /* GompNeedsRT= */ true))
>>  // OpenMP runtimes implies pthreads when using the GNU
>> toolchain.
>>  // FIXME: Does this really make sense for all GNU toolchains?
>> -    WantPthread = true;
>> +    WantAtomic = WantPthread = true;
>>
>>    AddRunTimeLibs(ToolChain, D, CmdArgs, Args);
>>
>>    if (WantPthread && !isAndroid)
>>  CmdArgs.push_back("-lpthread");
>>
>> +  if (WantAtomic)
>> +    CmdArgs.push_back("-latomic");
>> +
>>    if (Args.hasArg(options::OPT_fsplit_stack))
>>  CmdArgs.push_back("--wrap=pthread_create");
>>
>>
>> Modified: cfe/trunk/test/OpenMP/linking.c
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/linking.c?rev=336467&r1=336466&r2=336467&view=diff
>>
>> ==
>>
>> --- cfe/trunk/test/OpenMP/linking.c (original)
>> +++ cfe/trunk/test/OpenMP/linking.c Fri Jul  6 14:13:41 2018
>> @@ -8,14 +8,14 @@
>>  // RUN:   | FileCheck --check-prefix=CHECK-LD-32 %s
>>  // CHECK-LD-32: "{{.*}}ld{{(.exe)?}}"
>>  // CHECK-LD-32: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
>> -// CHECK-LD-32: "-lpthread" "-lc"
>> +// CHECK-LD-32: "-lpthread" "-latomic" "-lc"
>>  //
>>  // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
>>  // RUN: -fopenmp -target x86_64-unknown-linux -rtlib=platform \
>>  // RUN:   | FileCheck --check-prefix=CHECK-LD-64 %s
>>  // CHECK-LD-64: "{{.*}}ld{{(.exe)?}}"
>>  // CHECK-LD-64: "-l[[DEFAULT_OPENMP_LIB:[^"]*]]"
>> -// CHECK-LD-64: "-lpthread" "-lc"
>> +// CHECK-LD-64: "-lpthread" "-latomic" "-lc"
>>  //
>>  // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \
>>  // RUN: -fopenmp=libgomp -target i386-unknown-linux
>> -rtlib=platform \
>> @@ -27,7 +27,7 @@
>>  // SIMD-ONLY2-NOT: liomp
>>  // CHECK-GOMP-LD-32: "{{.*}}ld{{(.exe)?}}"
>>  // CHECK-GOMP-LD-32: "-lgomp" "-lrt"
>> -// CHECK-GOMP-LD-32: "-lpthread" "-lc"
>> +// CHECK-GOMP-LD-32: "-lpthread" "-latomic" "-lc"
>>
>>  // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1
>> -fopenmp-simd -target i386-unknown-linux -rtlib=platform | FileCheck
>

Re: r336467 - [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.

2018-07-19 Thread Hal Finkel via cfe-commits


On 07/19/2018 09:01 AM, Jonas Hahnfeld wrote:
> On 2018-07-19 15:43, Hal Finkel wrote:
>> On 07/16/2018 01:19 PM, Jonas Hahnfeld wrote:
>>> [ Moving discussion from https://reviews.llvm.org/D49386 to the
>>> relevant comment on cfe-commits, CC'ing Hal who commented on the
>>> original issue ]
>>>
>>> Is this change really a good idea? It always requires libatomic for
>>> all OpenMP applications, even if there is no 'omp atomic' directive or
>>> all of them can be lowered to atomic instructions that don't require a
>>> runtime library. I'd argue that it's a larger restriction than the
>>> problem it solves.
>>
>> Can you please elaborate on why you feel that this is problematic?
>
> The linked patch deals with the case that there is no libatomic,
> effectively disabling all tests of the OpenMP runtime (even though
> only a few of them require atomic instructions). So apparently there
> are Linux systems without libatomic. Taking away any chance for them
> to use OpenMP with Clang is a large regression IMO and not
> user-friendly either.

If there's a significant population of such systems, then this certainly
seems like a problem.

Let's revert this for now while we figure out what to do (which might
just mean updating the documentation to include OpenMP where we talk
about atomics).

>
>>> Per https://clang.llvm.org/docs/Toolchain.html#libatomic-gnu the user
>>> is expected to manually link -latomic whenever Clang can't lower
>>> atomic instructions - including C11 atomics and C++ atomics. In my
>>> opinion OpenMP is just another abstraction that doesn't require a
>>> special treatment.
>>
>> From my perspective, because we instruct our users that all you need to
>> do in order to enable OpenMP is pass -fopenmp flags during compiling and
>> linking. The user should not need to know or care about how atomics are
>> implemented.
>>
>> It's not clear to me that our behavior for C++ atomics is good either.
>> From the documentation, it looks like the rationale is to avoid choosing
>> between the GNU libatomic implementation and the compiler-rt
>> implementation? We should probably make a default choice and provide a
>> flag to override. That would seem more user-friendly to me.
>
> I didn't mean to say it's a good default, but OpenMP is now different
> from C and C++. And as you said, the choice was probably made for a
> reason, so there should be some discussion whether to change it.

Agreed.

 -Hal

>
> Jonas

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



r320904 - [TextDiagnosticBuffer] Fix diagnostic note emission order

2017-12-15 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Fri Dec 15 17:40:19 2017
New Revision: 320904

URL: http://llvm.org/viewvc/llvm-project?rev=320904&view=rev
Log:
[TextDiagnosticBuffer] Fix diagnostic note emission order

The frontend currently groups diagnostics from the command line according to
diagnostic level, but that places all notes last. Fix that by emitting such
diagnostics in the order they were generated.
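
The idea can be sketched outside of Clang as follows. This is a simplified model, not the actual TextDiagnosticBuffer class: `Level` and `DiagBuffer` are hypothetical names and messages are plain strings. It keeps the per-level lists and additionally records, for every diagnostic, a pair of its level and its index in the corresponding list, so that flushing can replay the original order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified diagnostic levels; the numeric value selects the per-level list.
enum class Level { Error = 0, Warning = 1, Remark = 2, Note = 3 };

class DiagBuffer {
  std::vector<std::string> Lists[4];              // one list per level
  std::vector<std::pair<Level, std::size_t>> All; // (level, index) in order

  static std::size_t slot(Level L) { return static_cast<std::size_t>(L); }

public:
  // Stores the message in its per-level list and remembers where it went.
  void handle(Level L, const std::string &Message) {
    std::vector<std::string> &List = Lists[slot(L)];
    All.emplace_back(L, List.size());
    List.push_back(Message);
  }

  // Old behavior: errors, then warnings, then remarks, then notes.
  std::vector<std::string> flushGrouped() const {
    std::vector<std::string> Out;
    for (const std::vector<std::string> &List : Lists)
      Out.insert(Out.end(), List.begin(), List.end());
    return Out;
  }

  // New behavior: replay diagnostics in the order they were generated.
  std::vector<std::string> flushInOrder() const {
    std::vector<std::string> Out;
    for (const std::pair<Level, std::size_t> &Entry : All) {
      const std::vector<std::string> &List = Lists[slot(Entry.first)];
      assert(Entry.second < List.size());
      Out.push_back(List[Entry.second]);
    }
    return Out;
  }
};
```

With the sequence error, note, warning, note, the grouped flush moves both notes to the end, while the in-order flush keeps each note directly after the diagnostic it belongs to.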

Patch by Joel E. Denny, thanks!

Differential Revision: https://reviews.llvm.org/D40995

Added:
cfe/trunk/test/Frontend/diagnostics-order.c
Modified:
cfe/trunk/include/clang/Frontend/TextDiagnosticBuffer.h
cfe/trunk/lib/Frontend/TextDiagnosticBuffer.cpp

Modified: cfe/trunk/include/clang/Frontend/TextDiagnosticBuffer.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/TextDiagnosticBuffer.h?rev=320904&r1=320903&r2=320904&view=diff
==
--- cfe/trunk/include/clang/Frontend/TextDiagnosticBuffer.h (original)
+++ cfe/trunk/include/clang/Frontend/TextDiagnosticBuffer.h Fri Dec 15 17:40:19 2017
@@ -29,6 +29,11 @@ public:
   typedef DiagList::const_iterator const_iterator;
 private:
   DiagList Errors, Warnings, Remarks, Notes;
+  /// All - All diagnostics in the order in which they were generated.  That
+  /// order likely doesn't correspond to user input order, but it at least
+  /// keeps notes in the right places.  Each pair in the vector is a diagnostic
+  /// level and an index into the corresponding DiagList above.
+  std::vector<std::pair<DiagnosticsEngine::Level, size_t>> All;
 public:
   const_iterator err_begin() const  { return Errors.begin(); }
   const_iterator err_end() const{ return Errors.end(); }

Modified: cfe/trunk/lib/Frontend/TextDiagnosticBuffer.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/TextDiagnosticBuffer.cpp?rev=320904&r1=320903&r2=320904&view=diff
==
--- cfe/trunk/lib/Frontend/TextDiagnosticBuffer.cpp (original)
+++ cfe/trunk/lib/Frontend/TextDiagnosticBuffer.cpp Fri Dec 15 17:40:19 2017
@@ -30,34 +30,45 @@ void TextDiagnosticBuffer::HandleDiagnos
   default: llvm_unreachable(
  "Diagnostic not handled during diagnostic buffering!");
   case DiagnosticsEngine::Note:
+All.emplace_back(Level, Notes.size());
 Notes.emplace_back(Info.getLocation(), Buf.str());
 break;
   case DiagnosticsEngine::Warning:
+All.emplace_back(Level, Warnings.size());
 Warnings.emplace_back(Info.getLocation(), Buf.str());
 break;
   case DiagnosticsEngine::Remark:
+All.emplace_back(Level, Remarks.size());
 Remarks.emplace_back(Info.getLocation(), Buf.str());
 break;
   case DiagnosticsEngine::Error:
   case DiagnosticsEngine::Fatal:
+All.emplace_back(Level, Errors.size());
 Errors.emplace_back(Info.getLocation(), Buf.str());
 break;
   }
 }
 
 void TextDiagnosticBuffer::FlushDiagnostics(DiagnosticsEngine &Diags) const {
-  // FIXME: Flush the diagnostics in order.
-  for (const_iterator it = err_begin(), ie = err_end(); it != ie; ++it)
-Diags.Report(Diags.getCustomDiagID(DiagnosticsEngine::Error, "%0"))
-<< it->second;
-  for (const_iterator it = warn_begin(), ie = warn_end(); it != ie; ++it)
-Diags.Report(Diags.getCustomDiagID(DiagnosticsEngine::Warning, "%0"))
-<< it->second;
-  for (const_iterator it = remark_begin(), ie = remark_end(); it != ie; ++it)
-Diags.Report(Diags.getCustomDiagID(DiagnosticsEngine::Remark, "%0"))
-<< it->second;
-  for (const_iterator it = note_begin(), ie = note_end(); it != ie; ++it)
-Diags.Report(Diags.getCustomDiagID(DiagnosticsEngine::Note, "%0"))
-<< it->second;
+  for (auto it = All.begin(), ie = All.end(); it != ie; ++it) {
+auto Diag = Diags.Report(Diags.getCustomDiagID(it->first, "%0"));
+switch (it->first) {
+default: llvm_unreachable(
+   "Diagnostic not handled during diagnostic flushing!");
+case DiagnosticsEngine::Note:
+  Diag << Notes[it->second].second;
+  break;
+case DiagnosticsEngine::Warning:
+  Diag << Warnings[it->second].second;
+  break;
+case DiagnosticsEngine::Remark:
+  Diag << Remarks[it->second].second;
+  break;
+case DiagnosticsEngine::Error:
+case DiagnosticsEngine::Fatal:
+  Diag << Errors[it->second].second;
+  break;
+}
+  }
 }
 

Added: cfe/trunk/test/Frontend/diagnostics-order.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Frontend/diagnostics-order.c?rev=320904&view=auto
==
--- cfe/trunk/test/Frontend/diagnostics-order.c (added)
+++ cfe/trunk/test/Frontend/diagnostics-order.c Fri Dec 15 17:40:19 2017
@@ -0,0 +1,10 @@
+// Make sure a note stays with its associated command-line argument diagnostic.
+// Previously, these diagnostics were grouped by diagnostic level with all
+// notes last.
+//
+

r320908 - [VerifyDiagnosticConsumer] support -verify=

2017-12-15 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Fri Dec 15 18:23:22 2017
New Revision: 320908

URL: http://llvm.org/viewvc/llvm-project?rev=320908&view=rev
Log:
[VerifyDiagnosticConsumer] support -verify=

This mimics FileCheck's --check-prefixes option.

The default prefix is "expected". That is, "-verify" is equivalent to
"-verify=expected".

The goal is to permit exercising a single test suite source file with different
compiler options producing different sets of diagnostics.  While cpp can be
combined with the existing -verify to accomplish the same goal, source is often
easier to maintain when it's not cluttered with preprocessor directives or
duplicate passages of code. For example, this patch also rewrites some existing
clang tests to demonstrate the benefit of this feature.
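
The spelling rule for prefixes, as worded in the note this commit adds (a prefix must start with a letter and contain only alphanumeric characters, hyphens, and underscores), can be sketched as a standalone check. The function `isValidVerifyPrefix` is a hypothetical helper written for illustration; it is not the checkVerifyPrefixes function from the patch and assumes ASCII prefixes in the default C locale.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical check of the documented spelling rule for -verify= prefixes.
bool isValidVerifyPrefix(const std::string &Prefix) {
  // Must be non-empty and start with a letter.
  if (Prefix.empty() || !std::isalpha(static_cast<unsigned char>(Prefix[0])))
    return false;
  // Every character must be alphanumeric, a hyphen, or an underscore.
  for (char C : Prefix) {
    if (!std::isalnum(static_cast<unsigned char>(C)) && C != '-' && C != '_')
      return false;
  }
  return true;
}

// Usage check: the default prefix is valid, a leading digit is not.
void checkPrefixes() {
  assert(isValidVerifyPrefix("expected"));
  assert(!isValidVerifyPrefix("1foo"));
}
```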

Patch by Joel E. Denny, thanks!

Differential Revision: https://reviews.llvm.org/D39694

Added:
cfe/trunk/test/Frontend/verify-prefixes.c
Modified:
cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
cfe/trunk/include/clang/Basic/DiagnosticOptions.h
cfe/trunk/include/clang/Driver/CC1Options.td
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/lib/Frontend/VerifyDiagnosticConsumer.cpp
cfe/trunk/test/Frontend/diagnostics-order.c
cfe/trunk/test/Sema/tautological-unsigned-enum-zero-compare.c
cfe/trunk/test/Sema/tautological-unsigned-enum-zero-compare.cpp
cfe/trunk/test/Sema/tautological-unsigned-zero-compare.c

Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=320908&r1=320907&r2=320908&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Fri Dec 15 18:23:22 2017
@@ -338,4 +338,8 @@ def warn_drv_msvc_not_found : Warning<
 def warn_drv_fine_grained_bitfield_accesses_ignored : Warning<
   "option '-ffine-grained-bitfield-accesses' cannot be enabled together with a sanitizer; flag ignored">,
   InGroup;
+
+def note_drv_verify_prefix_spelling : Note<
+  "-verify prefixes must start with a letter and contain only alphanumeric"
+  " characters, hyphens, and underscores">;
 }

Modified: cfe/trunk/include/clang/Basic/DiagnosticOptions.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticOptions.h?rev=320908&r1=320907&r2=320908&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticOptions.h (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticOptions.h Fri Dec 15 18:23:22 2017
@@ -100,6 +100,10 @@ public:
   /// prefixes removed.
   std::vector<std::string> Remarks;
 
+  /// The prefixes for comment directives sought by -verify ("expected" by
+  /// default).
+  std::vector<std::string> VerifyPrefixes;
+
 public:
   // Define accessors/mutators for diagnostic options of enumeration type.
 #define DIAGOPT(Name, Bits, Default)

Modified: cfe/trunk/include/clang/Driver/CC1Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/CC1Options.td?rev=320908&r1=320907&r2=320908&view=diff
==
--- cfe/trunk/include/clang/Driver/CC1Options.td (original)
+++ cfe/trunk/include/clang/Driver/CC1Options.td Fri Dec 15 18:23:22 2017
@@ -400,8 +400,12 @@ def fcaret_diagnostics_max_lines :
   HelpText<"Set the maximum number of source lines to show in a caret 
diagnostic">;
 def fmessage_length : Separate<["-"], "fmessage-length">, MetaVarName<"">,
   HelpText<"Format message diagnostics so that they fit within N columns or 
fewer, when possible.">;
+def verify_EQ : CommaJoined<["-"], "verify=">,
+  MetaVarName<"">,
+  HelpText<"Verify diagnostic output using comment directives that start with"
+   " prefixes in the comma-separated sequence ">;
 def verify : Flag<["-"], "verify">,
-  HelpText<"Verify diagnostic output using comment directives">;
+  HelpText<"Equivalent to -verify=expected">;
 def verify_ignore_unexpected : Flag<["-"], "verify-ignore-unexpected">,
   HelpText<"Ignore unexpected diagnostic messages">;
 def verify_ignore_unexpected_EQ : CommaJoined<["-"], 
"verify-ignore-unexpected=">,

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=320908&r1=320907&r2=320908&view=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Fri Dec 15 18:23:22 2017
@@ -1071,6 +1071,26 @@ static bool parseShowColorsArgs(const Ar
   llvm::sys::Process::StandardErrHasColors());
 }
 
+static bool checkVerifyPrefixes(const std::vector &VerifyPrefixes,
+DiagnosticsEngine *Diags) {
+  bool Success = true;
+ 
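
The diff is cut off at this point in the archive, so the body of checkVerifyPrefixes is not shown. As a rough illustration only, the spelling rule stated in note_drv_verify_prefix_spelling (start with a letter; then only alphanumeric characters, hyphens, and underscores) could be checked along these lines. The helper name isValidVerifyPrefix is hypothetical and is not taken from the patch:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not the patch's code: returns true when Prefix
// follows the spelling rule quoted in note_drv_verify_prefix_spelling.
bool isValidVerifyPrefix(const std::string &Prefix) {
  // The prefix must be non-empty and start with a letter.
  if (Prefix.empty() ||
      !std::isalpha(static_cast<unsigned char>(Prefix.front())))
    return false;
  // Every character must be alphanumeric, a hyphen, or an underscore.
  for (char C : Prefix) {
    unsigned char UC = static_cast<unsigned char>(C);
    if (!std::isalnum(UC) && C != '-' && C != '_')
      return false;
  }
  return true;
}
```

Under this sketch, "expected" and "foo-bar_1" would be accepted, while "1foo" or a prefix containing a space would be rejected.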

Re: r321239 - Fix for PR32990

2017-12-20 Thread Hal Finkel via cfe-commits


On 12/20/2017 08:07 PM, Erich Keane via cfe-commits wrote:

Author: erichkeane
Date: Wed Dec 20 18:07:46 2017
New Revision: 321239

URL: http://llvm.org/viewvc/llvm-project?rev=321239&view=rev
Log:
Fix for PR32990

This fixes the bug in https://bugs.llvm.org/show_bug.cgi?id=32990.


Too late now, but "Fix for PR32990" is not a useful subject, and only 
the bug reference is not a useful commit message. Commits should 
independently describe the problem and the solution. "Fixes PR32990." 
should be only a part of the message.


Thanks again,
Hal



Patch By: zahiraam
Differential Revision: https://reviews.llvm.org/D39063

Added:
 cfe/trunk/test/CodeGenCXX/dllimport-virtual-base.cpp
 cfe/trunk/test/CodeGenCXX/external-linkage.cpp
Modified:
 cfe/trunk/lib/CodeGen/CodeGenModule.cpp
 cfe/trunk/test/CodeGenCXX/dllimport-dtor-thunks.cpp
 cfe/trunk/test/CodeGenCXX/dllimport-members.cpp
 cfe/trunk/test/CodeGenCXX/dllimport.cpp

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp?rev=321239&r1=321238&r2=321239&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.cpp Wed Dec 20 18:07:46 2017
@@ -855,14 +855,25 @@ CodeGenModule::getFunctionLinkage(Global
GVALinkage Linkage = getContext().GetGVALinkageForFunction(D);
  
if (isa(D) &&

-  getCXXABI().useThunkForDtorVariant(cast(D),
- GD.getDtorType())) {
-// Destructor variants in the Microsoft C++ ABI are always internal or
-// linkonce_odr thunks emitted on an as-needed basis.
-return Linkage == GVA_Internal ? llvm::GlobalValue::InternalLinkage
-   : llvm::GlobalValue::LinkOnceODRLinkage;
+  Context.getTargetInfo().getCXXABI().isMicrosoft()) {
+switch (GD.getDtorType()) {
+case CXXDtorType::Dtor_Base:
+  break;
+case CXXDtorType::Dtor_Comdat:
+case CXXDtorType::Dtor_Complete:
+  if (D->hasAttr() &&
+ (cast(D)->getParent()->getNumVBases() ||
+  (Linkage == GVA_AvailableExternally ||
+   Linkage == GVA_StrongExternal)))
+   return llvm::Function::AvailableExternallyLinkage;
+  else
+return Linkage == GVA_Internal ? llvm::GlobalValue::InternalLinkage
+   : llvm::GlobalValue::LinkOnceODRLinkage;
+case CXXDtorType::Dtor_Deleting:
+  return Linkage == GVA_Internal ? llvm::GlobalValue::InternalLinkage
+ : llvm::GlobalValue::LinkOnceODRLinkage;
+}
}
-
if (isa(D) &&
cast(D)->isInheritingConstructor() &&
Context.getTargetInfo().getCXXABI().isMicrosoft()) {
@@ -878,12 +889,25 @@ CodeGenModule::getFunctionLinkage(Global
  void CodeGenModule::setFunctionDLLStorageClass(GlobalDecl GD, llvm::Function 
*F) {
const auto *FD = cast(GD.getDecl());
  
-  if (const auto *Dtor = dyn_cast_or_null(FD)) {

-if (getCXXABI().useThunkForDtorVariant(Dtor, GD.getDtorType())) {
+  if (dyn_cast_or_null(FD)) {
+switch (GD.getDtorType()) {
+case CXXDtorType::Dtor_Comdat:
+case CXXDtorType::Dtor_Deleting: {
// Don't dllexport/import destructor thunks.
F->setDLLStorageClass(llvm::GlobalValue::DefaultStorageClass);
return;
  }
+case CXXDtorType::Dtor_Complete:
+  if (FD->hasAttr())
+F->setDLLStorageClass(llvm::GlobalVariable::DLLImportStorageClass);
+  else if (FD->hasAttr())
+F->setDLLStorageClass(llvm::GlobalVariable::DLLExportStorageClass);
+  else
+F->setDLLStorageClass(llvm::GlobalVariable::DefaultStorageClass);
+  return;
+case CXXDtorType::Dtor_Base:
+  break;
+}
}
  
if (FD->hasAttr())


Modified: cfe/trunk/test/CodeGenCXX/dllimport-dtor-thunks.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/dllimport-dtor-thunks.cpp?rev=321239&r1=321238&r2=321239&view=diff
==
--- cfe/trunk/test/CodeGenCXX/dllimport-dtor-thunks.cpp (original)
+++ cfe/trunk/test/CodeGenCXX/dllimport-dtor-thunks.cpp Wed Dec 20 18:07:46 2017
@@ -1,4 +1,5 @@
  // RUN: %clang_cc1 -mconstructor-aliases %s -triple x86_64-windows-msvc 
-fms-extensions -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -mconstructor-aliases %s -triple x86_64-windows-msvc 
-fms-extensions -emit-llvm -O1 -disable-llvm-passes -o - | FileCheck 
--check-prefix=MO1 %s
  
  // FIXME: We should really consider removing -mconstructor-aliases for MS C++

  // ABI. The risk of bugs introducing ABI incompatibility under
@@ -23,9 +24,7 @@ struct __declspec(dllimport) ImportOverr
virtual ~ImportOverrideVDtor() {}
  };
  
-// Virtually inherits from a non-dllimport base class. This time we need to call

-// the complete destructor and emit it inline. It's not 

Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-02 Thread Hal Finkel via cfe-commits


On 08/22/2017 10:56 PM, Wei Mi via llvm-commits wrote:

On Tue, Aug 22, 2017 at 7:03 PM, Xinliang David Li  wrote:


On Tue, Aug 22, 2017 at 6:37 PM, Chandler Carruth via Phabricator
 wrote:

chandlerc added a comment.

I'm really not a fan of the degree of complexity and subtlety that this
introduces into the frontend, all to allow particular backend optimizations.

I feel like this is Clang working around a fundamental deficiency in LLVM
and we should instead find a way to fix this in LLVM itself.

As has been pointed out before, user code can synthesize large integers
that small bit sequences are extracted from, and Clang and LLVM should
handle those just as well as actual bitfields.

Can we see how far we can push the LLVM side before we add complexity to
Clang here? I understand that there remain challenges to LLVM's stuff, but I
don't think those challenges make *all* of the LLVM improvements off the
table, I don't think we've exhausted all ways of improving the LLVM changes
being proposed, and I think we should still land all of those and
re-evaluate how important these issues are when all of that is in place.


The main challenge of doing  this in LLVM is that inter-procedural analysis
(and possibly cross module) is needed (for store forwarding issues).

Wei, perhaps you can provide a concrete test case to illustrate the issue so
that reviewers have a good understanding.

David

Here is a runnable testcase:
 1.cc 
class A {
public:
   unsigned long f1:2;
   unsigned long f2:6;
   unsigned long f3:8;
   unsigned long f4:4;
};
A a;
unsigned long b;
unsigned long N = 10;

__attribute__((noinline))
void foo() {
   a.f3 = 3;
}

__attribute__((noinline))
void goo() {
   b = a.f3;
}

int main() {
   unsigned long i;
   for (i = 0; i < N; i++) {
      foo();
      goo();
   }
}

Now trunk takes about twice the running time of trunk + this
patch. That is because trunk shrinks the store of a.f3 in foo (done by
DagCombiner) but does not shrink the load of a.f3 in goo, so store
forwarding is blocked.
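
The access pattern described above can be sketched without bitfields. This is only an illustration, assuming a little-endian target where f1 and f2 fill byte 0 and f3 occupies byte 1 of the storage unit; the names f3_storage, store_f3_narrow and load_f3_wide are hypothetical:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 64-bit storage unit that holds f1..f4.
alignas(8) unsigned char f3_storage[8] = {};

// Shrunk store: only the single byte that holds f3 is written.
void store_f3_narrow(std::uint8_t v) { f3_storage[1] = v; }

// Unshrunk load: the whole 64-bit word is read, then f3 is extracted.
// A load that is wider than the preceding store is the case that
// typically cannot be satisfied by store-to-load forwarding.
std::uint64_t load_f3_wide() {
  std::uint64_t word;
  std::memcpy(&word, f3_storage, sizeof word);
  return (word >> 8) & 0xff;
}
```

Both functions agree on the value of f3; the difference is purely in the width of the memory operations the hardware sees.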


I can confirm that this is true on Haswell and also on a POWER8. 
Interestingly, on a POWER7, the performance is the same either way (on 
the good side). I ran the test case as presented and where I replaced f3 
with a non-bitfield unsigned char member. Thinking that the POWER7 
result might be because it's big-endian, I flipped the order of the 
fields, and found that the version where f3 is not a bitfield is faster 
than otherwise, but only by 12.5%.


Why, in this case, don't we shrink the load? It seems like we should 
(and it seems like a straightforward case).


Thanks again,
Hal



The testcase shows the potential problem of store shrinking. Before
we decide to do store shrinking, we need to know that all the related
loads will be shrunk, and that requires inter-procedural analysis (IPA).
Otherwise, when load shrinking is blocked for some difficult case (like
the instcombine case described in
https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg65085.html),
a performance regression will happen.

Wei.





Repository:
   rL LLVM

https://reviews.llvm.org/D36562




___
llvm-commits mailing list
llvm-comm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r312447 - [CodeGen] Treat all vector fields as mayalias

2017-09-03 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sun Sep  3 10:18:25 2017
New Revision: 312447

URL: http://llvm.org/viewvc/llvm-project?rev=312447&view=rev
Log:
[CodeGen] Treat all vector fields as mayalias

Because it is common to treat vector types as an array of their elements, or
even some other type that's not the element type, and thus index into them, we
can't use struct-path TBAA for these accesses. Even though we already treat all
vector types as equivalent to 'char', we were using field-offset information
for them with TBAA, and this renders undefined the intra-value indexing we
intend to allow. Note that, although 'char' is universally aliasing, with path
TBAA, we can still differentiate between access to s.a and s.b in
  struct { char a, b; } s;. We can't use this capability as-is for vector types.

Fixes PR33967.
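
For illustration, the kind of "intra-value indexing" the log message refers to might look like the following. This is a hypothetical sketch, not taken from the commit or its test; it relies on the GCC/Clang vector_size extension, and the names v4f, S and lane are made up here:

```cpp
typedef float v4f __attribute__((__vector_size__(16)));

struct S {
  v4f a, b;
};

// Treats the vector field as an array of its elements and indexes into it.
// Accesses like this are why the field access itself must not carry
// field-offset (struct-path) TBAA information.
float lane(S *s, int i) {
  float *elems = reinterpret_cast<float *>(&s->b);
  return elems[i];
}
```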

Added:
cfe/trunk/test/CodeGen/tbaa-vec.cpp
Modified:
cfe/trunk/lib/CodeGen/CGExpr.cpp

Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=312447&r1=312446&r2=312447&view=diff
==
--- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGExpr.cpp Sun Sep  3 10:18:25 2017
@@ -3665,8 +3665,9 @@ LValue CodeGenFunction::EmitLValueForFie
 getFieldAlignmentSource(BaseInfo.getAlignmentSource());
   LValueBaseInfo FieldBaseInfo(fieldAlignSource, BaseInfo.getMayAlias());
 
+  QualType type = field->getType();
   const RecordDecl *rec = field->getParent();
-  if (rec->isUnion() || rec->hasAttr())
+  if (rec->isUnion() || rec->hasAttr() || type->isVectorType())
 FieldBaseInfo.setMayAlias(true);
   bool mayAlias = FieldBaseInfo.getMayAlias();
 
@@ -3691,7 +3692,6 @@ LValue CodeGenFunction::EmitLValueForFie
 return LValue::MakeBitfield(Addr, Info, fieldType, FieldBaseInfo);
   }
 
-  QualType type = field->getType();
   Address addr = base.getAddress();
   unsigned cvr = base.getVRQualifiers();
   bool TBAAPath = CGM.getCodeGenOpts().StructPathTBAA;

Added: cfe/trunk/test/CodeGen/tbaa-vec.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/tbaa-vec.cpp?rev=312447&view=auto
==
--- cfe/trunk/test/CodeGen/tbaa-vec.cpp (added)
+++ cfe/trunk/test/CodeGen/tbaa-vec.cpp Sun Sep  3 10:18:25 2017
@@ -0,0 +1,20 @@
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -O1 -disable-llvm-passes %s 
-emit-llvm -o - | FileCheck %s
+// Test TBAA metadata generated by front-end (vector types are always treated 
as mayalias).
+
+typedef float __m128 __attribute__ ((__vector_size__ (16)));
+
+struct A {
+  __m128 a, b;
+};
+
+void foo(A *a, __m128 v) {
+  // CHECK-LABEL: define void @_Z3fooP1ADv4_f
+  a->a = v;
+  // CHECK: store <4 x float> %v, <4 x float>* %{{.*}}, align 16, !tbaa 
[[TAG_char:!.*]]
+  // CHECK: store <4 x float> %{{.*}}, <4 x float>* %{{.*}}, align 16, !tbaa 
[[TAG_char]]
+}
+
+// CHECK: [[TYPE_char:!.*]] = !{!"omnipotent char", [[TAG_cxx_tbaa:!.*]],
+// CHECK: [[TAG_cxx_tbaa]] = !{!"Simple C++ TBAA"}
+// CHECK: [[TAG_char]] = !{[[TYPE_char]], [[TYPE_char]], i64 0}
+




Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-03 Thread Hal Finkel via cfe-commits


On 09/03/2017 03:44 PM, Wei Mi wrote:


Hal, thanks for trying the test.

Yes, it is straightforward to shrink the load in the test. I can
change the testcase a little to show why it is sometimes difficult to
shrink the load:

class A {
public:
   unsigned long f1:16;
   unsigned long f2:16;
   unsigned long f3:16;
   unsigned long f4:8;
};
A a;
bool b;
unsigned long N = 10;

__attribute__((noinline))
void foo() {
   a.f4 = 3;
}

__attribute__((noinline))
void goo() {
   b = (a.f4 == 0 && a.f3 == (unsigned short)-1);
}

int main() {
   unsigned long i;
   for (i = 0; i < N; i++) {
      foo();
      goo();
   }
}

For the load of a.f4 in goo, it is difficult to motivate shrinking it
after instcombine because the comparison with a.f3 and the comparison
with a.f4 are merged:

define void @_Z3goov() local_unnamed_addr #0 {
   %1 = load i64, i64* bitcast (%class.A* @a to i64*), align 8
   %2 = and i64 %1, 0xff
   %3 = icmp eq i64 %2, 0x
   %4 = zext i1 %3 to i8
   store i8 %4, i8* @b, align 1, !tbaa !2
   ret void
}


Exactly. But this behavior is desirable, locally. There's no good answer 
here: We either generate extra load traffic here (because we need to 
load the fields separately), or we widen the store (which generates 
extra load traffic there). Do you know, in terms of performance, which 
is better in this case (i.e., is it better to widen the store or split 
the load)?


 -Hal



Thanks,
Wei.



Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-03 Thread Hal Finkel via cfe-commits


On 09/03/2017 10:38 PM, Xinliang David Li wrote:
The cost of a store forwarding stall is usually much higher than that of 
a load hitting the L1 cache. For instance, on Haswell, the latter is ~4 
cycles, while a store forwarding stall costs about 10 cycles more 
than a successful store forwarding, which is roughly 15 cycles. In 
some scenarios, store forwarding stalls can cost as much as 50 
cycles. See Agner's documentation.


I understand. As I understand it, there are two potential ways to fix 
this problem:


 1. You can make the store wider (to match the size of the wide load, 
thus permitting forwarding).
 2. You can make the load smaller (to match the size of the small 
store, thus permitting forwarding).
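
The two options above can be sketched as follows for the f3 field of the earlier test case. This is a minimal sketch under assumptions: a little-endian target with f3 in byte 1 of the storage unit, and hypothetical names (unit_bytes, store_f3_widened, load_f3_narrowed) that do not come from the patch:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 64-bit storage unit that holds the bitfields.
alignas(8) unsigned char unit_bytes[8] = {};

// Option 1: widen the store. The whole word is read, the f3 bits are
// replaced, and the whole word is written back, so a later 64-bit load
// matches the store size. Note the extra load and bit operations.
void store_f3_widened(std::uint8_t v) {
  std::uint64_t w;
  std::memcpy(&w, unit_bytes, sizeof w);
  w = (w & ~(std::uint64_t{0xff} << 8)) | (std::uint64_t{v} << 8);
  std::memcpy(unit_bytes, &w, sizeof w);
}

// Option 2: narrow the load. Only the byte holding f3 is read, so it
// matches a one-byte store to the same location.
std::uint8_t load_f3_narrowed() { return unit_bytes[1]; }
```

Option 1 pays for a read-modify-write on every store, while option 2 requires knowing at the load site which bytes belong to which field.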


At least in this benchmark, which is a better solution?

Thanks again,
Hal



In other words, the optimizer needs to be taught to avoid defeating 
the HW pipeline feature as much as possible.


David


Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-03 Thread Hal Finkel via cfe-commits


On 09/03/2017 11:06 PM, Xinliang David Li wrote:

I think we can think about this in another way.

For modern CPU architectures which support store forwarding with 
store queues, it is generally not "safe" to blindly do local 
optimizations to widen the load/stores


Why not widen stores? Generally the problem with store forwarding is 
where the load is wider than the store (plus alignment issues).


without sophisticated inter-procedural analysis. Doing so runs the 
risk of greatly reducing the performance of some programs. Keeping the 
naturally aligned load/store using its natural type is safer.


Does it make sense?


It makes sense. I could, however, say the same thing about inlining. We 
need to make inlining decisions locally, but they have global impact. 
Nevertheless, we need to make inlining decisions, and there's no 
practical way to make them in a truly non-local way.


We also don't pessimize common cases to improve performance in rare 
cases. In the common case, reducing pressure on the memory units, and 
reducing the critical path, seem likely to me to be optimal. If that's 
not true, or doing otherwise has negligible cost (but can prevent rare 
downsides), we should certainly consider those options.


And none of this answers the question of whether it's better to have the 
store wider or the load split and narrow.


Thanks again,
Hal



David




Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-03 Thread Hal Finkel via cfe-commits


On 09/04/2017 12:12 AM, Xinliang David Li wrote:



On Sun, Sep 3, 2017 at 9:23 PM, Hal Finkel wrote:



On 09/03/2017 11:06 PM, Xinliang David Li wrote:

I think we can think about this in another way.

For modern CPU architectures which support store forwarding with
store queues, it is generally not "safe" to blindly do local
optimizations to widen the load/stores


Why not widen stores? Generally the problem with store forwarding
is where the load is wider than the store (plus alignment issues).


True, but probably with some caveats which are target dependent. 
Store widening also requires additional bit operations (and possibly 
an additional load), so it is tricky to model the overall benefit.



without sophisticated inter-procedural analysis. Doing so runs
the risk of greatly reducing the performance of some programs. Keeping
the naturally aligned load/store using its natural type is safer.

Does it make sense?


It makes sense. I could, however, say the same thing about
inlining. We need to make inlining decisions locally, but they
have global impact. Nevertheless, we need to make inlining
decisions, and there's no practical way to make them in a truly
non-local way.


Speaking of inlining, we are actually thinking of ways to make the 
decisions more globally optimal, but that is off topic.


Neat.



We also don't pessimize common cases to improve performance in
rare cases. In the common case, reducing pressure on the memory
units, and reducing the critical path, seem likely to me to be
optimal. If that's not true, or doing otherwise has negligible
cost (but can prevent rare downsides), we should certainly
consider those options.


Since we don't do load widening for non-bitfield cases (only for the 
very limited case of naturally aligned bitfields), it is hard to 
say we pessimize common cases for rare cases:


1) the upside of widening such accesses is not high, from experience 
with another compiler (which does not do so)
2) there is an obvious downside of introducing additional extract 
instructions, which degrades performance
3) there is an obvious downside of severely degrading performance when 
store forwarding is blocked.


I suspect that it's relatively rare to hit these store-to-load 
forwarding issues compared to the number of times the system stores to 
or loads from bitfields. In any case, I did some experiments on my 
Haswell system and found that the load from Wei's benchmark which is 
split into two loads, compared to the single load version, is 0.012% 
slower. I, indeed, won't worry about that too much. On my P8, I couldn't 
measure a difference. Obviously, this does somewhat miss the point, as 
the real cost in this kind of thing comes in stressing the memory units 
in code with a lot going on, not in isolated cases.


Nevertheless, I think that you've convinced me that this is a least-bad 
solution. I'll want a flag preserving the old behavior. Something like 
-fwiden-bitfield-accesses (modulo bikeshedding).






And none of this answers the question of whether it's better to
have the store wider or the load split and narrow.



It seems safer to do store widening more aggressively to avoid the store 
forwarding stall issue, but doing this aggressively may also mean that 
other runtime overhead is introduced (extra load, data merge, etc.).


Yes. Wei confirmed that this is slower.

Thanks again,
Hal



Thanks,

David


Thanks again,
Hal




David




Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-03 Thread Hal Finkel via cfe-commits


On 09/03/2017 11:22 PM, Wei Mi wrote:

On Sun, Sep 3, 2017 at 8:55 PM, Hal Finkel  wrote:

On 09/03/2017 10:38 PM, Xinliang David Li wrote:

Store forwarding stall cost is usually much higher compared with a load
hitting L1 cache. For instance, on Haswell,  the latter is ~4 cycles, while
the store forwarding stalls cost about 10 cycles more than a successful
store forwarding, which is roughly 15 cycles. In some scenarios, the store
forwarding stalls can be as high as 50 cycles. See Agner's documentation.


I understand. As I understand it, there are two potential ways to fix this
problem:

  1. You can make the store wider (to match the size of the wide load, thus
permitting forwarding).
  2. You can make the load smaller (to match the size of the small store,
thus permitting forwarding).

At least in this benchmark, which is a better solution?

Thanks again,
Hal


For this benchmark, the smaller load is better. On my Sandy Bridge
desktop, the wider store takes 3.77s and the smaller load takes 3.45s.
If store forwarding is blocked, it takes 6.9s.

However, we don't have a good way to narrow the load to match the store
shrinking because the field information has been lost. For the IR
below:

define void @_Z3goov() local_unnamed_addr #0 {
   %1 = load i64, i64* bitcast (%class.A* @a to i64*), align 8
   %2 = and i64 %1, 0xff
   %3 = icmp eq i64 %2, 0x
   %4 = zext i1 %3 to i8
   store i8 %4, i8* @b, align 1, !tbaa !2
   ret void
}

We know the 24-bit range from bit 32 to bit 56 of @a is accessed, but
we don't know whether that range contains 8-bit + 16-bit bitfields,
16-bit + 8-bit bitfields, or 8-bit + 8-bit + 8-bit bitfields. Once the
load shrinking done locally is inconsistent with the store shrinking, we
will have a store forwarding issue and will suffer a huge regression.
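
The ambiguity described above can be sketched in C++. This is only an illustration: the three struct layouts and all names below are hypothetical, not taken from the patch. They place different bitfields in the same 24-bit window of a 64-bit storage unit, and the combined mask of touched bits comes out identical, so the mask alone cannot reveal where the field boundaries are.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Three hypothetical layouts whose middle fields fill the same 24-bit window.
struct Layout8_16  { unsigned long long lo : 32, a : 8,  b : 16, hi : 8; };
struct Layout16_8  { unsigned long long lo : 32, a : 16, b : 8,  hi : 8; };
struct Layout8_8_8 { unsigned long long lo : 32, a : 8,  b : 8, c : 8, hi : 8; };

// Reinterpret a fully packed 64-bit struct as one integer.
template <typename T>
std::uint64_t raw_bits(const T &value) {
  static_assert(sizeof(T) == sizeof(std::uint64_t),
                "sketch assumes one 64-bit storage unit");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}

// Each function sets only the middle fields and returns the touched bits.
std::uint64_t mask_8_16() {
  Layout8_16 v{};
  v.a = 0xff;
  v.b = 0xffff;
  return raw_bits(v);
}

std::uint64_t mask_16_8() {
  Layout16_8 v{};
  v.a = 0xffff;
  v.b = 0xff;
  return raw_bits(v);
}

std::uint64_t mask_8_8_8() {
  Layout8_8_8 v{};
  v.a = 0xff;
  v.b = 0xff;
  v.c = 0xff;
  return raw_bits(v);
}

// All three layouts touch exactly the same bits of the storage unit.
void check_window_is_ambiguous() {
  assert(mask_8_16() == mask_16_8());
  assert(mask_16_8() == mask_8_8_8());
}
```

Bitfield layout is implementation-defined, so the exact mask value depends on the ABI; the sketch only relies on the three masks being equal to each other.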


Understood. This is a convincing argument. The cost of splitting the 
loads seems not high, at least not in isolation.


Thanks again,
Hal



Thanks,
Wei.





In other words, the optimizer needs to be taught to avoid defeating  the HW
pipeline feature as much as possible.

David

On Sun, Sep 3, 2017 at 6:32 PM, Hal Finkel  wrote:


On 09/03/2017 03:44 PM, Wei Mi wrote:

On Sat, Sep 2, 2017 at 6:04 PM, Hal Finkel  wrote:

On 08/22/2017 10:56 PM, Wei Mi via llvm-commits wrote:

On Tue, Aug 22, 2017 at 7:03 PM, Xinliang David Li 
wrote:


On Tue, Aug 22, 2017 at 6:37 PM, Chandler Carruth via Phabricator
 wrote:

chandlerc added a comment.

I'm really not a fan of the degree of complexity and subtlety that this
introduces into the frontend, all to allow particular backend
optimizations.

I feel like this is Clang working around a fundamental deficiency in LLVM
and we should instead find a way to fix this in LLVM itself.

As has been pointed out before, user code can synthesize large integers
that small bit sequences are extracted from, and Clang and LLVM should
handle those just as well as actual bitfields.

Can we see how far we can push the LLVM side before we add complexity to
Clang here? I understand that there remain challenges to LLVM's stuff,
but I don't think those challenges make *all* of the LLVM improvements
off the table, I don't think we've exhausted all ways of improving the
LLVM changes being proposed, and I think we should still land all of
those and re-evaluate how important these issues are when all of that is
in place.


The main challenge of doing this in LLVM is that inter-procedural
analysis (and possibly cross module) is needed (for store forwarding
issues).

Wei, perhaps you can provide a concrete test case to illustrate the issue
so that reviewers have a good understanding.

David

Here is a runnable testcase:
 1.cc 
class A {
public:
 unsigned long f1:2;
 unsigned long f2:6;
 unsigned long f3:8;
 unsigned long f4:4;
};
A a;
unsigned long b;
unsigned long N = 10;

__attribute__((noinline))
void foo() {
 a.f3 = 3;
}

__attribute__((noinline))
void goo() {
 b = a.f3;
}

int main() {
 unsigned long i;
 for (i = 0; i < N; i++) {
   foo();
   goo();
 }
}

Now trunk takes about twice the running time of trunk + this
patch. That is because trunk shrinks the store of a.f3 in foo (done by
DAGCombiner) but does not shrink the load of a.f3 in goo, so store
forwarding will be blocked.
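
As a rough sketch of the access shapes involved (not the code the compiler actually emits), the following C++ models the storage unit as eight bytes. The byte offset of f3 and all names are assumptions made for illustration. Both load shapes return the same value; the difference discussed in this thread is purely in how the hardware forwards the preceding narrow store.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 8-byte storage unit holding A's bitfields.
alignas(8) static unsigned char storage[8] = {};

// Assumed byte position of a.f3: f1:2 and f2:6 fill the byte before it.
constexpr int kF3Byte = 1;

// Shape of foo() after store shrinking: a single 1-byte store.
void store_f3_narrow(std::uint8_t v) { storage[kF3Byte] = v; }

// Shape of goo() without load shrinking: an 8-byte load that spans the
// byte just written, followed by extraction of the field.
std::uint8_t load_f3_wide() {
  std::uint64_t word;
  std::memcpy(&word, storage, sizeof word);
  unsigned char bytes[sizeof word];
  std::memcpy(bytes, &word, sizeof word);
  return bytes[kF3Byte];
}

// Shape of goo() with a narrow load that matches the store size.
std::uint8_t load_f3_narrow() { return storage[kF3Byte]; }

// Functionally the two loads agree; only their interaction with the
// store buffer differs.
void check_same_value() {
  store_f3_narrow(3);
  assert(load_f3_wide() == 3);
  assert(load_f3_narrow() == 3);
}
```

An optimizing compiler is free to simplify either function, so this is not a benchmark; the test case above, built with and without the patch, is what exercises the stall.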


I can confirm that this is true on Haswell and also on a POWER8.
Interestingly, on a POWER7, the performance is the same either way (on
the good side). I ran the test case both as presented and with f3
replaced by a non-bitfield unsigned char member. Thinking that the POWER7
result might be because it's big-Endian, I flipped the order of the
fields, and found that the version where f3 is not a bitfield is faster
than otherwise, but only by 12.5%.

Why, in this case, don't we shrink the load? It seems like we should
(and it seems like a straightforward case).

Thanks again,
Hal


H

Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-09-04 Thread Hal Finkel via cfe-commits


On 09/04/2017 03:57 AM, Chandler Carruth wrote:
On Sun, Sep 3, 2017 at 10:41 PM Hal Finkel via llvm-commits 
<llvm-comm...@lists.llvm.org> wrote:


Nevertheless, I think that you've convinced me that this is a
least-bad solution. I'll want a flag preserving the old behavior.
Something like -fwiden-bitfield-accesses (modulo bikeshedding).

Several different approaches have been discussed in this thread, I'm 
not sure what you mean about "least-bad solution"...


I remain completely unconvinced that we should change the default 
behavior. At most, I'm not strongly opposed to adding an attribute 
that indicates "please try to use narrow loads for this bitfield 
member" and is an error if applied to a misaligned or non-byte-sized 
bitfield member.


I like this solution too (modulo the fact that I dislike all of these 
solutions). Restricting this only to affecting the loads, and not the 
stores, is also an interesting thought. The attribute could also be on 
the access itself (which, at least from the theoretical perspective, I'd 
prefer).


But I remain strongly opposed to changing the default behavior. We 
have one or two cases that regress and are easily addressed by source 
changes (either to not use bitfields or to use an attribute). I don't 
think that is cause to change the lowering Clang has used for years 
and potentially regress many other use cases.


I have mixed feelings about all of the potential fixes here. To walk 
through my thoughts on this:


 1. I don't like any solutions that require changes affecting source 
level semantics. Something that the compiler does automatically is fine, 
as is an attribute.


 2. Next, regarding default behavior, we have a trade off:

   A. Breaking apart the accesses, as proposed here, falls into the 
category of "generally, it makes everything a little bit slower." But 
it's worse than that because it comes on a spectrum. I can easily 
construct variants of the provided benchmark which make the separate 
loads have a bad performance penalty. For example:


$ cat ~/tmp/3m.cpp
class A {
public:
#ifdef BF
  unsigned long f7:8;
  unsigned long f6:8;
  unsigned long f5:8;
  unsigned long f4:8;
  unsigned long f3:8;
  unsigned long f2:8;
  unsigned long f1:8;
  unsigned long f0:8;
#else
  unsigned char f7;
  unsigned char f6;
  unsigned char f5;
  unsigned char f4;
  unsigned char f3;
  unsigned char f2;
  unsigned char f1;
  unsigned char f0;
#endif
};
A a;
bool b;
unsigned long N = 10;

__attribute__((noinline))
void foo() {
  a.f4 = 3;
}

__attribute__((noinline))
void goo() {
  b = (a.f0 == 0 && a.f1 == (unsigned char)-1 &&
   a.f2 == 0 && a.f3 == 0 && a.f4 == 0 && a.f5 == 0 &&
   a.f6 == 0 && a.f7 == (unsigned char)-1);
}

int main() {
  unsigned long i;
  foo();
  for (i = 0; i < N; i++) {
    goo();
  }
}

Run this and you'll find that, on Haswell, the separate-loads scheme 
takes nearly 60% longer than our current scheme (e.g., 2.77s for separate 
loads vs. 1.75s for the current bitfield lowering).
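
To make the two shapes of goo() concrete, here is a minimal sketch using the non-bitfield variant of the struct. The function names are hypothetical, and the combined version is hand-written to mimic what a single wide load and compare looks like; it is not the compiler's actual output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Non-bitfield variant of the struct from the benchmark.
struct Bytes {
  unsigned char f7, f6, f5, f4, f3, f2, f1, f0;
};

// Separate-loads shape: up to eight byte loads and compares.
bool matches_separate(const Bytes &a) {
  return a.f0 == 0 && a.f1 == 0xff && a.f2 == 0 && a.f3 == 0 &&
         a.f4 == 0 && a.f5 == 0 && a.f6 == 0 && a.f7 == 0xff;
}

// Combined shape: one 8-byte load compared against a precomputed pattern.
bool matches_combined(const Bytes &a) {
  static_assert(sizeof(Bytes) == sizeof(std::uint64_t),
                "eight one-byte fields");
  Bytes want{};  // all fields start at zero
  want.f1 = 0xff;
  want.f7 = 0xff;
  std::uint64_t lhs, rhs;
  std::memcpy(&lhs, &a, sizeof lhs);
  std::memcpy(&rhs, &want, sizeof rhs);
  return lhs == rhs;
}

// Both shapes compute the same predicate.
void check_both_agree() {
  Bytes a{};
  a.f1 = 0xff;
  a.f7 = 0xff;
  assert(matches_separate(a) && matches_combined(a));
  a.f4 = 3;  // the value foo() stores in the benchmark
  assert(!matches_separate(a) && !matches_combined(a));
}
```

Building the expected pattern from a struct, rather than from a literal constant, keeps the comparison independent of byte order.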


So, how common is it to have a bitfield with a large number of 
fields that could be loaded separately (i.e. have the right size and 
alignment) and have code that loads a large number of them like this 
(i.e. without other work to hide the relative costs)? I have no idea, 
but regardless, there is definitely a high-cost end to this spectrum.


  B. However, our current scheme can trigger expensive architectural 
hazards. Moreover, there's no local after-the-fact analysis that can fix 
this consistently. I think that Wei has convincingly demonstrated both 
of these things. How common is this? I have no idea. More specifically, 
how do the relative costs of hitting these hazards compare to the costs 
of the increased number of loads under the proposed scheme? I have no 
idea (and this certainly has a workload-dependent answer).


 C. This situation seems unlikely to change in the future: it seems 
like a physics problem. The data surrounding the narrower store is 
simply not in the pipeline to be matched with the wider load. Keeping 
the data in the pipeline would have extra costs, perhaps significant. 
I'm guessing the basic structure of this hazard is here to stay.


 D. In the long run, could this be a PGO-driven decision? Yes, and this 
seems optimal. It would depend on infrastructure enhancements, and 
critically, the hardware having the right performance counter(s) to sample.


So, as to the question of what to do right now, I have two thoughts: 1) 
All of the solutions will be bad for someone. 2) Which is a least-bad 
default depends on the workload. Your colleagues seem to be asserting 
that, for Google, the separate loads are least bad (and, FWIW, you're 
more likely to have hot code like this than I am). This is definitely an 
issue on which reasonable people can disagree. In the end, I'll 
begrudgingly agree that this should be an empirical decision. We should 
have some flag/pragma/attribute/etc. to allow selection

Re: D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-22 Thread Hal Finkel via cfe-commits


On 09/22/2017 01:09 PM, Kaylor, Andrew wrote:

The reason I introduced this patch to begin with is that there are 
circumstances under which the optimizer will eliminate loads from addresses 
that were generated based on the null pointer arithmetic (because clang 
previously emitted a null-based GEP and still will in the Firefox case because 
it's using subtraction).  It would seem that the Firefox case won't ever 
dereference the pointer it is creating this way, so it should be safe from the 
optimization I was seeing.

On the other hand, what the warning says is true, right?  I believe clang will produce an 
inbounds GEP in the Firefox case and the LLVM language reference says, "The only in 
bounds address for a null pointer in the default address-space is the null pointer 
itself."  So it's entirely possible that some optimization will interpret the result 
of the GEP generated to represent '(((char*)0)-1)' as a poison value.

-Andy


I agree. The warning seems good here. As I recall, doing pointer 
arithmetic on the null pointer is UB (even if we never dereference it).


For convenience, it looks like this:


pointer arithmetic on a null pointer has undefined behavior if the offset is 
nonzero [-Werror,-Wnull-pointer-arithmetic]
  return net_FindCharInSet(str, NET_MAX_ADDRESS, set);
^~~
  
/data/jenkins/workspace/firefox-clang-last/obj-x86_64-pc-linux-gnu/dist/include/nsURLHelper.h:224:36:
 note: expanded from macro 'NET_MAX_ADDRESS'
  #define NET_MAX_ADDRESS (((char*)0)-1)
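
One possible way to express such a sentinel without arithmetic on a null pointer is to build it from an integer. This is a sketch under assumptions: the function names are hypothetical, the integer-to-pointer conversion is implementation-defined, and this is not necessarily how Firefox addressed the warning.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical replacement for a '(((char*)0)-1)' style sentinel: convert
// the largest uintptr_t value to a pointer instead of subtracting from a
// null pointer. The conversion is implementation-defined.
inline const char *max_address_sentinel() {
  return reinterpret_cast<const char *>(UINTPTR_MAX);
}

// Comparing against the sentinel never dereferences it.
inline bool is_sentinel(const char *p) { return p == max_address_sentinel(); }

void check_sentinel() {
  const char buf[] = "x";
  assert(is_sentinel(max_address_sentinel()));
  assert(!is_sentinel(buf));
}
```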


 -Hal



-Original Message-
From: Sylvestre Ledru via Phabricator [mailto:revi...@reviews.llvm.org]
Sent: Friday, September 22, 2017 9:02 AM
To: Kaylor, Andrew ; rjmcc...@gmail.com; 
rich...@metafoo.co.uk; efrie...@codeaurora.org
Cc: sylves...@debian.org; Ivchenko, Alexander ; 
hfin...@anl.gov; mcros...@codeaurora.org; david.majne...@gmail.com; 
cfe-commits@lists.llvm.org
Subject: [PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom 
used by glibc

sylvestre.ledru added a comment.

For the record, Firefox was using this trick. This patch is breaking a CI build 
(clang trunk + warnings as errors). More information here: 
https://bugzilla.mozilla.org/show_bug.cgi?id=1402362


Repository:
   rL LLVM

https://reviews.llvm.org/D37042





--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc

2017-09-22 Thread Hal Finkel via cfe-commits


On 09/22/2017 01:45 PM, Sylvestre Ledru wrote:




On 22/09/2017 20:27, Hal Finkel wrote:



On 09/22/2017 01:09 PM, Kaylor, Andrew wrote:

The reason I introduced this patch to begin with is that there are 
circumstances under which the optimizer will eliminate loads from addresses 
that were generated based on the null pointer arithmetic (because clang 
previously emitted a null-based GEP and still will in the Firefox case because 
it's using subtraction).  It would seem that the Firefox case won't ever 
dereference the pointer it is creating this way, so it should be safe from the 
optimization I was seeing.

On the other hand, what the warning says is true, right?  I believe clang will produce an 
inbounds GEP in the Firefox case and the LLVM language reference says, "The only in 
bounds address for a null pointer in the default address-space is the null pointer 
itself."  So it's entirely possible that some optimization will interpret the result 
of the GEP generated to represent '(((char*)0)-1)' as a poison value.

-Andy


I agree. The warning seems good here. As I recall, doing pointer 
arithmetic on the null pointer is UB (even if we never dereference it).


For convenience, it looks like this:


pointer arithmetic on a null pointer has undefined behavior if the offset is 
nonzero [-Werror,-Wnull-pointer-arithmetic]
  return net_FindCharInSet(str, NET_MAX_ADDRESS, set);
^~~
  
/data/jenkins/workspace/firefox-clang-last/obj-x86_64-pc-linux-gnu/dist/include/nsURLHelper.h:224:36:
 note: expanded from macro 'NET_MAX_ADDRESS'
  #define NET_MAX_ADDRESS (((char*)0)-1)



To be clear, I wasn't arguing!
I was just giving feedback about this new warning.


Sounds good :-)



By the way, maybe we should add that to the release notes? 
https://clang.llvm.org/docs/ReleaseNotes.html


I agree. That would be a good idea.

 -Hal



Sylvestre



--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [PATCH] D48808: [CodeGen] Emit parallel_loop_access for each loop in the loop stack.

2018-07-02 Thread Hal Finkel via cfe-commits

On 07/02/2018 12:27 PM, Tyler Nowicki wrote:
> Hi Michael, Hal,
>
> Sorry it has been a while since I looked at this. My memory is a
> little fuzzy. The intent of 'assume_safety' is to tell LAA to
> skip dependency checking on loads and stores so the vectorizer doesn't
> stop as soon as it sees both in a loop. At the time 'assume_safety'
> was implemented the vectorizer was limited to inner-loops. I am not
> up-to-date but it seems to have the ability to perform some
> vectorization of non-inner loop instructions? 

The infrastructure to support outer loops is under active development.

>
> If we can vectorize non-inner loop instructions then what behavior
> would make the most sense: 'assume_safety' applies to the same loop
> scope(s) as the other loop pragmas, or it applies to all nested loops?
>
> My opinion is that for consistency 'assume_safety' and similar options
> apply to the same scope(s) as 'vectorize(enable)'. But I am open to
> alternatives if others see it differently.

I agree. Should apply to the entire single loop under the pragma. This
proposed change is consistent with that.
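
A small sketch of the case under discussion: the pragma sits on an outer loop and the memory accesses live in a nested loop. The kernel and its names are hypothetical. Under the reading agreed on here, the safety assertion would cover every access in the body of the annotated loop, including those in the nested loop, with respect to the annotated loop's iterations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel: row r of 'in' is accumulated into row row_of[r] of
// 'out'. The caller promises that row_of contains no duplicates, so
// iterations of the outer loop never touch the same output row.
void add_rows(float *out, const float *in, const std::size_t *row_of,
              std::size_t rows, std::size_t cols) {
#if defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#endif
  for (std::size_t r = 0; r < rows; ++r) {     // annotated loop
    assert(row_of[r] < rows);                  // stay inside 'out'
    for (std::size_t c = 0; c < cols; ++c)     // nested loop, same body
      out[row_of[r] * cols + c] += in[r * cols + c];
  }
}
```

Whether a given compiler version actually vectorizes the outer loop is a separate question; the pragma only removes the dependence check as an obstacle.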

 -Hal

>
> Tyler Nowicki
>
> On Mon, Jul 2, 2018 at 10:44 AM Michael Kruse via Phabricator
> <revi...@reviews.llvm.org> wrote:
>
> Meinersbur added a comment.
>
> In https://reviews.llvm.org/D48808#1149534, @ABataev wrote:
>
> > I don't think that this is the intended behavior of the `#pragma
> > clang loop`. It is better to ask the author of this pragma whether
> > this is correct or not.
>
>
> I understand it as the intended behavior of the `assume_safety`
> option (also used for `#pragma omp simd`).
>
> @tyler.nowicki What is the intended behaviour?
>
>
> Repository:
>   rC Clang
>
> https://reviews.llvm.org/D48808
>
>
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [PATCH] D24481: make “#pragma STDC FP_CONTRACT” on by default

2016-09-23 Thread Hal Finkel via cfe-commits
We currently have logic in the test suite that sets -ffp-contract=off on 
PowerPC (because the default for GCC and other compilers on PowerPC/Linux 
systems is essentially -ffp-contract=fast). We might just want to do this now 
for all platforms.
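
For readers wondering why contraction disturbs bit-exact tests, here is a minimal sketch with hypothetical helper names. A fused multiply-add rounds once, while a separate multiply and add rounds twice, so the two can differ in the last bits; a tolerance-based comparison accepts both.

```cpp
#include <cassert>
#include <cmath>

// Separate multiply and add: the product is rounded before the addition,
// unless the compiler contracts the expression into a fused multiply-add.
double mul_then_add(double a, double b, double c) { return a * b + c; }

// Explicit fused multiply-add: a single rounding at the end.
double fused(double a, double b, double c) { return std::fma(a, b, c); }

// Tolerance-based comparison, as an alternative to bit-exact checks.
bool nearly_equal(double x, double y, double tol) {
  assert(tol >= 0.0);
  return std::fabs(x - y) <= tol;
}
```

For example, with a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, so the fused result is -2^-54. Without contraction and under round-to-nearest, the product first rounds to 1 and the sum becomes 0. A test expecting either value bit-for-bit fails under the other setting, while nearly_equal with a small tolerance passes in both.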

 -Hal

- Original Message -
> From: "Steve Canon" 
> To: reviews+d24481+public+c0b8d50a92298...@reviews.llvm.org
> Cc: "a skolnik" , clatt...@apple.com, hfin...@anl.gov, 
> "yaxun liu" ,
> cfe-commits@lists.llvm.org
> Sent: Friday, September 23, 2016 4:47:32 PM
> Subject: Re: [PATCH] D24481: make “#pragma STDC FP_CONTRACT” on by default
> 
> Without digging into them yet, these are almost certainly caused by
> overly-sensitive tests that are erroneously expecting bit-exact
> results.
> 
> - Steve
> 
> Sent from my iPhone
> 
> > On Sep 23, 2016, at 4:42 PM, Renato Golin 
> > wrote:
> > 
> > rengolin added a subscriber: rengolin.
> > rengolin added a comment.
> > 
> > Folks, this commit has broken both AArch64 test-suite buildbots:
> > 
> > http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/3162
> > 
> > http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/10449
> > 
> > I have reverted in r282289, let me know if you need help testing on
> > AArch64.
> > 
> > 
> > Repository:
> >  rL LLVM
> > 
> > https://reviews.llvm.org/D24481
> > 
> > 
> > 
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: r281277 - [Sema] Fix PR30346: relax __builtin_object_size checks.

2016-09-24 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "George Burgess IV" 
> To: "Hal Finkel" 
> Cc: "Richard Smith" , "Joerg Sonnenberger" 
> , "cfe-commits"
> 
> Sent: Monday, September 19, 2016 11:21:33 PM
> Subject: Re: r281277 - [Sema] Fix PR30346: relax __builtin_object_size checks.
> 
> 
> WFM; I'll put together a patch that only allows this under
> -fno-strict-aliasing.
> 
> 
> I'm entirely unfamiliar with struct-path-tbaa, so Hal, do you see a
> reason why struct-path-tbaa wouldn't play nicely with flexible
> arrays at the end of types? Glancing at it, I don't think it should
> cause problems, but a more authoritative answer would really be
> appreciated. :) If it might cause issues now or in the future, I'm
> happy to be conservative here if -fno-struct-path-tbaa is given,
> too.

We currently don't emit struct-path-tbaa for array members. We likely should, 
and we'll need to keep flexible array members in mind when we implement that 
extension. I don't think that the current representation has a way to represent 
an unbounded size (except for using ((size_t) -1), which might be as good as 
anything else).

 -Hal

> 
> On Tue, Sep 13, 2016 at 2:00 PM, Joerg Sonnenberger via cfe-commits <
> cfe-commits@lists.llvm.org > wrote:
> 
> 
> On Tue, Sep 13, 2016 at 12:51:52PM -0700, Richard Smith wrote:
> > On Tue, Sep 13, 2016 at 10:44 AM, Joerg Sonnenberger via
> > cfe-commits <
> > cfe-commits@lists.llvm.org > wrote:
> > 
> > > IMO this should be restricted to code that explicitly disables
> > > C/C++
> > > aliasing rules.
> > 
> > 
> > Do you mean -fno-strict-aliasing or -fno-struct-path-tbaa or
> > something else
> > here? (I think we're not doing anyone any favours by making
> > _FORTIFY_SOURCE
> > say that a pattern is OK in cases when LLVM will in fact optimize
> > on the
> > assumption that it's UB, but I don't recall how aggressive
> > -fstruct-path-tbaa is for trailing array members.)
> 
> The former immediately, the latter potentially as well. I can't think
> of many use cases for this kind of idiom that don't involve type
> punning, and socket code is notoriously bad in that regard by necessity.
> 
> 
> 
> Joerg
> 
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: r282426 - CC1: Add -save-stats option

2016-09-26 Thread Hal Finkel via cfe-commits
Nice!

 -Hal

- Original Message -
> From: "Matthias Braun via cfe-commits" 
> To: cfe-commits@lists.llvm.org
> Sent: Monday, September 26, 2016 1:53:34 PM
> Subject: r282426 - CC1: Add -save-stats option
> 
> Author: matze
> Date: Mon Sep 26 13:53:34 2016
> New Revision: 282426
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=282426&view=rev
> Log:
> CC1: Add -save-stats option
> 
> This option behaves in a similar spirit as -save-temps and writes
> internal llvm statistics in json format to a file.
> 
> Differential Revision: https://reviews.llvm.org/D24820
> 
> Added:
> cfe/trunk/test/Driver/save-stats.c
> cfe/trunk/test/Frontend/stats-file.c
> Modified:
> cfe/trunk/docs/CommandGuide/clang.rst
> cfe/trunk/include/clang/Basic/DiagnosticFrontendKinds.td
> cfe/trunk/include/clang/Driver/CC1Options.td
> cfe/trunk/include/clang/Driver/Options.td
> cfe/trunk/include/clang/Frontend/FrontendOptions.h
> cfe/trunk/lib/Driver/Tools.cpp
> cfe/trunk/lib/Frontend/CompilerInstance.cpp
> cfe/trunk/lib/Frontend/CompilerInvocation.cpp
> cfe/trunk/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
> cfe/trunk/test/Misc/warning-flags.c
> 
> Modified: cfe/trunk/docs/CommandGuide/clang.rst
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/CommandGuide/clang.rst?rev=282426&r1=282425&r2=282426&view=diff
> ==
> --- cfe/trunk/docs/CommandGuide/clang.rst (original)
> +++ cfe/trunk/docs/CommandGuide/clang.rst Mon Sep 26 13:53:34 2016
> @@ -408,6 +408,12 @@ Driver Options
>  
>Save intermediate compilation results.
>  
> +.. option:: -save-stats, -save-stats=cwd, -save-stats=obj
> +
> +  Save internal code generation (LLVM) statistics to a file in the
> current
> +  directory (:option:`-save-stats`/:option:`-save-stats=cwd`) or the
> directory
> +  of the output file (:option:`-save-state=obj`).
> +
>  .. option:: -integrated-as, -no-integrated-as
>  
>Used to enable and disable, respectively, the use of the
>integrated
> 
> Modified: cfe/trunk/include/clang/Basic/DiagnosticFrontendKinds.td
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticFrontendKinds.td?rev=282426&r1=282425&r2=282426&view=diff
> ==
> --- cfe/trunk/include/clang/Basic/DiagnosticFrontendKinds.td
> (original)
> +++ cfe/trunk/include/clang/Basic/DiagnosticFrontendKinds.td Mon Sep
> 26 13:53:34 2016
> @@ -107,6 +107,8 @@ def warn_fe_cc_print_header_failure : Wa
>  "unable to open CC_PRINT_HEADERS file: %0 (using stderr)">;
>  def warn_fe_cc_log_diagnostics_failure : Warning<
>  "unable to open CC_LOG_DIAGNOSTICS file: %0 (using stderr)">;
> +def warn_fe_unable_to_open_stats_file : Warning<
> +"unable to open statistics output file '%0': '%1'">;
>  def err_fe_no_pch_in_dir : Error<
>  "no suitable precompiled header file found in directory '%0'">;
>  def err_fe_action_not_available : Error<
> 
> Modified: cfe/trunk/include/clang/Driver/CC1Options.td
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/CC1Options.td?rev=282426&r1=282425&r2=282426&view=diff
> ==
> --- cfe/trunk/include/clang/Driver/CC1Options.td (original)
> +++ cfe/trunk/include/clang/Driver/CC1Options.td Mon Sep 26 13:53:34
> 2016
> @@ -509,6 +509,8 @@ def arcmt_migrate : Flag<["-"], "arcmt-m
>  
>  def print_stats : Flag<["-"], "print-stats">,
>HelpText<"Print performance metrics and statistics">;
> +def stats_file : Joined<["-"], "stats-file=">,
> +  HelpText<"Filename to write statistics to">;
>  def fdump_record_layouts : Flag<["-"], "fdump-record-layouts">,
>HelpText<"Dump record layout information">;
>  def fdump_record_layouts_simple : Flag<["-"],
>  "fdump-record-layouts-simple">,
> 
> Modified: cfe/trunk/include/clang/Driver/Options.td
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=282426&r1=282425&r2=282426&view=diff
> ==
> --- cfe/trunk/include/clang/Driver/Options.td (original)
> +++ cfe/trunk/include/clang/Driver/Options.td Mon Sep 26 13:53:34
> 2016
> @@ -1876,6 +1876,11 @@ def save_temps_EQ : Joined<["-", "--"],
>  def save_temps : Flag<["-", "--"], "save-temps">,
>  Flags<[DriverOption]>,
>Alias, AliasArgs<["cwd"]>,
>HelpText<"Save intermediate compilation results">;
> +def save_stats_EQ : Joined<["-", "--"], "save-stats=">,
> Flags<[DriverOption]>,
> +  HelpText<"Save llvm statistics.">;
> +def save_stats : Flag<["-", "--"], "save-stats">,
> Flags<[DriverOption]>,
> +  Alias, AliasArgs<["cwd"]>,
> +  HelpText<"Save llvm statistics.">;
>  def via_file_asm : Flag<["-", "--"], "via-file-asm">,
>  InternalDebugOpt,
>HelpText<"Write assembly to file for input to assemble jobs">;
>

Re: [PATCH] D18172: [CUDA][OpenMP] Add a generic offload action builder

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

A nice abstraction and cleanup. LGTM.



Comment at: lib/Driver/Driver.cpp:1625
@@ +1624,3 @@
+  // architecture. If we are in host-only mode we return 'success' so that
+  // the host use the CUDA offload kind.
+  if (auto *IA = dyn_cast(HostAction)) {

use -> uses


https://reviews.llvm.org/D18172





Re: [PATCH] D21840: [Driver][CUDA][OpenMP] Reimplement tool selection in the driver.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

The naming here is a bit hard to follow: we have 'dependent action', 
'dependency action', and 'depending action', and I think they're all supposed to 
mean the same thing. Only 'dependent action' sounds right to me; can we use 
that universally (i.e. in all comments and names of functions and variables)?



Comment at: lib/Driver/Driver.cpp:2394
@@ +2393,3 @@
+Action *CurAction = *Inputs.begin();
+if (!CurAction->isCollapsingWithDependingActionLegal() && CanBeCollapsed)
+  return nullptr;

As a micro-optimization, check CanBeCollapsed first, then call the function:

  if (CanBeCollapsed && !CurAction->isCollapsingWithDependingActionLegal())
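
The effect of the suggested reordering can be sketched with a stand-in type that counts calls. FakeAction and the two helper functions are hypothetical and exist only for this illustration; they are not part of the driver.

```cpp
#include <cassert>

// Hypothetical stand-in for an Action; it counts how often the legality
// check is invoked.
struct FakeAction {
  mutable int calls = 0;
  bool legal = true;
  bool isCollapsingWithDependingActionLegal() const {
    ++calls;
    return legal;
  }
};

// Order as written in the patch: the member function is always called.
bool rejectCallFirst(const FakeAction &A, bool CanBeCollapsed) {
  return !A.isCollapsingWithDependingActionLegal() && CanBeCollapsed;
}

// Suggested order: a false flag short-circuits and skips the call.
bool rejectFlagFirst(const FakeAction &A, bool CanBeCollapsed) {
  return CanBeCollapsed && !A.isCollapsingWithDependingActionLegal();
}

void check_short_circuit() {
  FakeAction A;
  bool r1 = rejectCallFirst(A, false);
  assert(!r1 && A.calls == 1);  // call happened even though flag was false
  FakeAction B;
  bool r2 = rejectFlagFirst(B, false);
  assert(!r2 && B.calls == 0);  // call skipped
}
```

Both orders return the same result; they differ only in whether the call is made when the flag is already false.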



Comment at: lib/Driver/Driver.cpp:2408
@@ +2407,3 @@
+  if (!CurAction->isCollapsingWithDependingActionLegal() &&
+  CanBeCollapsed)
+return nullptr;

  if (CanBeCollapsed && !CurAction->isCollapsingWithDependingActionLegal())


Comment at: lib/Driver/Driver.cpp:2415
@@ +2414,3 @@
+CurAction = OA->getHostDependence();
+if (!CurAction->isCollapsingWithDependingActionLegal() &&
+CanBeCollapsed)

  if (CanBeCollapsed && !CurAction->isCollapsingWithDependingActionLegal())


Comment at: lib/Driver/Driver.cpp:2444
@@ +2443,3 @@
+  /// collapsed with it.
+  struct JobActionInfoTy final {
+/// The action this info refers to.

Putting "Ty" on the end of a type name seems unusual for our code base (we 
generally use that for typedefs or for variables that represent types of other 
entities). Just JobActionInfo should be fine.


Comment at: lib/Driver/Driver.cpp:2474
@@ +2473,3 @@
+  const Tool *
+  attemptCombineAssembleBackendCompile(ArrayRef ActionInfo,
+   const ActionList *&Inputs,

I don't think we need 'attempt' in the name here, just make this:

  combineAssembleBackendCompile


Comment at: lib/Driver/Driver.cpp:2507
@@ +2506,3 @@
+  const Tool *
+  attemptCombineAssembleBackend(ArrayRef ActionInfo,
+const ActionList *&Inputs,

We don't need 'attempt' in the name here either.


Comment at: lib/Driver/Driver.cpp:2540
@@ -2473,1 +2539,3 @@
   }
+  const Tool *attemptCombineBackendCompile(ArrayRef 
ActionInfo,
+   const ActionList *&Inputs,

  combineBackendCompile


Comment at: lib/Driver/Driver.cpp:2568
@@ +2567,3 @@
+  /// are appended to \a CollapsedOffloadAction.
+  void attemptCombineWithPreprocess(const Tool *T, const ActionList *&Inputs,
+ActionList &CollapsedOffloadAction) {

combineWithPreprocessor


Comment at: lib/Driver/Driver.cpp:2595
@@ +2594,3 @@
+
+  /// Check if a chain of action can be combined and return the tool that can
+  /// handle the combination of actions. The pointer to the current inputs \a

action -> actions


Comment at: lib/Driver/Driver.cpp:2632
@@ +2631,3 @@
+
+if (!T)
+  T = attemptCombineAssembleBackendCompile(ActionChain, Inputs,

I don't think the syntactic regularity here is helpful enough to justify this 
extra if. Just do:

  const Tool *T = combineAssembleBackendCompile(ActionChain, Inputs,
CollapsedOffloadAction);



https://reviews.llvm.org/D21840





Re: [PATCH] D21843: [Driver][OpenMP] Create tool chains for OpenMP offloading kind.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: include/clang/Basic/DiagnosticDriverKinds.td:163
@@ +162,3 @@
+def err_drv_expecting_fopenmp_with_fopenmp_targets : Error<
+  "The option -fopenmp-targets must be used in conjunction with a -fopenmp 
option compatible with offloading.">;
+def warn_drv_omp_offload_target_duplicate : Warning<

This message does not tell the user how they might make their -fopenmp option 
"compatible with offloading." Please make sure the message does, or that it has an 
associated hint message which does.



https://reviews.llvm.org/D21843





Re: [PATCH] D21845: [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: lib/Driver/Driver.cpp:1836
@@ +1835,3 @@
+ActionBuilderReturnCode
+getDeviceDepences(OffloadAction::DeviceDependences &DA, phases::ID 
CurPhase,
+  phases::ID FinalPhase, PhasesTy &Phases) override {

Depences - Spelling?


Comment at: lib/Driver/Driver.cpp:1854
@@ +1853,3 @@
+
+// We passed the device action to a host dependence, so we don't need 
to
+// do anything else with them.

to a -> as a


Comment at: lib/Driver/Driver.cpp:1879
@@ +1878,3 @@
+  // When generating code for OpenMP we use the host compile phase result 
as
+  // dependence to the device compile phase so that it can learn what
+  // declaration should be emitted. However, this is not the only use for

as dependence -> as a dependence (or as the dependence)


Comment at: lib/Driver/Driver.cpp:1880
@@ +1879,3 @@
+  // dependence to the device compile phase so that it can learn what
+  // declaration should be emitted. However, this is not the only use for
+  // the host action, so we have prevent it from being collapsed.

declaration -> declarations


Comment at: lib/Driver/Driver.cpp:1881
@@ +1880,3 @@
+  // declaration should be emitted. However, this is not the only use for
+  // the host action, so we have prevent it from being collapsed.
+  if (isa(HostAction)) {

have prevent -> prevent


Comment at: lib/Driver/Driver.cpp:1918
@@ +1917,3 @@
+  // Get the OpenMP toolchains. If we don't get any, the action builder 
will
+  // know there is nothing to do related with OpenMP offloading.
+  auto OpenMPTCRange = C.getOffloadToolChains();

related with -> related to


Comment at: lib/Driver/Driver.cpp:1949
@@ -1837,1 +1948,3 @@
+SpecializedBuilders.push_back(new OpenMPActionBuilder(C, Args, Inputs));
+
 //

Since we can have both OpenMP offloading and CUDA, please add a test that the 
phases work correctly for that case (or that we produce an error if that can't 
currently work correctly).


https://reviews.llvm.org/D21845





Re: [PATCH] D21847: [Driver][OpenMP] Build jobs for OpenMP offloading actions for targets using gcc tool chains.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: lib/Driver/Tools.cpp:243
@@ +242,3 @@
+// ignore inputs that refer to OpenMP offloading devices - they will be
+// embedded recurring to a proper linker script.
+if (auto *IA = II.getAction())

recurring -> according


Comment at: lib/Driver/Tools.cpp:334
@@ +333,3 @@
+  LksStream << "  OpenMP Offload Linker Script.\n";
+  LksStream << "*/\n";
+  LksStream << "TARGET(binary)\n";

We should also say 'autogenerated' somewhere in this comment.


Comment at: lib/Driver/Tools.cpp:386
@@ +385,3 @@
+  // Dump the contents of the linker script if the user requested that.
+  if (C.getArgs().hasArg(options::OPT_fopenmp_dump_offload_linker_script))
+llvm::errs() << LksBuffer;

I don't see why this is needed if we have -save-temps - I think we should 
remove this option entirely.


https://reviews.llvm.org/D21847





Re: [PATCH] D21848: [Driver][OpenMP] Add logic for offloading-specific argument translation.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: lib/Driver/ToolChains.cpp:2834
@@ +2833,3 @@
+  // If this tool chain is used for an OpenMP offloading device we have to make
+  // sure we always generate a shared library regardless the commands the user
+  // passed to the host. This is required because the runtime library requires

regardless the -> regardless of the


Comment at: lib/Driver/ToolChains.cpp:2836
@@ +2835,3 @@
+  // passed to the host. This is required because the runtime library requires
+  // to load the device image dynamically at run time.
+  if (DeviceOffloadKind == Action::OFK_OpenMP) {

requires to load -> is required to load


Comment at: lib/Driver/ToolChains.cpp:2854
@@ +2853,3 @@
+  case options::OPT_shared:
+  case options::OPT_static:
+  case options::OPT_fPIC:

And also?

  case options::OPT_dynamic:


https://reviews.llvm.org/D21848





Re: [PATCH] D21852: [Driver][OpenMP] Update actions builder to create bundling action when necessary.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D21852





Re: [PATCH] D21853: [Driver][OpenMP] Update actions builder to create unbundling action when necessary.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: include/clang/Driver/Action.h:504
@@ +503,3 @@
+  /// unbundling action.
+  struct DependingActionInfoTy final {
+/// \brief The tool chain of the depending action.

Don't need 'Ty' in the name of this struct.


Comment at: include/clang/Driver/Types.h:81
@@ +80,3 @@
+  /// isSrcFile - Is this a source file, i.e. something that still has to be
+  /// preprocessed. The logic behind this is the same that decides the first
+  /// compilation phase is a preprocessor one.

decides the first -> decides if the first


Comment at: include/clang/Driver/Types.h:82
@@ +81,3 @@
+  /// preprocessed. The logic behind this is the same that decides the first
+  /// compilation phase is a preprocessor one.
+  bool isSrcFile(ID Id);

preprocessor one -> preprocessing one


Comment at: lib/Driver/Driver.cpp:2091
@@ +2090,3 @@
+InputArg->getOption().getKind() == llvm::opt::Option::InputClass &&
+!types::isSrcFile(HostAction->getType())) {
+  auto UnbundlingHostAction =

This checks that the file needs to be preprocessed. What does preprocessing 
have to do with this? I don't imagine that providing a preprocessed source file 
as input should invoke the unbundler.


Comment at: test/Driver/openmp-offload.c:274
@@ +273,3 @@
+/// Check separate compilation with offloading - unbundling actions
+// RUN:   touch %t.i
+// RUN:   %clang -### -ccc-print-phases -fopenmp -o %t.out -lsomelib -target powerpc64le-linux -fopenmp-targets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.i 2>&1 \

Oh, are you using .i to indicate a bundle instead of a preprocessed file? Don't 
do that. Please use a different suffix -- the bundler has its own file format.


https://reviews.llvm.org/D21853





Re: [PATCH] D21856: [Driver][OpenMP] Add support to create jobs for bundling actions.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D21856





Re: [PATCH] D21857: [Driver][OpenMP] Add support to create jobs for unbundling actions.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel accepted this revision.
hfinkel added a comment.
This revision is now accepted and ready to land.

LGTM


https://reviews.llvm.org/D21857





Re: [PATCH] D21853: [Driver][OpenMP] Update actions builder to create unbundling action when necessary.

2016-09-28 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: lib/Driver/Driver.cpp:2091
@@ +2090,3 @@
+InputArg->getOption().getKind() == llvm::opt::Option::InputClass &&
+!types::isSrcFile(HostAction->getType())) {
+  auto UnbundlingHostAction =

hfinkel wrote:
> This checks that the file needs to be preprocessed. What does preprocessing 
> have to do with this? I don't imagine that providing a preprocessed source 
> file as input should invoke the unbundler.
On second thought, this is okay. It does not make sense to have a non-bundled 
preprocessed source for the input there, as the host and device compilation 
don't share a common preprocessor state.

We do need to be careful, perhaps, about .s files (which don't need 
preprocessing as .S files do) -- we should probably assume that all non-bundled 
.s files are host assembly code.


Comment at: test/Driver/openmp-offload.c:274
@@ +273,3 @@
+/// Check separate compilation with offloading - unbundling actions
+// RUN:   touch %t.i
+// RUN:   %clang -### -ccc-print-phases -fopenmp -o %t.out -lsomelib -target powerpc64le-linux -fopenmp-targets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.i 2>&1 \

hfinkel wrote:
> Oh, are you using .i to indicate a bundle instead of a preprocessed file? 
> Don't do that. Please use a different suffix -- the bundler has its own file 
> format.
Never mind; this is okay too.


https://reviews.llvm.org/D21853





Re: [PATCH] D24481: make “#pragma STDC FP_CONTRACT” on by default

2016-09-28 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Matthias Braun" 
> To: "Hal Finkel" 
> Cc: "Steve Canon" , "yaxun liu" , "a 
> skolnik" ,
> cfe-commits@lists.llvm.org, clatt...@apple.com, 
> reviews+d24481+public+c0b8d50a92298...@reviews.llvm.org
> Sent: Wednesday, September 28, 2016 2:55:44 PM
> Subject: Re: [PATCH] D24481: make “#pragma STDC FP_CONTRACT” on by default
> 
> 
> If we do this we should at least be targeted and restrict it to the
> tests that need the flag. For example you could put:
> list(APPEND CFLAGS -ffp-contract=off) to
> MultiSource/Applications/oggenc/CMakeLists.txt
> and
> CFLAGS += -ffp-contract=off into
> MultiSource/Applications/oggenc/Makefile
> 

I think this is a reasonable idea, with the understanding that this is going to 
turn into somewhat of a game of Whac-A-Mole (because you'll potentially get 
different FMAs on different architectures, and so the set of affected programs 
will be somewhat architecture dependent). Nevertheless, it shouldn't be too bad.
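
To make the FMA point concrete, here is a minimal sketch (not taken from the test suite; the function names are made up for illustration) of how a fused multiply-add can change a result compared with a separately rounded multiply and add. The volatile store is only there to keep the compiler from contracting the unfused version.

```cpp
#include <cmath>

// Unfused: the product is rounded to double before the addition.
// The volatile store keeps the compiler from contracting a * b + c
// into a single fused operation.
inline double mul_add_unfused(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused: std::fma computes a * b + c with a single rounding at the end.
inline double mul_add_fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 in double precision. The unfused version therefore returns 0, while the fused version returns -2^-54. A test that compares output bit-for-bit will see that difference whenever the compiler chooses to contract.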

 -Hal

> 
> that at least forces us to think about why a specific benchmark fails
> and maybe we can find a way to use fpcmp or set an
> absolute/relative tolerance when comparing the results
> (though admittedly I don't see how we can do that in a case of
> oggenc where a .ogg file is produced).
> 
> 
> - Matthias
> 
> 
> 
> 
> 
> On Sep 23, 2016, at 2:53 PM, Hal Finkel via cfe-commits <
> cfe-commits@lists.llvm.org > wrote:
> 
> We currently have logic in the test suite that sets -ffp-contract=off
> on PowerPC (because the default for GCC and other compilers on
> PowerPC/Linux systems is essentially -ffp-contract=fast). We might
> just want to do this now for all platforms.
> 
> -Hal
> 
> - Original Message -
> 
> 
> From: "Steve Canon" < sca...@apple.com >
> To: reviews+d24481+public+c0b8d50a92298...@reviews.llvm.org
> Cc: "a skolnik" < a.skol...@samsung.com >, clatt...@apple.com ,
> hfin...@anl.gov , "yaxun liu" < yaxun@amd.com >,
> cfe-commits@lists.llvm.org
> Sent: Friday, September 23, 2016 4:47:32 PM
> Subject: Re: [PATCH] D24481: make “#pragma STDC FP_CONTRACT” on by
> default
> 
> Without digging into them yet, these are almost certainly caused by
> overly-sensitive tests that are erroneously expecting bit-exact
> results.
> 
> - Steve
> 
> Sent from my iPhone
> 
> 
> 
> On Sep 23, 2016, at 4:42 PM, Renato Golin < renato.go...@linaro.org >
> wrote:
> 
> rengolin added a subscriber: rengolin.
> rengolin added a comment.
> 
> Folks, this commit has broken both AArch64 test-suite buildbots:
> 
> http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/3162
> 
> http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/10449
> 
> I have reverted in r282289, let me know if you need help testing on
> AArch64.
> 
> 
> Repository:
> rL LLVM
> 
> https://reviews.llvm.org/D24481
> 
> 
> 
> 
> 
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


[PATCH] D18639: Use __builtin_isnan/isinf/isfinite in complex

2016-10-01 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D18639#545697, @EricWF wrote:

> LGTM.
>
> In https://reviews.llvm.org/D18639#514991, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D18639#491232, @mclow.lists wrote:
> >
> > > And is there any reason why `__libcpp_isinf` can't just return `false` 
> > > for non-fp types?
> >
> >
> > For custom numeric types that have an isinf, etc. found by ADL, they should 
> > continue to work.
>
>
> Do we already support custom numeric types? If so could you add a test for 
> this under `test/libcxx`? Just a simple test case that instantiates the 
> functions and shows it compiles.


As it turns out, the answer is: mostly. We should discuss this more: PR30589.


https://reviews.llvm.org/D18639





[libcxx] r283051 - Use __builtin_isnan/isinf/isfinite in complex

2016-10-01 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Oct  1 15:38:31 2016
New Revision: 283051

URL: http://llvm.org/viewvc/llvm-project?rev=283051&view=rev
Log:
Use __builtin_isnan/isinf/isfinite in complex

The libc-provided isnan/isinf/isfinite macro implementations are specifically
designed to function correctly, even in the presence of -ffast-math (or, more
specifically, -ffinite-math-only). As such, on most implementations, these
either always turn into external function calls (e.g. glibc) or are
specifically function calls when FINITE_MATH_ONLY is defined (e.g. Darwin).

Our implementation of complex arithmetic makes heavy use of isnan/isinf/isfinite
to deal with corner cases involving non-finite quantities. This was problematic
in two respects:

  1. On systems where these are always function calls (e.g. Linux/glibc), there
     was a performance penalty.
  2. When compiling with -ffast-math, there was a significant performance
     penalty (in fact, on Darwin and systems with similar implementations, the
     code may be slower than not using -ffast-math, because the inline
     definitions provided by libc become unavailable to prevent the checks from
     being optimized out).

Eliding these inf/nan checks in -ffast-math mode is consistent with what
happens with libstdc++, and in my experience, what users expect. This is
critical to getting high-performance code when using complex. This change
replaces uses of those functions on basic floating-point types with calls to
__builtin_isnan/isinf/isfinite, which Clang will always expand inline. When
using -ffast-math (or -ffinite-math-only), the optimizer will remove the checks
as expected.
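
The dispatch described above can be sketched in standalone form as follows. This is an illustration only, not the libc++ code: the sketch and custom namespaces, is_nan, and the Fixed type are hypothetical names, and the rule that a negative raw value means "not a number" is invented for the example.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

#ifndef __has_builtin
#define __has_builtin(x) 0  // compilers without __has_builtin use the fallback
#endif

namespace sketch {

// Basic floating-point types: prefer the compiler builtin, which is expanded
// inline and can be folded away under -ffinite-math-only.
template <class T>
typename std::enable_if<std::is_floating_point<T>::value, bool>::type
is_nan(T x) noexcept {
#if __has_builtin(__builtin_isnan)
  return __builtin_isnan(x);
#else
  return std::isnan(x);
#endif
}

// Everything else: an unqualified call, so that an isnan found by
// argument-dependent lookup for a custom numeric type keeps working.
template <class T>
typename std::enable_if<!std::is_floating_point<T>::value, bool>::type
is_nan(T x) noexcept {
  using std::isnan;
  return isnan(x);
}

}  // namespace sketch

namespace custom {

// Hypothetical fixed-point type; a negative raw value marks "not a number".
struct Fixed {
  int raw;
};

inline bool isnan(Fixed f) noexcept { return f.raw < 0; }

}  // namespace custom
```

Calling sketch::is_nan on a double takes the builtin path, while calling it on custom::Fixed resolves to custom::isnan through ADL, which is the behaviour the review asked to preserve for custom numeric types.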

Differential Revision: https://reviews.llvm.org/D18639

Modified:
libcxx/trunk/include/cmath
libcxx/trunk/include/complex

Modified: libcxx/trunk/include/cmath
URL: 
http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/cmath?rev=283051&r1=283050&r2=283051&view=diff
==
--- libcxx/trunk/include/cmath (original)
+++ libcxx/trunk/include/cmath Sat Oct  1 15:38:31 2016
@@ -558,6 +558,66 @@ hypot(_A1 __lcpp_x, _A2 __lcpp_y, _A3 __
 }
 #endif
 
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isnan(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isnan)
+return __builtin_isnan(__lcpp_x);
+#else
+return isnan(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isnan(_A1 __lcpp_x) _NOEXCEPT
+{
+return isnan(__lcpp_x);
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isinf(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isinf)
+return __builtin_isinf(__lcpp_x);
+#else
+return isinf(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isinf(_A1 __lcpp_x) _NOEXCEPT
+{
+return isinf(__lcpp_x);
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isfinite(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isfinite)
+return __builtin_isfinite(__lcpp_x);
+#else
+return isfinite(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isfinite(_A1 __lcpp_x) _NOEXCEPT
+{
+return isfinite(__lcpp_x);
+}
+
 _LIBCPP_END_NAMESPACE_STD
 
 #endif  // _LIBCPP_CMATH

Modified: libcxx/trunk/include/complex
URL: 
http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/complex?rev=283051&r1=283050&r2=283051&view=diff
==
--- libcxx/trunk/include/complex (original)
+++ libcxx/trunk/include/complex Sat Oct  1 15:38:31 2016
@@ -599,39 +599,39 @@ operator*(const complex<_Tp>& __z, const
 _Tp __bc = __b * __c;
 _Tp __x = __ac - __bd;
 _Tp __y = __ad + __bc;
-if (isnan(__x) && isnan(__y))
+if (__libcpp_isnan(__x) && __libcpp_isnan(__y))
 {
 bool __recalc = false;
-if (isinf(__a) || isinf(__b))
+if (__libcpp_isinf(__a) || __libcpp_isinf(__b))
 {
-__a = copysign(isinf(__a) ? _Tp(1) : _Tp(0), __a);
-__b = copysign(isinf(__b) ? _Tp(1) : _Tp(0), __b);
-if (isnan(__c))
+__a = copysign(__libcpp_isinf(__a) ? _Tp(1) : _Tp(0), __a);
+__b = copysign(__libcpp_isinf(__b) ? _Tp(1) : _Tp(0), __b);
+if (__libcpp_isnan(__c))
 __c = copysign(_Tp(0), __c);
-if (isnan(__d))
+if (__libcpp_isnan(__d))
 __d = copysign(_Tp(0), __d);
 __recalc = true;
 }
-if (isinf(__c) || isinf(__d))
+if (__libcpp_isinf(__c) || __libcpp_isinf(__d))
 {
-__c = copysign(isinf(__c) ? _Tp(1) : _Tp(0), __c);
-__d = copysign(isinf(__d) ? _Tp(1) : _Tp(0), __d);
-if (isnan(__a))
+__c = copysign(__libcpp_isinf(__c

[PATCH] D18639: Use __builtin_isnan/isinf/isfinite in complex

2016-10-01 Thread Hal Finkel via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283051: Use __builtin_isnan/isinf/isfinite in complex 
(authored by hfinkel).

Changed prior to commit:
  https://reviews.llvm.org/D18639?vs=67992&id=73202#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D18639

Files:
  libcxx/trunk/include/cmath
  libcxx/trunk/include/complex

Index: libcxx/trunk/include/cmath
===
--- libcxx/trunk/include/cmath
+++ libcxx/trunk/include/cmath
@@ -558,6 +558,66 @@
 }
 #endif
 
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isnan(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isnan)
+return __builtin_isnan(__lcpp_x);
+#else
+return isnan(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isnan(_A1 __lcpp_x) _NOEXCEPT
+{
+return isnan(__lcpp_x);
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isinf(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isinf)
+return __builtin_isinf(__lcpp_x);
+#else
+return isinf(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isinf(_A1 __lcpp_x) _NOEXCEPT
+{
+return isinf(__lcpp_x);
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<is_floating_point<_A1>::value, bool>::type
+__libcpp_isfinite(_A1 __lcpp_x) _NOEXCEPT
+{
+#if __has_builtin(__builtin_isfinite)
+return __builtin_isfinite(__lcpp_x);
+#else
+return isfinite(__lcpp_x);
+#endif
+}
+
+template <class _A1>
+_LIBCPP_ALWAYS_INLINE
+typename enable_if<!is_floating_point<_A1>::value, bool>::type
+__libcpp_isfinite(_A1 __lcpp_x) _NOEXCEPT
+{
+return isfinite(__lcpp_x);
+}
+
 _LIBCPP_END_NAMESPACE_STD
 
 #endif  // _LIBCPP_CMATH
Index: libcxx/trunk/include/complex
===
--- libcxx/trunk/include/complex
+++ libcxx/trunk/include/complex
@@ -599,39 +599,39 @@
 _Tp __bc = __b * __c;
 _Tp __x = __ac - __bd;
 _Tp __y = __ad + __bc;
-if (isnan(__x) && isnan(__y))
+if (__libcpp_isnan(__x) && __libcpp_isnan(__y))
 {
 bool __recalc = false;
-if (isinf(__a) || isinf(__b))
+if (__libcpp_isinf(__a) || __libcpp_isinf(__b))
 {
-__a = copysign(isinf(__a) ? _Tp(1) : _Tp(0), __a);
-__b = copysign(isinf(__b) ? _Tp(1) : _Tp(0), __b);
-if (isnan(__c))
+__a = copysign(__libcpp_isinf(__a) ? _Tp(1) : _Tp(0), __a);
+__b = copysign(__libcpp_isinf(__b) ? _Tp(1) : _Tp(0), __b);
+if (__libcpp_isnan(__c))
 __c = copysign(_Tp(0), __c);
-if (isnan(__d))
+if (__libcpp_isnan(__d))
 __d = copysign(_Tp(0), __d);
 __recalc = true;
 }
-if (isinf(__c) || isinf(__d))
+if (__libcpp_isinf(__c) || __libcpp_isinf(__d))
 {
-__c = copysign(isinf(__c) ? _Tp(1) : _Tp(0), __c);
-__d = copysign(isinf(__d) ? _Tp(1) : _Tp(0), __d);
-if (isnan(__a))
+__c = copysign(__libcpp_isinf(__c) ? _Tp(1) : _Tp(0), __c);
+__d = copysign(__libcpp_isinf(__d) ? _Tp(1) : _Tp(0), __d);
+if (__libcpp_isnan(__a))
 __a = copysign(_Tp(0), __a);
-if (isnan(__b))
+if (__libcpp_isnan(__b))
 __b = copysign(_Tp(0), __b);
 __recalc = true;
 }
-if (!__recalc && (isinf(__ac) || isinf(__bd) ||
-  isinf(__ad) || isinf(__bc)))
+if (!__recalc && (__libcpp_isinf(__ac) || __libcpp_isinf(__bd) ||
+  __libcpp_isinf(__ad) || __libcpp_isinf(__bc)))
 {
-if (isnan(__a))
+if (__libcpp_isnan(__a))
 __a = copysign(_Tp(0), __a);
-if (isnan(__b))
+if (__libcpp_isnan(__b))
 __b = copysign(_Tp(0), __b);
-if (isnan(__c))
+if (__libcpp_isnan(__c))
 __c = copysign(_Tp(0), __c);
-if (isnan(__d))
+if (__libcpp_isnan(__d))
 __d = copysign(_Tp(0), __d);
 __recalc = true;
 }
@@ -674,33 +674,33 @@
 _Tp __c = __w.real();
 _Tp __d = __w.imag();
 _Tp __logbw = logb(fmax(fabs(__c), fabs(__d)));
-if (isfinite(__logbw))
+if (__libcpp_isfinite(__logbw))
 {
         __ilogbw = static_cast<int>(__logbw);
 __c = scalbn(__c, -__ilogbw);
 __d = scalbn(__d, -__ilogbw);
 }
 _Tp __denom = __c * __c + __d * __d;
 _Tp __x = scalbn((__a * __c + __b * __d) / __denom, -__ilogbw);
 _Tp __y = scalbn((__b * __c - __a * __d) / __denom, -__ilogbw);
-if (isnan(__x) && isnan(__y))
+if (__libcpp_isnan(__x) && __libcpp_isnan(__y))
 {
-if ((__denom == _Tp(0)) && (!isnan(__a) || !isnan(__b)))
+if ((__denom == _Tp(0)) && (!__l

[libcxx] r283052 - Remove some additional unnecessary std:: in cmath

2016-10-01 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Oct  1 15:38:44 2016
New Revision: 283052

URL: http://llvm.org/viewvc/llvm-project?rev=283052&view=rev
Log:
Remove some additional unnecessary std:: in cmath

Unlike in math.h, as Eric pointed out in the review of D18639, we don't need
the std:: in cmath.

Modified:
libcxx/trunk/include/cmath

Modified: libcxx/trunk/include/cmath
URL: 
http://llvm.org/viewvc/llvm-project/libcxx/trunk/include/cmath?rev=283052&r1=283051&r2=283052&view=diff
==
--- libcxx/trunk/include/cmath (original)
+++ libcxx/trunk/include/cmath Sat Oct  1 15:38:44 2016
@@ -541,19 +541,19 @@ inline _LIBCPP_INLINE_VISIBILITY long do
 
 template <class _A1, class _A2, class _A3>
 inline _LIBCPP_INLINE_VISIBILITY
-typename std::__lazy_enable_if
+typename __lazy_enable_if
 <
-std::is_arithmetic<_A1>::value &&
-std::is_arithmetic<_A2>::value &&
-std::is_arithmetic<_A3>::value,
-std::__promote<_A1, _A2, _A3>
+is_arithmetic<_A1>::value &&
+is_arithmetic<_A2>::value &&
+is_arithmetic<_A3>::value,
+__promote<_A1, _A2, _A3>
 >::type
 hypot(_A1 __lcpp_x, _A2 __lcpp_y, _A3 __lcpp_z) _NOEXCEPT
 {
-typedef typename std::__promote<_A1, _A2, _A3>::type __result_type;
-static_assert((!(std::is_same<_A1, __result_type>::value &&
- std::is_same<_A2, __result_type>::value &&
- std::is_same<_A3, __result_type>::value)), "");
+typedef typename __promote<_A1, _A2, _A3>::type __result_type;
+static_assert((!(is_same<_A1, __result_type>::value &&
+ is_same<_A2, __result_type>::value &&
+ is_same<_A3, __result_type>::value)), "");
 return hypot((__result_type)__lcpp_x, (__result_type)__lcpp_y, (__result_type)__lcpp_z);
 }
 #endif




r283061 - [PowerPC] Enable soft-float for PPC64, and +soft-float -> -hard-float

2016-10-01 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Oct  1 21:10:45 2016
New Revision: 283061

URL: http://llvm.org/viewvc/llvm-project?rev=283061&view=rev
Log:
[PowerPC] Enable soft-float for PPC64, and +soft-float -> -hard-float

Enable soft-float support on PPC64, as the backend now supports it. Also, the
backend now uses -hard-float instead of +soft-float, so set the target features
accordingly.
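
As a rough illustration of the feature mapping described above, here is a standalone sketch. It is not the driver code: the FloatABI enum and ppcFloatFeatures are hypothetical stand-ins, and the only detail taken from this log is that a soft float ABI now maps to the "-hard-float" target feature rather than "+soft-float".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's float ABI setting.
enum class FloatABI { Hard, Soft };

// Returns the target feature strings implied by the float ABI.
// Assumption: soft float is expressed by disabling hard-float, and the
// hard float ABI needs no extra feature string.
inline std::vector<std::string> ppcFloatFeatures(FloatABI abi) {
  std::vector<std::string> features;
  if (abi == FloatABI::Soft)
    features.push_back("-hard-float");
  return features;
}
```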

Fixes PR26970.

Added:
cfe/trunk/test/CodeGen/ppc64-soft-float.c
Modified:
cfe/trunk/lib/CodeGen/TargetInfo.cpp
cfe/trunk/lib/Driver/Tools.cpp
cfe/trunk/test/Driver/ppc-features.cpp

Modified: cfe/trunk/lib/CodeGen/TargetInfo.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/TargetInfo.cpp?rev=283061&r1=283060&r2=283061&view=diff
==
--- cfe/trunk/lib/CodeGen/TargetInfo.cpp (original)
+++ cfe/trunk/lib/CodeGen/TargetInfo.cpp Sat Oct  1 21:10:45 2016
@@ -3899,6 +3899,7 @@ private:
   static const unsigned GPRBits = 64;
   ABIKind Kind;
   bool HasQPX;
+  bool IsSoftFloatABI;
 
   // A vector of float or double will be promoted to <4 x f32> or <4 x f64> and
   // will be passed in a QPX register.
@@ -3929,8 +3930,10 @@ private:
   }
 
 public:
-  PPC64_SVR4_ABIInfo(CodeGen::CodeGenTypes &CGT, ABIKind Kind, bool HasQPX)
-  : ABIInfo(CGT), Kind(Kind), HasQPX(HasQPX) {}
+  PPC64_SVR4_ABIInfo(CodeGen::CodeGenTypes &CGT, ABIKind Kind, bool HasQPX,
+ bool SoftFloatABI)
+  : ABIInfo(CGT), Kind(Kind), HasQPX(HasQPX),
+IsSoftFloatABI(SoftFloatABI) {}
 
   bool isPromotableTypeForABI(QualType Ty) const;
   CharUnits getParamTypeAlignment(QualType Ty) const;
@@ -3978,8 +3981,10 @@ class PPC64_SVR4_TargetCodeGenInfo : pub
 
 public:
   PPC64_SVR4_TargetCodeGenInfo(CodeGenTypes &CGT,
-   PPC64_SVR4_ABIInfo::ABIKind Kind, bool HasQPX)
-  : TargetCodeGenInfo(new PPC64_SVR4_ABIInfo(CGT, Kind, HasQPX)) {}
+   PPC64_SVR4_ABIInfo::ABIKind Kind, bool HasQPX,
+   bool SoftFloatABI)
+  : TargetCodeGenInfo(new PPC64_SVR4_ABIInfo(CGT, Kind, HasQPX,
+ SoftFloatABI)) {}
 
   int getDwarfEHStackPointer(CodeGen::CodeGenModule &M) const override {
 // This is recovered from gcc output.
@@ -4197,8 +4202,11 @@ bool PPC64_SVR4_ABIInfo::isHomogeneousAg
   if (const BuiltinType *BT = Ty->getAs()) {
 if (BT->getKind() == BuiltinType::Float ||
 BT->getKind() == BuiltinType::Double ||
-BT->getKind() == BuiltinType::LongDouble)
+BT->getKind() == BuiltinType::LongDouble) {
+  if (IsSoftFloatABI)
+return false;
   return true;
+}
   }
   if (const VectorType *VT = Ty->getAs()) {
 if (getContext().getTypeSize(VT) == 128 || IsQPXVectorTy(Ty))
@@ -8107,8 +8115,10 @@ const TargetCodeGenInfo &CodeGenModule::
   if (getTarget().getABI() == "elfv2")
 Kind = PPC64_SVR4_ABIInfo::ELFv2;
   bool HasQPX = getTarget().getABI() == "elfv1-qpx";
+  bool IsSoftFloat = CodeGenOpts.FloatABI == "soft";
 
-  return SetCGInfo(new PPC64_SVR4_TargetCodeGenInfo(Types, Kind, HasQPX));
+  return SetCGInfo(new PPC64_SVR4_TargetCodeGenInfo(Types, Kind, HasQPX,
+IsSoftFloat));
 } else
   return SetCGInfo(new PPC64TargetCodeGenInfo(Types));
   case llvm::Triple::ppc64le: {
@@ -8117,8 +8127,10 @@ const TargetCodeGenInfo &CodeGenModule::
 if (getTarget().getABI() == "elfv1" || getTarget().getABI() == "elfv1-qpx")
   Kind = PPC64_SVR4_ABIInfo::ELFv1;
 bool HasQPX = getTarget().getABI() == "elfv1-qpx";
+bool IsSoftFloat = CodeGenOpts.FloatABI == "soft";
 
-return SetCGInfo(new PPC64_SVR4_TargetCodeGenInfo(Types, Kind, HasQPX));
+return SetCGInfo(new PPC64_SVR4_TargetCodeGenInfo(Types, Kind, HasQPX,
+  IsSoftFloat));
   }
 
   case llvm::Triple::nvptx:

Modified: cfe/trunk/lib/Driver/Tools.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Tools.cpp?rev=283061&r1=283060&r2=283061&view=diff
==
--- cfe/trunk/lib/Driver/Tools.cpp (original)
+++ cfe/trunk/lib/Driver/Tools.cpp Sat Oct  1 21:10:45 2016
@@ -1591,15 +1591,8 @@ static void getPPCTargetFeatures(const D
   handleTargetFeaturesGroup(Args, Features, options::OPT_m_ppc_Features_Group);
 
   ppc::FloatABI FloatABI = ppc::getPPCFloatABI(D, Args);
-  if (FloatABI == ppc::FloatABI::Soft &&
-  !(Triple.getArch() == llvm::Triple::ppc64 ||
-Triple.getArch() == llvm::Triple::ppc64le))
-Features.push_back("+soft-float");
-  else if (FloatABI == ppc::FloatABI::Soft &&
-   (Triple.getArch() == llvm::Triple::ppc64 ||
-Triple.getArch() == llvm::Triple::ppc64le))
-D.Diag(diag::err_drv_invalid_mfloat_abi)
-<< "soft float is not supported

[PATCH] D24907: NFC: separate file for fp denormal regression tests

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel accepted this revision.
hfinkel added a reviewer: hfinkel.
hfinkel added a comment.
This revision is now accepted and ready to land.

This LGTM, although I find "fast-math.c" much easier to read than 
"denormalfpmode.c". How about naming it "denormal-fp-math.c" to match the 
option name?


https://reviews.llvm.org/D24907





[PATCH] D24909: fix for not copying fp denormal and trapping options.

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

I'm fine with setting these for consistency, but I don't understand why our 
failure to do this would cause problems. If you look in lib/CodeGen/CGCall.cpp, 
you'll see:

  if (!CodeGenOpts.FPDenormalMode.empty())
FuncAttrs.addAttribute("denormal-fp-math",
   CodeGenOpts.FPDenormalMode);
  
  FuncAttrs.addAttribute("no-trapping-math",
 llvm::toStringRef(CodeGenOpts.NoTrappingMath));
  FuncAttrs.addAttribute("no-infs-fp-math",

and the code in TargetMachine::resetTargetOptions in Target/TargetMachine.cpp 
has this:

  RESET_OPTION(NoTrappingFPMath, "no-trapping-math");
  
  StringRef Denormal =
F.getFnAttribute("denormal-fp-math").getValueAsString();
  if (Denormal == "ieee")
Options.FPDenormalType = FPDenormal::IEEE;
  else if (Denormal == "preserve-sign")
Options.FPDenormalType = FPDenormal::PreserveSign;
  else if (Denormal == "positive-zero")
Options.FPDenormalType = FPDenormal::PositiveZero;

so this should all work regardless.
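
To spell out the round trip, here is a minimal standalone sketch of the string-to-mode mapping that the second snippet performs. The names DenormalMode and parseDenormalFPMath are hypothetical, and returning Unspecified for an unrecognized string is my assumption, not something the quoted code shows.

```cpp
#include <string>

// Hypothetical mirror of the three modes named in the attribute strings.
enum class DenormalMode { Unspecified, IEEE, PreserveSign, PositiveZero };

// Maps the value of a "denormal-fp-math" function attribute to a mode.
// Assumption: any other string (including an empty one) means "unspecified".
inline DenormalMode parseDenormalFPMath(const std::string &attr) {
  if (attr == "ieee")
    return DenormalMode::IEEE;
  if (attr == "preserve-sign")
    return DenormalMode::PreserveSign;
  if (attr == "positive-zero")
    return DenormalMode::PositiveZero;
  return DenormalMode::Unspecified;
}
```

The point is that the frontend writes the mode as a per-function string attribute and the backend reads it back per function, so the setting survives without being copied into the global options.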

Also, please post your patches with full context. Please see 
http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface 
for instructions.


https://reviews.llvm.org/D24909





[PATCH] D24909: fix for not copying fp denormal and trapping options.

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D24909#559110, @SjoerdMeijer wrote:

> Hi Hal, 
>  Thanks for reviewing and you're right: this should work. We actually have 
> some downstream (aarch64) build attribute selection code that would 
> work better with this change. Are you okay with committing this change?
>  Cheers,
>  Sjoerd.


Can you explain what you mean? What does this code do?

My general impression is that we're planning to rip out this code entirely and 
rely only on setting the function attributes. @echristo, am I right about 
this?


https://reviews.llvm.org/D24909





Re: r283141 - [analyzer] A blind attempt to fix a buildbot after r283092.

2016-10-03 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Artem Dergachev via cfe-commits" 
> To: cfe-commits@lists.llvm.org
> Sent: Monday, October 3, 2016 3:12:13 PM
> Subject: r283141 - [analyzer] A blind attempt to fix a buildbot after r283092.
> 
> Author: dergachev
> Date: Mon Oct  3 15:12:12 2016
> New Revision: 283141
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=283141&view=rev
> Log:
> [analyzer] A blind attempt to fix a buildbot after r283092.
> 
> The msvc compiler seems to crash compiling the BugReport class.

When you commit a work-around like this, please add a comment explaining what's 
going on. In this case, that we're using std::vector here instead of 
SmallVector because using SmallVector causes an ICE in MSVC version whatever 
(at optimization level whatever).
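
Something along these lines is what I have in mind, as a sketch only. Note is a hypothetical stand-in for the real element type, and the comment wording is just a suggestion; the actual compiler version and optimization level should be filled in once known.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for the element type held in the list.
struct Note {
  int id;
};

// Workaround: std::vector is used here instead of SmallVector because the
// SmallVector form appears to crash MSVC when compiling this class.
// The affected MSVC version and optimization level are not yet identified.
// Revisit once the affected compiler is no longer supported.
typedef std::vector<std::unique_ptr<Note>> NoteList;
```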

 -Hal

> 
> Modified:
> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> 
> Modified:
> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h?rev=283141&r1=283140&r2=283141&view=diff
> ==
> ---
> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> (original)
> +++
> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> Mon Oct  3 15:12:12 2016
> @@ -66,7 +66,7 @@ public:
>typedef SmallVector, 8>
>VisitorList;
>typedef VisitorList::iterator visitor_iterator;
>typedef SmallVector ExtraTextList;
> -  typedef
> SmallVector, 4>
> +  typedef
> std::vector>
>NoteList;
>  
>  protected:
> 
> 
> 



Re: r283141 - [analyzer] A blind attempt to fix a buildbot after r283092.

2016-10-03 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Artem Dergachev" 
> To: "Hal Finkel" , "Artem Dergachev" 
> 
> Cc: cfe-commits@lists.llvm.org
> Sent: Monday, October 3, 2016 3:40:02 PM
> Subject: Re: r283141 - [analyzer] A blind attempt to fix a buildbot after 
> r283092.
> 
> 03/10/2016 23:29, Hal Finkel wrote:
> > - Original Message -
> >> From: "Artem Dergachev via cfe-commits"
> >> 
> >> To: cfe-commits@lists.llvm.org
> >> Sent: Monday, October 3, 2016 3:12:13 PM
> >> Subject: r283141 - [analyzer] A blind attempt to fix a buildbot
> >> after r283092.
> >>
> >> Author: dergachev
> >> Date: Mon Oct  3 15:12:12 2016
> >> New Revision: 283141
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=283141&view=rev
> >> Log:
> >> [analyzer] A blind attempt to fix a buildbot after r283092.
> >>
> >> The msvc compiler seems to crash compiling the BugReport class.
> > When you commit a work-around like this, please add a comment
> > explaining what's going on. In this case, that we're using
> > std::vector here instead of SmallVector because using SmallVector
> > causes an ICE in MSVC version whatever (at optimization level
> > whatever).
> >
> >   -Hal
> 
> Yep, sorry, will add a comment if this actually helps; thanks for
> clarifying, I hesitated.

Thanks! It helps because, at some point, we'd like to get rid of these things :-)

 -Hal
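
For illustration only, a workaround comment of the kind requested above might look roughly like the sketch below. NotePiece and addNote are hypothetical stand-ins, not the analyzer's real types, and the MSVC version and optimization level are left open because this thread does not state them.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the analyzer's note type; not the real class.
struct NotePiece {
  int Id;
};

// WORKAROUND: std::vector is used here instead of a small-size-optimized
// vector because the latter reportedly crashes an MSVC build (an ICE).
// The affected compiler version and optimization level are not known from
// this thread; record them here once confirmed, and revert this typedef
// when that compiler is no longer supported.
typedef std::vector<std::unique_ptr<NotePiece>> NoteList;

// Small helper showing that the workaround type still behaves as a list.
inline std::size_t addNote(NoteList &Notes, int Id) {
  Notes.push_back(std::unique_ptr<NotePiece>(new NotePiece{Id}));
  return Notes.size();
}
```

The point of the comment is that a later reader can tell when the workaround is safe to remove.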

> 
> >> Modified:
> >>  cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> >>
> >> Modified:
> >> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> >> URL:
> >> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h?rev=283141&r1=283140&r2=283141&view=diff
> >> ==
> >> ---
> >> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> >> (original)
> >> +++
> >> cfe/trunk/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h
> >> Mon Oct  3 15:12:12 2016
> >> @@ -66,7 +66,7 @@ public:
> >> typedef SmallVector, 8>
> >> VisitorList;
> >> typedef VisitorList::iterator visitor_iterator;
> >> typedef SmallVector ExtraTextList;
> >> -  typedef
> >> SmallVector, 4>
> >> +  typedef
> >> std::vector>
> >> NoteList;
> >>   
> >>   protected:
> >>
> >>
> >> ___
> >> cfe-commits mailing list
> >> cfe-commits@lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> >>
> 
> 



[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


> rjmccall wrote in CGStmt.cpp:525
> It's much more likely that NoAliasScopes will be empty than that MemoryInsts 
> will be empty.  You should probably fast-path using that, or better yet, with 
> the RecordMemoryInsts bit.

I'm not sure that's true; we only record memory-accessing instructions in the 
first place if there are relevant restrict-qualified pointers around.

> rjmccall wrote in CodeGenFunction.cpp:1900
> Is it intentional that this includes calls and invokes?  If so, please leave 
> a comment describing which instructions we want to apply this to and why.
> 
> In general, this entire patch is really light on comments.

> In general, this entire patch is really light on comments.

Agreed; adding more comments...

> rjmccall wrote in CodeGenFunction.h:541
> Why is this defaultable?

It seems like a reasonable default (i.e. we default to not recording memory 
instructions); in the current patch, this is used by the `LexicalNoAliasInfo 
FnNoAliasInfo;` member of CodeGenFunction.

https://reviews.llvm.org/D9403





[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel updated this revision to Diff 73344.
hfinkel added a comment.

Rebased; added more comments and addressed other review feedback.


https://reviews.llvm.org/D9403

Files:
  lib/CodeGen/CGDecl.cpp
  lib/CodeGen/CGExpr.cpp
  lib/CodeGen/CGStmt.cpp
  lib/CodeGen/CodeGenFunction.cpp
  lib/CodeGen/CodeGenFunction.h
  test/CodeGen/noalias.c
  test/OpenMP/taskloop_firstprivate_codegen.cpp
  test/OpenMP/taskloop_lastprivate_codegen.cpp
  test/OpenMP/taskloop_private_codegen.cpp
  test/OpenMP/taskloop_simd_firstprivate_codegen.cpp
  test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
  test/OpenMP/taskloop_simd_private_codegen.cpp

Index: test/OpenMP/taskloop_simd_private_codegen.cpp
===
--- test/OpenMP/taskloop_simd_private_codegen.cpp
+++ test/OpenMP/taskloop_simd_private_codegen.cpp
@@ -223,7 +223,8 @@
 // CHECK: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_DOUBLE_TY]]]*,
 // CHECK: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK: [[PRIV_SIVAR_ADDR:%.+]] = alloca i32*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]], i32** [[PRIV_T_VAR_ADDR]], [2 x [[S_DOUBLE_TY]]]** [[PRIV_S_ARR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], i32** [[PRIV_SIVAR_ADDR]])
 // CHECK: [[PRIV_VAR:%.+]] = load [[S_DOUBLE_TY]]*, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]],
@@ -351,7 +352,8 @@
 // CHECK-DAG: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK-DAG: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_INT_TY]]]*,
 // CHECK-DAG: [[PRIV_VAR_ADDR:%.+]] = alloca [[S_INT_TY]]*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, i32** [[PRIV_T_VAR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], [2 x [[S_INT_TY]]]** [[PRIV_S_ARR_ADDR]], [[S_INT_TY]]** [[PRIV_VAR_ADDR]])
 // CHECK: [[PRIV_T_VAR:%.+]] = load i32*, i32** [[PRIV_T_VAR_ADDR]],
Index: test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
===
--- test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
+++ test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
@@ -258,7 +258,8 @@
 // CHECK: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_DOUBLE_TY]]]*,
 // CHECK: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK: [[PRIV_SIVAR_ADDR:%.+]] = alloca i32*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_MAIN_TY]]*, [[S_DOUBLE_TY]]**, i32**, [2 x [[S_DOUBLE_TY]]]**, [2 x i32]**, i32**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*),
+// CHECK: store void (i8*, ...)* [[PRIV_NAFP]], void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
 // CHECK: [[MAP_FN:%.+]] = load void (i8*, ...)*, void (i8*, ...)** [[MAP_FN_ADDR]],
 
 // CHECK: call void (i8*, ...) [[MAP_FN]](i8* %{{.+}}, [[S_DOUBLE_TY]]** [[PRIV_VAR_ADDR]], i32** [[PRIV_T_VAR_ADDR]], [2 x [[S_DOUBLE_TY]]]** [[PRIV_S_ARR_ADDR]], [2 x i32]** [[PRIV_VEC_ADDR]], i32** [[PRIV_SIVAR_ADDR]])
@@ -426,7 +427,8 @@
 // CHECK-DAG: [[PRIV_VEC_ADDR:%.+]] = alloca [2 x i32]*,
 // CHECK-DAG: [[PRIV_S_ARR_ADDR:%.+]] = alloca [2 x [[S_INT_TY]]]*,
 // CHECK-DAG: [[PRIV_VAR_ADDR:%.+]] = alloca [[S_INT_TY]]*,
-// CHECK: store void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**, [2 x i32]**, [2 x [[S_INT_TY]]]**, [[S_INT_TY]]**)* [[PRIVATES_MAP_FN]] to void (i8*, ...)*), void (i8*, ...)** [[MAP_FN_ADDR:%.+]],
+// CHECK: [[PRIV_NAFP:%.+]] = call void (i8*, ...)* @llvm.noalias.p0f_isVoidp0i8varargf(void (i8*, ...)* bitcast (void ([[PRIVATES_TMAIN_TY]]*, i32**,

[PATCH] D22189: llvm.noalias - Clang CodeGen - check restrict variable map only for restrict-qualified lvalues

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel updated this revision to Diff 73345.
hfinkel added a comment.

Rebased


https://reviews.llvm.org/D22189

Files:
  lib/CodeGen/CGDeclCXX.cpp
  lib/CodeGen/CGExpr.cpp
  lib/CodeGen/CGOpenMPRuntime.cpp
  lib/CodeGen/CGStmtOpenMP.cpp
  lib/CodeGen/CodeGenFunction.h


Index: lib/CodeGen/CodeGenFunction.h
===
--- lib/CodeGen/CodeGenFunction.h
+++ lib/CodeGen/CodeGenFunction.h
@@ -2749,7 +2749,7 @@
   /// care to appropriately convert from the memory representation to
   /// the LLVM value representation.
   void EmitStoreOfScalar(llvm::Value *Value, Address Addr,
- bool Volatile, QualType Ty,
+ bool Volatile, bool Restrict, QualType Ty,
  AlignmentSource AlignSource = AlignmentSource::Type,
  llvm::MDNode *TBAAInfo = nullptr, bool isInit = false,
  QualType TBAABaseTy = QualType(),
Index: lib/CodeGen/CGStmtOpenMP.cpp
===
--- lib/CodeGen/CGStmtOpenMP.cpp
+++ lib/CodeGen/CGStmtOpenMP.cpp
@@ -271,7 +271,7 @@
 Address RefAddr = CreateMemTemp(CurVD->getType(), getPointerAlign(),
 ".materialized_ref");
 EmitStoreOfScalar(LocalAddr.getPointer(), RefAddr, /*Volatile=*/false,
-  CurVD->getType());
+  /*Restrict=*/false, CurVD->getType());
 LocalAddr = RefAddr;
   }
   setAddrOfLocalVar(CurVD, LocalAddr);
Index: lib/CodeGen/CGOpenMPRuntime.cpp
===
--- lib/CodeGen/CGOpenMPRuntime.cpp
+++ lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6762,7 +6762,8 @@
  CounterVal->getType(), Int64Ty,
  CounterVal->getExprLoc());
   Address CntAddr = CGF.CreateMemTemp(Int64Ty, ".cnt.addr");
-  CGF.EmitStoreOfScalar(CntVal, CntAddr, /*Volatile=*/false, Int64Ty);
+  CGF.EmitStoreOfScalar(CntVal, CntAddr, /*Volatile=*/false, /*Restrict=*/false,
+Int64Ty);
   llvm::Value *Args[] = {emitUpdateLocation(CGF, C->getLocStart()),
  getThreadID(CGF, C->getLocStart()),
  CntAddr.getPointer()};
Index: lib/CodeGen/CGExpr.cpp
===
--- lib/CodeGen/CGExpr.cpp
+++ lib/CodeGen/CGExpr.cpp
@@ -1372,9 +1372,9 @@
 }
 
 void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
-bool Volatile, QualType Ty,
-AlignmentSource AlignSource,
-llvm::MDNode *TBAAInfo,
+bool Volatile, bool Restrict,
+QualType Ty, AlignmentSource
+AlignSource, llvm::MDNode *TBAAInfo,
 bool isInit, QualType TBAABaseType,
 uint64_t TBAAOffset,
 bool isNontemporal) {
@@ -1405,9 +1405,11 @@
   // If this is an assignment to a restrict-qualified local variable, then we
   // have pointer aliasing assumptions that can be applied to the pointer value
   // being stored.
-  auto NAI = NoAliasAddrMap.find(Addr.getPointer());
-  if (NAI != NoAliasAddrMap.end())
-Value = Builder.CreateNoAliasPointer(Value, NAI->second);
+  if (Restrict) {
+auto NAI = NoAliasAddrMap.find(Addr.getPointer());
+if (NAI != NoAliasAddrMap.end())
+  Value = Builder.CreateNoAliasPointer(Value, NAI->second);
+  }
 
   LValue AtomicLValue =
   LValue::MakeAddr(Addr, Ty, getContext(), AlignSource, TBAAInfo);
@@ -1436,9 +1438,10 @@
 void CodeGenFunction::EmitStoreOfScalar(llvm::Value *value, LValue lvalue,
 bool isInit) {
   EmitStoreOfScalar(value, lvalue.getAddress(), lvalue.isVolatile(),
-lvalue.getType(), lvalue.getAlignmentSource(),
-lvalue.getTBAAInfo(), isInit, lvalue.getTBAABaseType(),
-lvalue.getTBAAOffset(), lvalue.isNontemporal());
+lvalue.isRestrictQualified(), lvalue.getType(),
+lvalue.getAlignmentSource(), lvalue.getTBAAInfo(), isInit,
+lvalue.getTBAABaseType(), lvalue.getTBAAOffset(),
+lvalue.isNontemporal());
 }
 
 /// EmitLoadOfLValue - Given an expression that represents a value lvalue, this
Index: lib/CodeGen/CGDeclCXX.cpp
===
--- lib/CodeGen/CGDeclCXX.cpp
+++ lib/CodeGen/CGDeclCXX.cpp
@@ -188,7 +188,7 @@
   assert(PerformInit && "cannot have constant initializer which needs "
  "destruction for reference");
   RValue RV = Em

[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel created this revision.
hfinkel added reviewers: anemet, rsmith, rjmccall.
hfinkel added a subscriber: cfe-commits.
Herald added a subscriber: mcrosier.

The backend now has the capability to save information from optimizations, the 
same information that can be used to generate optimization diagnostics but in 
machine-consumable form, into an output file. This can be enabled when using 
opt (see r282539), and this patch will enable it when using clang. The idea is 
that other tools will be able to consume these files, and perhaps in 
combination with the original source code, produce various kinds of 
optimization reports for users (and for compiler developers).

This patch proposes the name -fsave-optimization-record (and 
-fsave-optimization-record=filename). Bikeshedding welcome.


https://reviews.llvm.org/D25225

Files:
  include/clang/Driver/CC1Options.td
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.h
  lib/CodeGen/CodeGenAction.cpp
  lib/Driver/Tools.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/CodeGen/opt-record.c
  test/Driver/opt-record.c

Index: test/Driver/opt-record.c
===
--- /dev/null
+++ test/Driver/opt-record.c
@@ -0,0 +1,9 @@
+// RUN: %clang -### -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -S -o FOO -fsave-optimization-record=BAR.txt %s 2>&1 | FileCheck %s -check-prefix=CHECK-EQ
+
+// CHECK: "-cc1"
+// CHECK: "-opt-record-file" "opt-record.yaml"
+
+// CHECK-EQ: "-cc1"
+// CHECK-EQ: "-opt-record-file" "BAR.txt"
+
Index: test/CodeGen/opt-record.c
===
--- /dev/null
+++ test/CodeGen/opt-record.c
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -O3 -triple x86_64-unknown-linux-gnu -target-cpu x86-64 %s -o %t -dwarf-column-info -opt-record-file %t.yaml -emit-obj
+// RUN: cat %t.yaml | FileCheck %s
+// REQUIRES: x86-registered-target
+
+void bar();
+void foo() { bar(); }
+
+void Test(int *res, int *c, int *d, int *p, int n) {
+  int i;
+
+#pragma clang loop vectorize(assume_safety)
+  for (i = 0; i < 1600; i++) {
+res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
+  }
+}
+
+// CHECK: --- !Missed
+// CHECK: Pass:inline
+// CHECK: Name:NoDefinition
+// CHECK: Function:foo
+
+// CHECK: --- !Passed
+// CHECK: Pass:loop-vectorize
+// CHECK: Name:Vectorized
+// CHECK: Function:Test
+
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -823,6 +823,10 @@
   Opts.LinkerOptions = Args.getAllArgValues(OPT_linker_option);
   bool NeedLocTracking = false;
 
+  Opts.OptRecordFile = Args.getLastArgValue(OPT_opt_record_file);
+  if (!Opts.OptRecordFile.empty())
+NeedLocTracking = true;
+
   if (Arg *A = Args.getLastArg(OPT_Rpass_EQ)) {
 Opts.OptimizationRemarkPattern =
 GenerateOptimizationRemarkRegex(Diags, Args, A);
Index: lib/Driver/Tools.cpp
===
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -3336,23 +3336,29 @@
   }
 }
 
-static const char *SplitDebugName(const ArgList &Args, const InputInfo &Input) {
+static const char *getAltExtOutputName(const ArgList &Args,
+   const InputInfo &Input,
+   const char *Ext) {
   Arg *FinalOutput = Args.getLastArg(options::OPT_o);
   if (FinalOutput && Args.hasArg(options::OPT_c)) {
 SmallString<128> T(FinalOutput->getValue());
-llvm::sys::path::replace_extension(T, "dwo");
+llvm::sys::path::replace_extension(T, Ext);
 return Args.MakeArgString(T);
   } else {
 // Use the compilation dir.
 SmallString<128> T(
 Args.getLastArgValue(options::OPT_fdebug_compilation_dir));
 SmallString<128> F(llvm::sys::path::stem(Input.getBaseInput()));
-llvm::sys::path::replace_extension(F, "dwo");
+llvm::sys::path::replace_extension(F, Ext);
 T += F;
 return Args.MakeArgString(F);
   }
 }
 
+static const char *SplitDebugName(const ArgList &Args, const InputInfo &Input) {
+  return getAltExtOutputName(Args, Input, "dwo");
+}
+
 static void SplitDebugInfo(const ToolChain &TC, Compilation &C, const Tool &T,
const JobAction &JA, const ArgList &Args,
const InputInfo &Output, const char *OutFile) {
@@ -3377,6 +3383,10 @@
   C.addCommand(llvm::make_unique<Command>(JA, T, Exec, StripArgs, II));
 }
 
+static const char *getOptRecordName(const ArgList &Args, const InputInfo &Input) {
+  return getAltExtOutputName(Args, Input, "yaml");
+}
+
 /// \brief Vectorize at all optimization levels greater than 1 except for -Oz.
 /// For -Oz the loop vectorizer is disable, while the slp vectorizer is enabled.
 static bool shouldEnableVectorizerAtOLevel(const ArgList 

[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-03 Thread Hal Finkel via cfe-commits
hfinkel added a dependency: D25224: Don't filter diagnostics written as YAML to 
the output file.
hfinkel added a comment.

Note: This depends on https://reviews.llvm.org/D25224.


https://reviews.llvm.org/D25225





[PATCH] D19678: Annotated-source optimization reports (a.k.a. "listing" files)

2016-10-04 Thread Hal Finkel via cfe-commits
hfinkel abandoned this revision.
hfinkel added a comment.

Abandoned in favor of https://reviews.llvm.org/D25225 and 
https://reviews.llvm.org/D25262.


https://reviews.llvm.org/D19678





[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-05 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

@rsmith  @rjmccall - I chatted with @anemet about this on IRC, and he's happy 
with it. Please look this over, in part to make sure you're happy with the 
option name.

On the name, two of my thoughts behind using -fsave-optimization-record were: 
1) I did not want to call it a "report", because it is YAML output and not 
something for a human to use directly and 2) I thought that record, the noun, 
fit well, but not necessarily the verb, and by putting 'save' in the name it 
seems clear (at least to me) that record is the noun.


https://reviews.llvm.org/D25225





[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-06 Thread Hal Finkel via cfe-commits
hfinkel updated this revision to Diff 73874.
hfinkel added a comment.
Herald added a subscriber: fhahn.

I reworked the way that the automatic file-name selection works so that it will 
work with offloading (e.g. CUDA), and added more tests for the automatic 
file-name selection. It now also bases the file name on the output file name if 
you use -S (not just -c).  When using an offload device, it makes the output 
file name for the device different from the host in a way that mirrors other 
parts of the driver.

I also changed the default extension from .yaml to .opt.yaml to make it a 
little more descriptive (there are obviously lots of different kinds of yaml 
files).
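
As a rough sketch of the file-name selection described above (not the code in the patch): the helper names, the use of plain std::string, and the way the device suffix is passed in are all assumptions made for illustration. The expected names are taken from the CHECK lines in test/Driver/opt-record.c.

```cpp
#include <string>

// File name without its directory part and without its last extension.
std::string stem(const std::string &Path) {
  std::string::size_type Slash = Path.find_last_of('/');
  std::string Name =
      Slash == std::string::npos ? Path : Path.substr(Slash + 1);
  std::string::size_type Dot = Name.find_last_of('.');
  return Dot == std::string::npos ? Name : Name.substr(0, Dot);
}

// Replace the last extension of the final path component with Ext, or
// append ".Ext" when the component has no extension.
std::string replaceExtension(const std::string &Path, const std::string &Ext) {
  std::string::size_type Slash = Path.find_last_of('/');
  std::string::size_type Dot = Path.find_last_of('.');
  bool HasExt = Dot != std::string::npos &&
                (Slash == std::string::npos || Dot > Slash);
  return (HasExt ? Path.substr(0, Dot) : Path) + "." + Ext;
}

// Hypothetical model of the default record name:
//  - if the user named an output file (with -c or -S), base the name on it;
//  - otherwise use the input's stem, plus a device suffix when compiling
//    for an offload device, so host and device records do not collide.
std::string optRecordName(const std::string &Output, bool UseOutputName,
                          const std::string &Input,
                          const std::string &DeviceSuffix) {
  std::string F;
  if (UseOutputName) {
    F = Output;
  } else {
    F = stem(Input);
    if (!DeviceSuffix.empty())
      F += "-" + DeviceSuffix;
  }
  return replaceExtension(F, "opt.yaml");
}
```

Under these assumptions, an output of FOO gives FOO.opt.yaml, and an input of opt-record.c with no named output gives opt-record.opt.yaml.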


https://reviews.llvm.org/D25225

Files:
  include/clang/Driver/CC1Options.td
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.h
  lib/CodeGen/CodeGenAction.cpp
  lib/Driver/Tools.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/CodeGen/opt-record.c
  test/Driver/opt-record.c

Index: test/Driver/opt-record.c
===
--- /dev/null
+++ test/Driver/opt-record.c
@@ -0,0 +1,18 @@
+// RUN: %clang -### -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -S -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV
+// RUN: %clang -### -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV
+// RUN: %clang -### -S -o FOO -fsave-optimization-record=BAR.txt %s 2>&1 | FileCheck %s -check-prefix=CHECK-EQ
+
+// CHECK: "-cc1"
+// CHECK: "-opt-record-file" "FOO.opt.yaml"
+
+// CHECK-NO-O: "-cc1"
+// CHECK-NO-O-DAG: "-opt-record-file" "opt-record.opt.yaml"
+// CHECK-CUDA-DEV-DAG: "-opt-record-file" "opt-record-device-cuda-nvptx64-nvidia-cuda-sm_20.opt.yaml"
+
+// CHECK-EQ: "-cc1"
+// CHECK-EQ: "-opt-record-file" "BAR.txt"
+
Index: test/CodeGen/opt-record.c
===
--- /dev/null
+++ test/CodeGen/opt-record.c
@@ -0,0 +1,26 @@
+// RUN: %clang_cc1 -O3 -triple x86_64-unknown-linux-gnu -target-cpu x86-64 %s -o %t -dwarf-column-info -opt-record-file %t.yaml -emit-obj
+// RUN: cat %t.yaml | FileCheck %s
+// REQUIRES: x86-registered-target
+
+void bar();
+void foo() { bar(); }
+
+void Test(int *res, int *c, int *d, int *p, int n) {
+  int i;
+
+#pragma clang loop vectorize(assume_safety)
+  for (i = 0; i < 1600; i++) {
+res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
+  }
+}
+
+// CHECK: --- !Missed
+// CHECK: Pass:inline
+// CHECK: Name:NoDefinition
+// CHECK: Function:foo
+
+// CHECK: --- !Passed
+// CHECK: Pass:loop-vectorize
+// CHECK: Name:Vectorized
+// CHECK: Function:Test
+
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -823,6 +823,10 @@
   Opts.LinkerOptions = Args.getAllArgValues(OPT_linker_option);
   bool NeedLocTracking = false;
 
+  Opts.OptRecordFile = Args.getLastArgValue(OPT_opt_record_file);
+  if (!Opts.OptRecordFile.empty())
+NeedLocTracking = true;
+
   if (Arg *A = Args.getLastArg(OPT_Rpass_EQ)) {
 Opts.OptimizationRemarkPattern =
 GenerateOptimizationRemarkRegex(Diags, Args, A);
Index: lib/Driver/Tools.cpp
===
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -6068,6 +6068,40 @@
 CmdArgs.push_back("-fno-math-builtin");
   }
 
+  if (Args.hasFlag(options::OPT_fsave_optimization_record,
+   options::OPT_fsave_optimization_record_EQ,
+   options::OPT_fno_save_optimization_record, false)) {
+CmdArgs.push_back("-opt-record-file");
+
+const Arg *A = Args.getLastArg(options::OPT_fsave_optimization_record_EQ);
+if (A) {
+  CmdArgs.push_back(A->getValue());
+} else {
+  SmallString<128> F;
+  if (Output.isFilename() && (Args.hasArg(options::OPT_c) ||
+  Args.hasArg(options::OPT_S))) {
+F = Output.getFilename();
+  } else {
+// Use the compilation directory.
+F = llvm::sys::path::stem(Input.getBaseInput());
+
+// If we're compiling for an offload architecture (i.e. a CUDA device),
+// we need to make the file name for the device compilation different
+// from the host compilation.
+if (!JA.isDeviceOffloading(Action::OFK_None) &&
+!JA.isDeviceOffloading(Action::OFK_Host)) {
+  llvm::sys::path::re

[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-07 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.



Comment at: lib/CodeGen/CodeGenAction.cpp:198
+
+Ctx.setDiagnosticsOutputFile(new yaml::Output(OptRecordFile->os()));
+  }

anemet wrote:
> Sorry, one more thing: if PGO is available, I think we want to set 
> Ctx.setDiagnosticHotnessRequested as well.  Without that, you'd have to pass 
> -fsave-optimization-record and -fdiagnostics-show-hotness to get hotness info 
> into the YAML file which feels strange.  I am certainly fine if we do this 
> later but I wanted to bring it up since it's seems related.
I agree. We shouldn't require -fdiagnostics-show-hotness for that to work.



Comment at: test/CodeGen/opt-record.c:17-25
+// CHECK: --- !Missed
+// CHECK: Pass:inline
+// CHECK: Name:NoDefinition
+// CHECK: Function:foo
+
+// CHECK: --- !Passed
+// CHECK: Pass:loop-vectorize

anemet wrote:
> Wouldn't this be a good place to also check that we have -gline-tables-only 
> properly hooked up, i.e. CHECK for DebugLoc: as well?
Yes; will do.


https://reviews.llvm.org/D25225





[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-07 Thread Hal Finkel via cfe-commits
hfinkel updated this revision to Diff 74001.
hfinkel added a comment.
Herald added a subscriber: mehdi_amini.

Addressed review comments (DebugLoc is tested, and we enable hotness 
computation when saving the optimization record and also using PGO).


https://reviews.llvm.org/D25225

Files:
  include/clang/Driver/CC1Options.td
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.h
  lib/CodeGen/CodeGenAction.cpp
  lib/Driver/Tools.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/CodeGen/Inputs/opt-record.proftext
  test/CodeGen/opt-record.c
  test/Driver/opt-record.c

Index: test/Driver/opt-record.c
===
--- /dev/null
+++ test/Driver/opt-record.c
@@ -0,0 +1,18 @@
+// RUN: %clang -### -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -S -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV
+// RUN: %clang -### -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV
+// RUN: %clang -### -S -o FOO -fsave-optimization-record=BAR.txt %s 2>&1 | FileCheck %s -check-prefix=CHECK-EQ
+
+// CHECK: "-cc1"
+// CHECK: "-opt-record-file" "FOO.opt.yaml"
+
+// CHECK-NO-O: "-cc1"
+// CHECK-NO-O-DAG: "-opt-record-file" "opt-record.opt.yaml"
+// CHECK-CUDA-DEV-DAG: "-opt-record-file" "opt-record-device-cuda-nvptx64-nvidia-cuda-sm_20.opt.yaml"
+
+// CHECK-EQ: "-cc1"
+// CHECK-EQ: "-opt-record-file" "BAR.txt"
+
Index: test/CodeGen/opt-record.c
===
--- /dev/null
+++ test/CodeGen/opt-record.c
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -O3 -triple x86_64-unknown-linux-gnu -target-cpu x86-64 %s -o %t -dwarf-column-info -opt-record-file %t.yaml -emit-obj
+// RUN: cat %t.yaml | FileCheck %s
+// RUN: llvm-profdata merge %S/Inputs/opt-record.proftext -o %t.profdata
+// RUN: %clang_cc1 -O3 -triple x86_64-unknown-linux-gnu -target-cpu x86-64 -fprofile-instrument-use-path=%t.profdata %s -o %t -dwarf-column-info -opt-record-file %t.yaml -emit-obj
+// RUN: cat %t.yaml | FileCheck -check-prefix=CHECK -check-prefix=CHECK-PGO %s
+// REQUIRES: x86-registered-target
+
+void bar();
+void foo() { bar(); }
+
+void Test(int *res, int *c, int *d, int *p, int n) {
+  int i;
+
+#pragma clang loop vectorize(assume_safety)
+  for (i = 0; i < 1600; i++) {
+res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
+  }
+}
+
+// CHECK: --- !Missed
+// CHECK: Pass:inline
+// CHECK: Name:NoDefinition
+// CHECK: DebugLoc:
+// CHECK: Function:foo
+// CHECK-PGO: Hotness:
+
+// CHECK: --- !Passed
+// CHECK: Pass:loop-vectorize
+// CHECK: Name:Vectorized
+// CHECK: DebugLoc:
+// CHECK: Function:Test
+// CHECK-PGO: Hotness:
+
Index: test/CodeGen/Inputs/opt-record.proftext
===
--- /dev/null
+++ test/CodeGen/Inputs/opt-record.proftext
@@ -0,0 +1,26 @@
+foo
+# Func Hash:
+0
+# Num Counters:
+1
+# Counter Values:
+30
+
+bar
+# Func Hash:
+0
+# Num Counters:
+1
+# Counter Values:
+30
+
+Test
+# Func Hash:
+269
+# Num Counters:
+3
+# Counter Values:
+1
+30
+15
+
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -823,6 +823,10 @@
   Opts.LinkerOptions = Args.getAllArgValues(OPT_linker_option);
   bool NeedLocTracking = false;
 
+  Opts.OptRecordFile = Args.getLastArgValue(OPT_opt_record_file);
+  if (!Opts.OptRecordFile.empty())
+NeedLocTracking = true;
+
   if (Arg *A = Args.getLastArg(OPT_Rpass_EQ)) {
 Opts.OptimizationRemarkPattern =
 GenerateOptimizationRemarkRegex(Diags, Args, A);
Index: lib/Driver/Tools.cpp
===
--- lib/Driver/Tools.cpp
+++ lib/Driver/Tools.cpp
@@ -6068,6 +6068,40 @@
 CmdArgs.push_back("-fno-math-builtin");
   }
 
+  if (Args.hasFlag(options::OPT_fsave_optimization_record,
+   options::OPT_fsave_optimization_record_EQ,
+   options::OPT_fno_save_optimization_record, false)) {
+CmdArgs.push_back("-opt-record-file");
+
+const Arg *A = Args.getLastArg(options::OPT_fsave_optimization_record_EQ);
+if (A) {
+  CmdArgs.push_back(A->getValue());
+} else {
+  SmallString<128> F;
+  if (Output.isFilename() && (Args.hasArg(options::OPT_c) ||
+  Args.hasArg(options::OPT_S))) {
+F = Output.getFilename();
+  } else {
+   

[PATCH] D25387: When optimizing for size, enable loop rerolling by default.

2016-10-07 Thread Hal Finkel via cfe-commits
hfinkel created this revision.
hfinkel added a reviewer: jmolloy.
hfinkel added a subscriber: cfe-commits.
Herald added a subscriber: mcrosier.

We have a loop-rerolling optimization which can be enabled by using 
-freroll-loops. While sometimes loops are hand-unrolled for performance 
reasons, when optimizing for size, we should always undo this manual 
optimization to produce smaller code (our optimizer's unroller will still 
unroll the rerolled loops if it thinks that is a good idea).
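
To make the motivation concrete, here is a small sketch of a hand-unrolled loop next to its rolled equivalent. The function names are made up for this example, the unrolled version assumes the element count is a multiple of four, and whether the rerolling pass actually recognizes this particular pattern is not something this sketch demonstrates.

```cpp
#include <cstddef>

// Hand-unrolled by four: more code for the same work.
// Assumes N is a multiple of 4.
void scaleUnrolled(float *A, std::size_t N, float K) {
  for (std::size_t I = 0; I + 3 < N; I += 4) {
    A[I] *= K;
    A[I + 1] *= K;
    A[I + 2] *= K;
    A[I + 3] *= K;
  }
}

// The rolled form that a rerolling transformation aims to recover; it is
// smaller, and the optimizer's own unroller may still unroll it later.
void scaleRolled(float *A, std::size_t N, float K) {
  for (std::size_t I = 0; I < N; ++I)
    A[I] *= K;
}
```

Both functions compute the same result when N is a multiple of four; only the code size differs.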


https://reviews.llvm.org/D25387

Files:
  lib/Driver/Tools.cpp
  test/Driver/clang_f_opts.c


Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -47,7 +47,12 @@
 // CHECK-NO-UNROLL-LOOPS: "-fno-unroll-loops"
 
 // RUN: %clang -### -S -freroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
+// RUN: %clang -### -S -Os %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
+// RUN: %clang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
 // RUN: %clang -### -S -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -Os -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -Oz -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
 // RUN: %clang -### -S -fno-reroll-loops -freroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
 // RUN: %clang -### -S -freroll-loops -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
 // CHECK-REROLL-LOOPS: "-freroll-loops"
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: r283680 - [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Hal Finkel via cfe-commits
Hi Justin,

This is neat!

I see a bunch of uses of std::isinf, etc. here. It tends to be important that, 
when using -ffast-math (or -ffinite-math-only), these checks get optimized away. 
Can you please check that they do? If not, you might mirror what I've done in 
r283051 for libc++, which is similar to what libstdc++ ends up doing, so that 
we use __builtin_isnan/isinf/isfinite.

Thanks again,
Hal
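
As a rough illustration of the suggestion above (a sketch only, not the code from r283051 and not the CUDA wrapper code), internal checks can be routed through the compiler builtins. The helper names detail_isnan, detail_isinf, detail_isfinite and guarded_scale are hypothetical, and the builtins are GCC/Clang extensions. Whether the checks are really folded away under -ffast-math or -ffinite-math-only depends on the compiler and should be confirmed by inspecting the generated code.

```cpp
#include <cassert>
#include <limits>

// Hypothetical internal helpers (illustrative names, not libc++ API).
// They forward to GCC/Clang builtins, which the optimizer may be able to
// fold to constants when finite-math assumptions are enabled.
inline bool detail_isnan(double x) { return __builtin_isnan(x) != 0; }
inline bool detail_isinf(double x) { return __builtin_isinf(x) != 0; }
inline bool detail_isfinite(double x) { return __builtin_isfinite(x) != 0; }

// Toy library routine with a special-case path guarded by the helpers.
// Under finite-math assumptions the guard is a candidate for removal.
inline double guarded_scale(double x, double factor) {
  if (!detail_isfinite(x))
    return x;  // propagate inf/NaN unchanged
  return x * factor;
}
```

Compiled without any fast-math flags, guarded_scale(2.0, 3.0) takes the ordinary multiplication path, and an infinite input is returned unchanged.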

- Original Message -
> From: "Justin Lebar via cfe-commits" 
> To: cfe-commits@lists.llvm.org
> Sent: Saturday, October 8, 2016 5:16:13 PM
> Subject: r283680 - [CUDA] Support <complex> and std::min/max on the device.
> 
> Author: jlebar
> Date: Sat Oct  8 17:16:12 2016
> New Revision: 283680
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=283680&view=rev
> Log:
> [CUDA] Support <complex> and std::min/max on the device.
> 
> Summary:
> We do this by wrapping <complex> and <algorithm>.
> 
> Tests are in the test-suite.
> 
> Reviewers: tra
> 
> Subscribers: jhen, beanz, cfe-commits, mgorny
> 
> Differential Revision: https://reviews.llvm.org/D24979
> 
> Added:
> cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
> cfe/trunk/lib/Headers/cuda_wrappers/
> cfe/trunk/lib/Headers/cuda_wrappers/algorithm
> cfe/trunk/lib/Headers/cuda_wrappers/complex
> Modified:
> cfe/trunk/lib/Driver/ToolChains.cpp
> cfe/trunk/lib/Headers/CMakeLists.txt
> cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h
> 
> Modified: cfe/trunk/lib/Driver/ToolChains.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains.cpp?rev=283680&r1=283679&r2=283680&view=diff
> ==
> --- cfe/trunk/lib/Driver/ToolChains.cpp (original)
> +++ cfe/trunk/lib/Driver/ToolChains.cpp Sat Oct  8 17:16:12 2016
> @@ -4694,6 +4694,15 @@ void Linux::AddClangCXXStdlibIncludeArgs
>  
>  void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
> ArgStringList &CC1Args) const {
> +  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
> +// Add cuda_wrappers/* to our system include path.  This lets us
> wrap
> +// standard library headers.
> +SmallString<128> P(getDriver().ResourceDir);
> +llvm::sys::path::append(P, "include");
> +llvm::sys::path::append(P, "cuda_wrappers");
> +addSystemInclude(DriverArgs, CC1Args, P);
> +  }
> +
>if (DriverArgs.hasArg(options::OPT_nocudainc))
>  return;
>  
> 
> Modified: cfe/trunk/lib/Headers/CMakeLists.txt
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=283680&r1=283679&r2=283680&view=diff
> ==
> --- cfe/trunk/lib/Headers/CMakeLists.txt (original)
> +++ cfe/trunk/lib/Headers/CMakeLists.txt Sat Oct  8 17:16:12 2016
> @@ -24,10 +24,13 @@ set(files
>bmiintrin.h
>__clang_cuda_builtin_vars.h
>__clang_cuda_cmath.h
> +  __clang_cuda_complex_builtins.h
>__clang_cuda_intrinsics.h
>__clang_cuda_math_forward_declares.h
>__clang_cuda_runtime_wrapper.h
>cpuid.h
> +  cuda_wrappers/algorithm
> +  cuda_wrappers/complex
>clflushoptintrin.h
>emmintrin.h
>f16cintrin.h
> 
> Added: cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h?rev=283680&view=auto
> ==
> --- cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h (added)
> +++ cfe/trunk/lib/Headers/__clang_cuda_complex_builtins.h Sat Oct  8
> 17:16:12 2016
> @@ -0,0 +1,203 @@
> +/*===-- __clang_cuda_complex_builtins - CUDA impls of runtime
> complex fns ---===
> + *
> + * Permission is hereby granted, free of charge, to any person
> obtaining a copy
> + * of this software and associated documentation files (the
> "Software"), to deal
> + * in the Software without restriction, including without limitation
> the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense,
> and/or sell
> + * copies of the Software, and to permit persons to whom the
> Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be
> included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
> SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS IN
> + * THE SOFTWARE.
> + *
> +
> *===---===
> + */
> +
> +#ifndef __CLANG_CUDA_COM

Re: [libcxx] r283659 - [cmake] Split linked libraries into private & public, for linker script

2016-10-08 Thread Hal Finkel via cfe-commits
Hi Michal,

All of the libc++ and libc++abi regression tests are now failing on my Linux 
build system with this error:

  /usr/bin/ld: cannot find -lcxxabi_shared

My build directory has only these:

lib/libc++.a
lib/libc++abi.so
lib/libc++abi.so.1
lib/libc++abi.so.1.0
lib/libc++experimental.a
lib/libc++.so
lib/libc++.so.1
lib/libc++.so.1.0

 -Hal

- Original Message -
> From: "Michal Gorny via cfe-commits" 
> To: cfe-commits@lists.llvm.org
> Sent: Saturday, October 8, 2016 5:27:46 AM
> Subject: [libcxx] r283659 - [cmake] Split linked libraries into private & 
> public, for linker script
> 
> Author: mgorny
> Date: Sat Oct  8 05:27:45 2016
> New Revision: 283659
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=283659&view=rev
> Log:
> [cmake] Split linked libraries into private & public, for linker
> script
> 
> Introduce LIBCXX_LIBRARIES_PUBLIC in addition to LIBCXX_LIBRARIES
> that
> holds 'public' interface libraries -- that is, libraries that both
> libc++ links to and programs linked against it need to link to.
> 
> Currently this includes the ABI library and optionally -lunwind (when
> LIBCXXABI_USE_LLVM_UNWINDER is on). The libraries are included in the
> linker script, in order to make it possible to link C++ programs
> using
> clang with compiler-rt runtime out-of-the-box.
> 
> Differential Revision: https://reviews.llvm.org/D25008
> 
> Modified:
> libcxx/trunk/CMakeLists.txt
> libcxx/trunk/lib/CMakeLists.txt
> libcxx/trunk/utils/gen_link_script/gen_link_script.py
> 
> Modified: libcxx/trunk/CMakeLists.txt
> URL:
> http://llvm.org/viewvc/llvm-project/libcxx/trunk/CMakeLists.txt?rev=283659&r1=283658&r2=283659&view=diff
> ==
> --- libcxx/trunk/CMakeLists.txt (original)
> +++ libcxx/trunk/CMakeLists.txt Sat Oct  8 05:27:45 2016
> @@ -270,9 +270,13 @@ set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${LIB
>  # LIBCXX_CXX_FLAGS: General flags for both the compiler and linker.
>  # LIBCXX_COMPILE_FLAGS: Compile only flags.
>  # LIBCXX_LINK_FLAGS: Linker only flags.
> +# LIBCXX_LIBRARIES: Private libraries libc++ is linked to.
> +# LIBCXX_LIBRARIES_PUBLIC: Public libraries libc++ is linked to,
> +#  also exposed in the linker script.
>  set(LIBCXX_COMPILE_FLAGS "")
>  set(LIBCXX_LINK_FLAGS "")
>  set(LIBCXX_LIBRARIES "")
> +set(LIBCXX_LIBRARIES_PUBLIC "")
>  
>  # Include macros for adding and removing libc++ flags.
>  include(HandleLibcxxFlags)
> 
> Modified: libcxx/trunk/lib/CMakeLists.txt
> URL:
> http://llvm.org/viewvc/llvm-project/libcxx/trunk/lib/CMakeLists.txt?rev=283659&r1=283658&r2=283659&view=diff
> ==
> --- libcxx/trunk/lib/CMakeLists.txt (original)
> +++ libcxx/trunk/lib/CMakeLists.txt Sat Oct  8 05:27:45 2016
> @@ -33,9 +33,17 @@ add_link_flags_if(LIBCXX_CXX_ABI_LIBRARY
>  
>  add_library_flags_if(LIBCXX_COVERAGE_LIBRARY
>  "${LIBCXX_COVERAGE_LIBRARY}")
>  
> -add_library_flags_if(LIBCXX_ENABLE_STATIC_ABI_LIBRARY
> "-Wl,--whole-archive" "-Wl,-Bstatic")
> -add_library_flags("${LIBCXX_CXX_ABI_LIBRARY}")
> -add_library_flags_if(LIBCXX_ENABLE_STATIC_ABI_LIBRARY
> "-Wl,-Bdynamic" "-Wl,--no-whole-archive")
> +if (LIBCXX_ENABLE_STATIC_ABI_LIBRARY)
> +  add_library_flags("-Wl,--whole-archive" "-Wl,-Bstatic")
> +  add_library_flags("${LIBCXX_CXX_ABI_LIBRARY}")
> +  add_library_flags("-Wl,-Bdynamic" "-Wl,--no-whole-archive")
> +elseif (APPLE AND (LIBCXX_CXX_ABI_LIBNAME STREQUAL "libcxxabi" OR
> +   LIBCXX_CXX_ABI_LIBNAME STREQUAL "none"))
> +  # Apple re-exports libc++abi in libc++, so don't make it public
> +  add_library_flags("${LIBCXX_CXX_ABI_LIBRARY}")
> +else()
> +  list(APPEND LIBCXX_LIBRARIES_PUBLIC "${LIBCXX_CXX_ABI_LIBRARY}")
> +endif()
>  
>  if (APPLE AND LLVM_USE_SANITIZER)
>if (("${LLVM_USE_SANITIZER}" STREQUAL "Address") OR
> @@ -67,7 +75,7 @@ if (APPLE AND LLVM_USE_SANITIZER)
>endif()
>  endif()
>  
> -# Generate library list.
> +# Generate private library list.
>  add_library_flags_if(LIBCXX_HAS_PTHREAD_LIB pthread)
>  add_library_flags_if(LIBCXX_HAS_C_LIB c)
>  add_library_flags_if(LIBCXX_HAS_M_LIB m)
> @@ -75,6 +83,11 @@ add_library_flags_if(LIBCXX_HAS_RT_LIB r
>  add_library_flags_if(LIBCXX_HAS_GCC_S_LIB gcc_s)
>  add_library_flags_if(LIBCXX_HAVE_CXX_ATOMICS_WITH_LIB atomic)
>  
> +# Add the unwinder library.
> +if (LIBCXXABI_USE_LLVM_UNWINDER)
> +  list(APPEND LIBCXX_LIBRARIES_PUBLIC unwind)
> +endif()
> +
>  # Setup flags.
>  if (NOT WIN32)
>add_flags_if_supported(-fPIC)
> @@ -151,7 +164,9 @@ set(LIBCXX_TARGETS)
>  # Build the shared library.
>  if (LIBCXX_ENABLE_SHARED)
>add_library(cxx_shared SHARED $)
> -  target_link_libraries(cxx_shared ${LIBCXX_LIBRARIES})
> +  target_link_libraries(cxx_shared
> +PRIVATE ${LIBCXX_LIBRARIES}
> +PUBLIC ${LIBCXX_LIBRARIES_PUBLIC})
>set_target_properties(cxx_shared
>  PROPERTIES
>LINK_FLAGS"${LIBCXX_LINK_

Re: r283680 - [CUDA] Support <complex> and std::min/max on the device.

2016-10-08 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Justin Lebar" 
> To: "Hal Finkel" 
> Cc: "Clang Commits" 
> Sent: Saturday, October 8, 2016 6:16:12 PM
> Subject: Re: r283680 - [CUDA] Support <complex> and std::min/max on the 
> device.
> 
> Hal,
> 
> On NVPTX, these functions eventually get resolved to function calls
> in
> libdevice, e.g. __nv_isinff and __nv_isnanf.
> 
> llvm does not do a good job understanding the body of e.g.
> __nvvm_isnanf, because it uses nvptx-specific intrinsic functions,
> notably @llvm.nvvm.fabs.f.  These are opaque to the LLVM optimizer.
> 
> The fix is not as simple as simply changing our implementation of
> e.g.
> std::isnan to call __builtin_isnanf, because we also would want to
> fix
> ::isnanf,

No, if I understand what you're saying, you specifically wouldn't. We had a 
discussion about this on the review thread(s) that led to r283051, and while we 
want to elide the checks inside the mathematical functions, we don't want to 
replace isnan itself with something that will get optimized away. We want to 
keep the ability for the user to explicitly check for NaNs, etc. even if we 
don't want those checks to appear inside of mathematical operations. This is 
important for use cases where, for example, even though the user might want 
fast math, they still need to check their inputs for NaNs.

 -Hal
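
To make the distinction concrete, here is a minimal sketch of the user-side pattern described above: the caller validates inputs explicitly with std::isnan/std::isinf and then calls a math routine that contains no checks of its own. The function names inputs_are_usable and euclidean_norm are hypothetical and exist only for this illustration. How a given compiler treats the explicit checks under -ffast-math is exactly the point under discussion, so this is not a statement about current behavior.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// User-facing validation: an explicit check the user asked for and
// expects to keep working, even in a fast-math build.
inline bool inputs_are_usable(const std::vector<double>& xs) {
  for (double x : xs) {
    if (std::isnan(x) || std::isinf(x))
      return false;
  }
  return true;
}

// Math routine that assumes its inputs were already validated, so it
// carries no NaN/inf checks itself.
inline double euclidean_norm(const std::vector<double>& xs) {
  double sum = 0.0;
  for (double x : xs)
    sum += x * x;
  return std::sqrt(sum);
}
```

A caller would run inputs_are_usable once at the boundary and only then hand the data to euclidean_norm.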

> but we can't override that implementation without some
> major
> surgery on the nvptx headers.
> 
> David Majnemer and I talked about one way to fix this, namely by
> using
> IR intrinsic upgrades to replace the opaque nvptx intrinsics with
> LLVM
> intrinsics.  LLVM would then be able to understand these intrinsics
> and optimize them.  We would reap benefits not just for std::isnan,
> but also e.g. constant-folding calls like std::abs that also
> eventually end up in libnvvm.
> 
> I did the first half of this work, by adding lowerings for the
> various
> LLVM intrinsics to the NVPTX backend [1].  But David is now busy with
> other things and hasn't been able to help with the second half,
> namely
> using IR upgrades to replace the nvptx target-specific intrinsics
> with
> generalized LLVM intrinsics.  Perhaps this is something you'd be able
> to help with?
> 
> In any case, using builtins here without fixing std::isnan and
> ::isnan
> feels to me to be the wrong solution.  It seems to me that we should
> be able to rely on std::isnan and friends being fast, and if they're
> not, we should fix that.  Using builtins here would be "cheating" to
> make our implementation faster than user code.
> 
> I'll note, separately, that on x86, clang does not seem to
> constant-fold std::isinf or __builtin_isinff to false with
> -ffast-math
> -ffinite-math-only.  GCC can do it.  Clang gets std::isnan.
> https://godbolt.org/g/vZB55a
> 
> By the way, the changes you made to libc++ unfortunately break this
> patch with libc++, because e.g. __libcpp_isnan is not a device
> function.  I'll have to think about how to fix that -- I may send you
> a patch.
> 
> Regards,
> -Justin
> 
> [1] https://reviews.llvm.org/D24300
> 

Re: [libcxx] r283659 - [cmake] Split linked libraries into private & public, for linker script

2016-10-08 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Hal Finkel via cfe-commits" 
> To: "Michal Gorny" 
> Cc: cfe-commits@lists.llvm.org
> Sent: Saturday, October 8, 2016 6:37:50 PM
> Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries into private & 
> public, for linker script
> 
> Hi Michal,
> 
> All of the libc++ and libc++abi regression tests are now failing on
> my Linux build system with this error:
> 
>   /usr/bin/ld: cannot find -lcxxabi_shared

Alright, the problem here is that you've assumed, by doing this:

> >${LIBCXX_SOURCE_DIR}/utils/gen_link_script/gen_link_script.py
> >  ARGS
> >"$"
> > -  "${SCRIPT_ABI_LIBNAME}"
> > +  "\"${LIBCXX_LIBRARIES_PUBLIC}\""

that all of the items in the LIBCXX_LIBRARIES_PUBLIC list are actual library 
names. They might not be; some might be CMake target names instead:

elseif ("${LIBCXX_CXX_ABI_LIBNAME}" STREQUAL "libcxxabi")
  if (LIBCXX_CXX_ABI_INTREE)
# Link against just-built "cxxabi" target.
if (LIBCXX_ENABLE_STATIC_ABI_LIBRARY)
set(CXXABI_LIBNAME cxxabi_static)
else()
set(CXXABI_LIBNAME cxxabi_shared)
endif()
set(LIBCXX_LIBCPPABI_VERSION "2" PARENT_SCOPE)
  else()
# Assume c++abi is installed in the system, rely on -lc++abi link flag.
set(CXXABI_LIBNAME "c++abi")
  endif()

So we might have cxxabi_static or cxxabi_shared in the list.

 -Hal

> 

Re: [libcxx] r283659 - [cmake] Split linked libraries into private & public, for linker script

2016-10-08 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Hal Finkel" 
> To: "Hal Finkel" 
> Cc: cfe-commits@lists.llvm.org, "Michal Gorny" 
> Sent: Saturday, October 8, 2016 9:34:58 PM
> Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries into private & 
> public, for linker script
> 
> - Original Message -
> > From: "Hal Finkel via cfe-commits" 
> > To: "Michal Gorny" 
> > Cc: cfe-commits@lists.llvm.org
> > Sent: Saturday, October 8, 2016 6:37:50 PM
> > Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries into
> > private & public, for linker script
> > 
> > Hi Michal,
> > 
> > All of the libc++ and libc++abi regression tests are now failing on
> > my Linux build system with this error:
> > 
> >   /usr/bin/ld: cannot find -lcxxabi_shared
> 
> Alright, the problem here is that you've assumed, by doing this:
> 
> > >${LIBCXX_SOURCE_DIR}/utils/gen_link_script/gen_link_script.py
> > >  ARGS
> > >"$"
> > > -  "${SCRIPT_ABI_LIBNAME}"
> > > +  "\"${LIBCXX_LIBRARIES_PUBLIC}\""

Ah, I see. The code right above this used to handle this situation:

  # Get the name of the ABI library and handle the case where CXXABI_LIBNAME
  # is a target name and not a library. Ex cxxabi_shared.
  set(SCRIPT_ABI_LIBNAME "${LIBCXX_CXX_ABI_LIBRARY}")
  if (SCRIPT_ABI_LIBNAME STREQUAL "cxxabi_shared")
set(SCRIPT_ABI_LIBNAME "c++abi")
  endif()

and now it doesn't because you no longer use SCRIPT_ABI_LIBNAME as the argument 
to gen_link_script.py. Let me see if I can fix this...

 -Hal

> 
> that all of the items in the LIBCXX_LIBRARIES_PUBLIC list are actual
> library names. They might not be, but rather, they might be CMake
> targets instead:
> 
> elseif ("${LIBCXX_CXX_ABI_LIBNAME}" STREQUAL "libcxxabi")
>   if (LIBCXX_CXX_ABI_INTREE)
> # Link against just-built "cxxabi" target.
> if (LIBCXX_ENABLE_STATIC_ABI_LIBRARY)
> set(CXXABI_LIBNAME cxxabi_static)
> else()
> set(CXXABI_LIBNAME cxxabi_shared)
> endif()
> set(LIBCXX_LIBCPPABI_VERSION "2" PARENT_SCOPE)
>   else()
> # Assume c++abi is installed in the system, rely on -lc++abi link
> flag.
> set(CXXABI_LIBNAME "c++abi")
>   endif()
> 
> So we might have cxxabi_static or cxxabi_shared in the list.
> 
>  -Hal
> 
> > 

[libcxx] r283684 - [CMake] Fix in-tree libcxxabi build support after r283659

2016-10-08 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Oct  8 21:49:31 2016
New Revision: 283684

URL: http://llvm.org/viewvc/llvm-project?rev=283684&view=rev
Log:
[CMake] Fix in-tree libcxxabi build support after r283659

r283659 changed the argument to gen_link_script.py from SCRIPT_ABI_LIBNAME to
LIBCXX_LIBRARIES_PUBLIC, assuming that all of the items in the
LIBCXX_LIBRARIES_PUBLIC list were library names. This is not right, however:
for in-tree libcxxabi builds, the list might contain a CMake target name (e.g.
cxxabi_shared) instead. There was special logic to fix up SCRIPT_ABI_LIBNAME
for this situation; change it to apply a similar fixup to
LIBCXX_LIBRARIES_PUBLIC.

Modified:
libcxx/trunk/lib/CMakeLists.txt

Modified: libcxx/trunk/lib/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/libcxx/trunk/lib/CMakeLists.txt?rev=283684&r1=283683&r2=283684&view=diff
==
--- libcxx/trunk/lib/CMakeLists.txt (original)
+++ libcxx/trunk/lib/CMakeLists.txt Sat Oct  8 21:49:31 2016
@@ -244,10 +244,15 @@ endif()
 if (LIBCXX_ENABLE_SHARED AND LIBCXX_ENABLE_ABI_LINKER_SCRIPT)
   # Get the name of the ABI library and handle the case where CXXABI_LIBNAME
   # is a target name and not a library. Ex cxxabi_shared.
-  set(SCRIPT_ABI_LIBNAME "${LIBCXX_CXX_ABI_LIBRARY}")
-  if (SCRIPT_ABI_LIBNAME STREQUAL "cxxabi_shared")
-set(SCRIPT_ABI_LIBNAME "c++abi")
-  endif()
+  set(LIBCXX_LIBRARIES_PUBLIC_NAMES)
+  foreach(lib ${LIBCXX_LIBRARIES_PUBLIC})
+if (lib STREQUAL "cxxabi_shared")
+  list(APPEND LIBCXX_LIBRARIES_PUBLIC_NAMES "c++abi")
+else()
+  list(APPEND LIBCXX_LIBRARIES_PUBLIC_NAMES "${lib}")
+endif()
+  endforeach()
+
   # Generate a linker script inplace of a libc++.so symlink. Rerun this command
   # after cxx builds.
   add_custom_command(TARGET cxx_shared POST_BUILD
@@ -255,7 +260,7 @@ if (LIBCXX_ENABLE_SHARED AND LIBCXX_ENAB
   ${PYTHON_EXECUTABLE} ${LIBCXX_SOURCE_DIR}/utils/gen_link_script/gen_link_script.py
 ARGS
   "$"
-  "\"${LIBCXX_LIBRARIES_PUBLIC}\""
+  "\"${LIBCXX_LIBRARIES_PUBLIC_NAMES}\""
 WORKING_DIRECTORY ${LIBCXX_BUILD_DIR}
   )
 endif()
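
For readers less familiar with CMake list handling, the foreach loop added in this diff can be restated as a small standalone C++ function. This is only a sketch of the same mapping; the function name public_library_names is hypothetical and nothing like it exists in libc++'s build system.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the foreach() in the diff: the in-tree CMake target name
// "cxxabi_shared" is rewritten to the library name "c++abi"; every other
// entry is copied through unchanged.
inline std::vector<std::string>
public_library_names(const std::vector<std::string>& public_libs) {
  std::vector<std::string> names;
  names.reserve(public_libs.size());
  for (const std::string& lib : public_libs) {
    if (lib == "cxxabi_shared")
      names.push_back("c++abi");
    else
      names.push_back(lib);
  }
  return names;
}
```

As in the diff, only cxxabi_shared is special-cased; an entry such as cxxabi_static or unwind passes through as-is.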




Re: [libcxx] r283659 - [cmake] Split linked libraries into private & public, for linker script

2016-10-08 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Hal Finkel" 
> To: "Hal Finkel" 
> Cc: cfe-commits@lists.llvm.org, "Michal Gorny" 
> Sent: Saturday, October 8, 2016 9:43:02 PM
> Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries into private & 
> public, for linker script
> 
> - Original Message -
> > From: "Hal Finkel" 
> > To: "Hal Finkel" 
> > Cc: cfe-commits@lists.llvm.org, "Michal Gorny" 
> > Sent: Saturday, October 8, 2016 9:34:58 PM
> > Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries into
> > private & public, for linker script
> > 
> > - Original Message -
> > > From: "Hal Finkel via cfe-commits" 
> > > To: "Michal Gorny" 
> > > Cc: cfe-commits@lists.llvm.org
> > > Sent: Saturday, October 8, 2016 6:37:50 PM
> > > Subject: Re: [libcxx] r283659 - [cmake] Split linked libraries
> > > into
> > > private & public, for linker script
> > > 
> > > Hi Michal,
> > > 
> > > All of the libc++ and libc++abi regression tests are now failing
> > > on
> > > my Linux build system with this error:
> > > 
> > >   /usr/bin/ld: cannot find -lcxxabi_shared
> > 
> > Alright, the problem here is that you've assumed, by doing this:
> > 
> > > >${LIBCXX_SOURCE_DIR}/utils/gen_link_script/gen_link_script.py
> > > >  ARGS
> > > >"$"
> > > > -  "${SCRIPT_ABI_LIBNAME}"
> > > > +  "\"${LIBCXX_LIBRARIES_PUBLIC}\""
> 
> Ah, I see. The code right above this used to handle this situation:
> 
>   # Get the name of the ABI library and handle the case where
>   CXXABI_LIBNAME
>   # is a target name and not a library. Ex cxxabi_shared.
>   set(SCRIPT_ABI_LIBNAME "${LIBCXX_CXX_ABI_LIBRARY}")
>   if (SCRIPT_ABI_LIBNAME STREQUAL "cxxabi_shared")
> set(SCRIPT_ABI_LIBNAME "c++abi")
>   endif()
> 
> and now it doesn't because you no longer use SCRIPT_ABI_LIBNAME as
> the argument to gen_link_script.py. Let me see if I can fix this...

r283684

 -Hal

> 
>  -Hal
> 
> > 
> > that all of the items in the LIBCXX_LIBRARIES_PUBLIC list are
> > actual
> > library names. They might not be, but rather, they might be CMake
> > targets instead:
> > 
> > elseif ("${LIBCXX_CXX_ABI_LIBNAME}" STREQUAL "libcxxabi")
> >   if (LIBCXX_CXX_ABI_INTREE)
> > # Link against just-built "cxxabi" target.
> > if (LIBCXX_ENABLE_STATIC_ABI_LIBRARY)
> > set(CXXABI_LIBNAME cxxabi_static)
> > else()
> > set(CXXABI_LIBNAME cxxabi_shared)
> > endif()
> > set(LIBCXX_LIBCPPABI_VERSION "2" PARENT_SCOPE)
> >   else()
> > # Assume c++abi is installed in the system, rely on -lc++abi
> > link
> > flag.
> > set(CXXABI_LIBNAME "c++abi")
> >   endif()
> > 
> > So we might have cxxabi_static or cxxabi_shared in the list.
> > 
> >  -Hal
> > 
> > > 

r283685 - When optimizing for size, enable loop rerolling by default

2016-10-08 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Oct  8 22:06:31 2016
New Revision: 283685

URL: http://llvm.org/viewvc/llvm-project?rev=283685&view=rev
Log:
When optimizing for size, enable loop rerolling by default

We have a loop-rerolling optimization which can be enabled by using
-freroll-loops. While sometimes loops are hand-unrolled for performance
reasons, when optimizing for size, we should always undo this manual
optimization to produce smaller code (our optimizer's unroller will still
unroll the rerolled loops if it thinks that is a good idea).
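
As an illustration of the kind of source this targets (a sketch only; the function names and the unroll factor of 4 are made up for the example), the first function below is hand-unrolled and the second is the compact form that rerolling aims to recover. Whether the reroll pass actually transforms a particular loop depends on the pass's own pattern matching, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Hand-unrolled by 4 for speed; assumes n is a multiple of 4.
inline void scale_unrolled(float* dst, const float* src, float k,
                           std::size_t n) {
  for (std::size_t i = 0; i < n; i += 4) {
    dst[i + 0] = src[i + 0] * k;
    dst[i + 1] = src[i + 1] * k;
    dst[i + 2] = src[i + 2] * k;
    dst[i + 3] = src[i + 3] * k;
  }
}

// Equivalent rolled loop: same results, one multiply in the loop body,
// which typically means less code when optimizing for size.
inline void scale_rolled(float* dst, const float* src, float k,
                         std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i] * k;
}
```

Both functions perform the same multiplications in the same order, so for any n that is a multiple of 4 they produce identical output.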

Modified:
cfe/trunk/lib/Driver/Tools.cpp
cfe/trunk/test/Driver/clang_f_opts.c

Modified: cfe/trunk/lib/Driver/Tools.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Tools.cpp?rev=283685&r1=283684&r2=283685&view=diff
==
--- cfe/trunk/lib/Driver/Tools.cpp (original)
+++ cfe/trunk/lib/Driver/Tools.cpp Sat Oct  8 22:06:31 2016
@@ -5227,9 +5227,18 @@ void Clang::ConstructJob(Compilation &C,
   }
 
   if (Arg *A = Args.getLastArg(options::OPT_freroll_loops,
-   options::OPT_fno_reroll_loops))
+   options::OPT_fno_reroll_loops)) {
 if (A->getOption().matches(options::OPT_freroll_loops))
   CmdArgs.push_back("-freroll-loops");
+  } else if (Arg *A = Args.getLastArg(options::OPT_O_Group)) {
+// If rerolling is not explicitly enabled or disabled, then enable when
+// optimizing for size.
+if (A->getOption().matches(options::OPT_O)) {
+  StringRef S(A->getValue());
+  if (S == "s" || S == "z")
+CmdArgs.push_back("-freroll-loops");
+}
+  }
 
   Args.AddLastArg(CmdArgs, options::OPT_fwritable_strings);
   Args.AddLastArg(CmdArgs, options::OPT_funroll_loops,

Modified: cfe/trunk/test/Driver/clang_f_opts.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/clang_f_opts.c?rev=283685&r1=283684&r2=283685&view=diff
==
--- cfe/trunk/test/Driver/clang_f_opts.c (original)
+++ cfe/trunk/test/Driver/clang_f_opts.c Sat Oct  8 22:06:31 2016
@@ -47,7 +47,12 @@
 // CHECK-NO-UNROLL-LOOPS: "-fno-unroll-loops"
 
 // RUN: %clang -### -S -freroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
+// RUN: %clang -### -S -Os %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
+// RUN: %clang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
 // RUN: %clang -### -S -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -Os -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -Oz -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
+// RUN: %clang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
 // RUN: %clang -### -S -fno-reroll-loops -freroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
 // RUN: %clang -### -S -freroll-loops -fno-reroll-loops %s 2>&1 | FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
 // CHECK-REROLL-LOOPS: "-freroll-loops"
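
The precedence rule exercised by the RUN lines above can be sketched outside
the driver as follows. This is a simplified model, not the driver code: the
function name wantsRerollLoops is hypothetical, and arguments are plain
strings rather than clang's option objects.

```cpp
#include <string>
#include <vector>

// Returns true if "-freroll-loops" would be forwarded for the given driver
// arguments, following the rule in the patch: the last explicit
// -freroll-loops / -fno-reroll-loops wins; otherwise the last -O option
// decides, and only -Os and -Oz enable rerolling.
bool wantsRerollLoops(const std::vector<std::string> &args) {
  std::string lastReroll, lastO;
  for (const std::string &a : args) {
    if (a == "-freroll-loops" || a == "-fno-reroll-loops")
      lastReroll = a;
    else if (a.size() >= 2 && a.compare(0, 2, "-O") == 0)
      lastO = a;
  }
  if (!lastReroll.empty())
    return lastReroll == "-freroll-loops";
  return lastO == "-Os" || lastO == "-Oz";
}
```

For example, {"-S", "-Os"} yields true, while {"-S", "-Os",
"-fno-reroll-loops"} and {"-S", "-O1"} both yield false, matching the
CHECK-REROLL-LOOPS and CHECK-NO-REROLL-LOOPS expectations.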


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D25387: When optimizing for size, enable loop rerolling by default.

2016-10-08 Thread Hal Finkel via cfe-commits
hfinkel closed this revision.
hfinkel added a comment.

r283685


https://reviews.llvm.org/D25387





[PATCH] D25308: [Sema] Ignore transparent_union attributes in C++

2016-10-10 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25308#566176, @arphaman wrote:

> The updated patch now makes clang warn every time it encounters this 
> attribute in C++ mode. Would that be the desired behaviour?


As I understand it, transparent_union was designed for use in system headers, 
and these might certainly be included into C++ source files. Does that not work 
correctly, or does the feature just not work correctly when combined with 
C++-specific features (templates, function overloading, etc.)?


Repository:
  rL LLVM

https://reviews.llvm.org/D25308





Re: [PATCH] D25308: [Sema] Ignore transparent_union attributes in C++

2016-10-10 Thread Hal Finkel via cfe-commits
- Original Message -

> From: "Richard Smith" 
> To: reviews+d25308+public+96c9b20dd11b9...@reviews.llvm.org, "Hal
> Finkel" 
> Cc: "Alex L" , "Reid Kleckner" ,
> "Aaron Ballman" , "cfe-commits"
> 
> Sent: Monday, October 10, 2016 2:16:13 PM
> Subject: Re: [PATCH] D25308: [Sema] Ignore transparent_union
> attributes in C++

> On Mon, Oct 10, 2016 at 10:45 AM, Hal Finkel via cfe-commits <
> cfe-commits@lists.llvm.org > wrote:

> > hfinkel added a comment.
> 

> > In https://reviews.llvm.org/D25308#566176 , @arphaman wrote:
> 

> > > The updated patch now makes clang warn every time it encounters
> > > this attribute in C++ mode. Would that be the desired behaviour?
> 

> > As I understand it, transparent_union was designed for use in
> > system
> > headers, and these might certainly be included into C++ source
> > files. Does that not work correctly, or does the feature just not
> > work correctly when combined with C++-specific features (templates,
> > function overloading, etc.)?
> 
> Neither Clang nor GCC supports this attribute in C++ mode in any way,
> as far as I can see. All uses of this attribute within the glibc
> headers are behind #ifndef __cplusplus.
Indeed. GCC also seems to have a regression test verifying that the 
attribute is ignored in C++ mode. 
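
For reference, the guard pattern described above might look like the
following sketch. The union and function names are hypothetical and are not
taken from glibc; the point is only that the attribute is applied when the
header is compiled as C and omitted under C++.

```cpp
// A union of pointer types, as a C system header might declare it.
// The attribute is applied only when compiling as C; the guard mirrors the
// '#ifndef __cplusplus' pattern used in the headers. (Names are hypothetical.)
union int_ptr_arg {
  int *ip;
  const int *cip;
}
#ifndef __cplusplus
__attribute__((transparent_union))
#endif
;

// In C with the attribute, a caller could pass an 'int *' directly.
// In C++ the union is an ordinary type and must be built explicitly.
int read_value(union int_ptr_arg arg) { return *arg.cip; }
```

Compiled as C++, a caller has to construct int_ptr_arg and set a member
before calling read_value, which is the behaviour difference the guard is
there to make explicit.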

Thanks again, 
Hal 

-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 


[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-10 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25225#566854, @rsmith wrote:

> As discussed on IRC, I have a mild concern about using 
> `-fsave-optimization-record` (with no argument) to enable the feature, and 
> `-fsave-optimization-record=X` to enable the feature and specify a filename; 
> in most (but not all) cases, `-option arg` and `-option=arg` mean the same 
> thing. Other than that, this looks good to me.


For the record (pun intended, I suppose), we discussed on IRC adding a separate 
-foptimization-record-file=filename to set the file name.


https://reviews.llvm.org/D25225





r283834 - Add an option to save the backend-produced YAML optimization record to a file

2016-10-10 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Mon Oct 10 19:26:09 2016
New Revision: 283834

URL: http://llvm.org/viewvc/llvm-project?rev=283834&view=rev
Log:
Add an option to save the backend-produced YAML optimization record to a file

The backend now has the capability to save information from optimizations, the
same information that can be used to generate optimization diagnostics but in
machine-consumable form, into an output file. This can be enabled when using
opt (see r282539), and this change enables it when using clang. The idea is
that other tools will be able to consume these files, and perhaps in
combination with the original source code, produce various kinds of
optimization reports for users (and for compiler developers).

We now have at least two tools that can consume these files:
  * tools/llvm-opt-report
  * utils/opt-viewer

Using the flag -fsave-optimization-record will cause the YAML file to be
generated; the file name will be based on the output file name (if we're using
-c or -S and have an output name), or the input file name. When we're using
CUDA, or some other offloading mechanism, separate files are generated for each
backend target. The output file name can be specified by the user using
-foptimization-record-file=filename.
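
The naming rule described above can be sketched as a small standalone
function. This is an illustration under stated assumptions, not the driver
code: optRecordFileName, stemOf and withExtension are hypothetical helpers,
paths are assumed to use '/' separators, and the extra per-target suffix used
for offloading (CUDA) compilations is left out.

```cpp
#include <cstddef>
#include <string>

// Final path component without its extension, e.g. "dir/foo.c" -> "foo".
static std::string stemOf(const std::string &path) {
  std::size_t slash = path.find_last_of('/');
  std::string file =
      (slash == std::string::npos) ? path : path.substr(slash + 1);
  std::size_t dot = file.find_last_of('.');
  return (dot == std::string::npos || dot == 0) ? file : file.substr(0, dot);
}

// Replace the extension of the final path component, or append one if absent.
static std::string withExtension(const std::string &path,
                                 const std::string &ext) {
  std::size_t slash = path.find_last_of('/');
  std::size_t nameStart = (slash == std::string::npos) ? 0 : slash + 1;
  std::size_t dot = path.find_last_of('.');
  bool hasExt = dot != std::string::npos && dot > nameStart;
  return (hasExt ? path.substr(0, dot) : path) + "." + ext;
}

// explicitName: value of -foptimization-record-file=, or empty if not given.
// outputName:   value of -o, or empty; used only with -c or -S.
std::string optRecordFileName(const std::string &explicitName,
                              const std::string &outputName,
                              bool compileOrAsmOnly,
                              const std::string &inputName) {
  if (!explicitName.empty())
    return explicitName;
  if (compileOrAsmOnly && !outputName.empty())
    return withExtension(outputName, "opt.yaml");
  return withExtension(stemOf(inputName), "opt.yaml");
}
```

With these assumptions, an input of opt-record.c and no output name gives
opt-record.opt.yaml, -c -o FOO.o gives FOO.opt.yaml, and an explicit
-foptimization-record-file=BAR.txt is used unchanged.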

Differential Revision: https://reviews.llvm.org/D25225

Added:
cfe/trunk/test/CodeGen/Inputs/opt-record.proftext
cfe/trunk/test/CodeGen/opt-record.c
cfe/trunk/test/Driver/opt-record.c
Modified:
cfe/trunk/include/clang/Driver/CC1Options.td
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/include/clang/Frontend/CodeGenOptions.h
cfe/trunk/lib/CodeGen/CodeGenAction.cpp
cfe/trunk/lib/Driver/Tools.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp

Modified: cfe/trunk/include/clang/Driver/CC1Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/CC1Options.td?rev=283834&r1=283833&r2=283834&view=diff
==
--- cfe/trunk/include/clang/Driver/CC1Options.td (original)
+++ cfe/trunk/include/clang/Driver/CC1Options.td Mon Oct 10 19:26:09 2016
@@ -507,6 +507,9 @@ def arcmt_modify : Flag<["-"], "arcmt-mo
 def arcmt_migrate : Flag<["-"], "arcmt-migrate">,
   HelpText<"Apply modifications and produces temporary files that conform to 
ARC">;
 
+def opt_record_file : Separate<["-"], "opt-record-file">,
+  HelpText<"File name to use for YAML optimization record output">;
+
 def print_stats : Flag<["-"], "print-stats">,
   HelpText<"Print performance metrics and statistics">;
 def stats_file : Joined<["-"], "stats-file=">,

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=283834&r1=283833&r2=283834&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Mon Oct 10 19:26:09 2016
@@ -1192,6 +1192,15 @@ def ftemplate_backtrace_limit_EQ : Joine
Group;
 def foperator_arrow_depth_EQ : Joined<["-"], "foperator-arrow-depth=">,
Group;
+
+def fsave_optimization_record : Flag<["-"], "fsave-optimization-record">,
+  Group, HelpText<"Generate a YAML optimization record file">;
+def fno_save_optimization_record : Flag<["-"], "fno-save-optimization-record">,
+  Group, Flags<[NoArgumentUnused]>;
+def foptimization_record_file_EQ : Joined<["-"], "foptimization-record-file=">,
+  Group,
+  HelpText<"Specify the file name of any generated YAML optimization record">;
+
 def ftest_coverage : Flag<["-"], "ftest-coverage">, Group;
 def fvectorize : Flag<["-"], "fvectorize">, Group,
   HelpText<"Enable the loop vectorization passes">;

Modified: cfe/trunk/include/clang/Frontend/CodeGenOptions.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Frontend/CodeGenOptions.h?rev=283834&r1=283833&r2=283834&view=diff
==
--- cfe/trunk/include/clang/Frontend/CodeGenOptions.h (original)
+++ cfe/trunk/include/clang/Frontend/CodeGenOptions.h Mon Oct 10 19:26:09 2016
@@ -181,6 +181,10 @@ public:
   /// object file.
   std::vector CudaGpuBinaryFileNames;
 
+  /// The name of the file to which the backend should save YAML optimization
+  /// records.
+  std::string OptRecordFile;
+
   /// Regular expression to select optimizations for which we should enable
   /// optimization remarks. Transformation passes whose name matches this
   /// expression (and support this feature), will emit a diagnostic

Modified: cfe/trunk/lib/CodeGen/CodeGenAction.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenAction.cpp?rev=283834&r1=283833&r2=283834&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenAction.cpp (original)
+++ cfe/trunk/lib/Cod

r283839 - Fixup test/Driver/opt-record.c for nvptx pointer size

2016-10-10 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Mon Oct 10 20:05:45 2016
New Revision: 283839

URL: http://llvm.org/viewvc/llvm-project?rev=283839&view=rev
Log:
Fixup test/Driver/opt-record.c for nvptx pointer size

On some systems, it looks like nvptx is used instead of nvptx64.
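
The FileCheck pattern in this change uses a regular-expression alternation
so that either triple is accepted. A rough standalone equivalent, using
std::regex rather than FileCheck and a hypothetical function name, might be:

```cpp
#include <regex>
#include <string>

// True if 'name' is the device record file name for either the 32-bit or the
// 64-bit NVPTX triple, mirroring the {{nvptx64|nvptx}} alternation in the test.
bool matchesDeviceRecordName(const std::string &name) {
  static const std::regex pattern(
      "opt-record-device-cuda-(nvptx64|nvptx)-nvidia-cuda-sm_20\\.opt\\.yaml");
  return std::regex_match(name, pattern);
}
```

Note that std::regex_match requires the whole string to match, whereas
FileCheck looks for the pattern anywhere on a line; the sketch only shows the
alternation itself.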

Modified:
cfe/trunk/test/Driver/opt-record.c

Modified: cfe/trunk/test/Driver/opt-record.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/opt-record.c?rev=283839&r1=283838&r2=283839&view=diff
==
--- cfe/trunk/test/Driver/opt-record.c (original)
+++ cfe/trunk/test/Driver/opt-record.c Mon Oct 10 20:05:45 2016
@@ -11,7 +11,7 @@
 
 // CHECK-NO-O: "-cc1"
 // CHECK-NO-O-DAG: "-opt-record-file" "opt-record.opt.yaml"
-// CHECK-CUDA-DEV-DAG: "-opt-record-file" "opt-record-device-cuda-nvptx64-nvidia-cuda-sm_20.opt.yaml"
+// CHECK-CUDA-DEV-DAG: "-opt-record-file" "opt-record-device-cuda-{{nvptx64|nvptx}}-nvidia-cuda-sm_20.opt.yaml"
 
 // CHECK-EQ: "-cc1"
 // CHECK-EQ: "-opt-record-file" "BAR.txt"




[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-10-10 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25403#565603, @jlebar wrote:

> Although these pass the CUDA test-suite tests (which I haven't yet committed 
> because they're broken without this change), I could use some help running 
> the libcxx tests.
>
> I cannot find any documentation explaining how to run the libcxx tests with 
> just-built clang.  Presumably this is supported and I just cannot figure it 
> out.  Anyway I figured I'd try to make a new objdir and point it to my 
> just-built clang.
>
>   $ mkdir objdir-libcxx
>   $ cd objdir-libcxx
>   $ cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release 
> -DCMAKE_C_COMPILER=$HOME/llvm/release/bin/clang 
> -DCMAKE_CXX_COMPILER=$HOME/llvm/release/bin/clang -DLLVM_ENABLE_ASSERTIONS=On 
> ../llvm
>   
>
> This also doesn't work.  Building one of cmake's atomic tests fails with 
> linker errors.
>
>   Run Build Command:"/usr/local/google/home/jlebar/bin/ninja" "cmTC_0c40a"
>   [1/2] Building CXX object CMakeFiles/cmTC_0c40a.dir/src.cxx.o
>   [2/2] Linking CXX executable cmTC_0c40a
>   FAILED: cmTC_0c40a
>   : && /usr/local/google/home/jlebar/llvm/release/bin/clang   
> -DHAVE_CXX_ATOMICS_WITH_LIB -std=c++11   CMakeFiles/cmTC_0c40a.dir/src.cxx.o  
> -o cmTC_0c40a  -lm -latomic && :
>   CMakeFiles/cmTC_0c40a.dir/src.cxx.o:src.cxx:function 
> __clang_call_terminate: error: undefined reference to '__cxa_begin_catch'
>   CMakeFiles/cmTC_0c40a.dir/src.cxx.o:src.cxx:function 
> __clang_call_terminate: error: undefined reference to 'std::terminate()'
>   CMakeFiles/cmTC_0c40a.dir/src.cxx.o(.eh_frame+0x143): error: undefined 
> reference to '__gxx_personality_v0'
>   clang-4.0: error: linker command failed with exit code 1 (use -v to see 
> invocation)
>   ninja: build stopped: subcommand failed.
>   
>   Source file was:
>   
>   #include <atomic>
>   std::atomic<int> x;
>   int main() {
> return x;
>   }
>   
>
> I presume there is some Right Way to do this, and I'm just not doing it.


The tests should be runnable with lit. I generally just do an in-tree build and 
run make check-libcxx. @EricWF, what's the recommended way of running the 
tests from an out-of-tree build?


https://reviews.llvm.org/D25403





Re: r283680 - [CUDA] Support and std::min/max on the device.

2016-10-10 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Justin Lebar" 
> To: "Hal Finkel" 
> Cc: "Clang Commits" 
> Sent: Saturday, October 8, 2016 10:56:37 PM
> Subject: Re: r283680 - [CUDA] Support  and std::min/max on the 
> device.
> 
> > > The fix is not as simple as simply changing our implementation of
> > e.g.
> > std::isnan to call __builtin_isnanf, because we also would want to
> > fix
> > ::isnanf,
> >
> > No, if I understand what you're saying, you specifically wouldn't.
> 
> I understand how this is feasible on the CPU, because ::isnan is a
> library function that can never be inlined.  But on the GPU, these
> library functions are (at the moment) always declared inline.  That
> seems to complicate this idea.
> 
> Right now ::isnan(x) is going to call __nv_isnan(x), which computes
> abs(x).  If we pass -ffast-math, the compiler will be able to assume
> that abs(x) is not nan.  I guess you're saying that we would need to
> special-case __nv_isnan so that -ffast-math is always off
> (essentially).  But, what if it gets inlined?
> 
> It looks like libstdc++'s std::isnan calls __builtin_isnan (same for
> its std::isinf), and its ::isnan is an alias for std::isnan.  So
> libstdc++'s isnan is going to return false with -ffast-math (or
> anyway
> it will do the same thing as the builtin functions, which aiui is
> what
> you're proposing libc++'s isnan *not* do).

This was not my first choice, but was the direction that Marshall preferred 
based on our conversations up to that point. I had not noticed this aspect of 
libstdc++'s behavior. It is indeed the case that, with libstdc++, std::isnan 
gets optimized away with -ffast-math, but ::isnan does not. That might be 
desirable, or it might just be weird given that I'd expect std::isnan and 
::isnan to essentially do the same thing for POD FP types.

> 
> > This is important for use cases where, for example, even though the
> > user might want fast math, they still need to check their inputs
> > for NaNs.
> 
> Since this isn't going to work with libstdc++, and it relies on not
> doing anything that the compiler might construe as "arithmetic" on
> the
> value, this seems pretty dicey to me.  One could instead compile a
> separate TU without -ffast-math and do all their validation there?
> I'd have a lot more confidence in that working today, continuing to
> work tomorrow, and being portable across compilers and standard
> libraries.

I certainly agree that I have a higher confidence in the multiple TU approach.

> 
> I don't mean to relitigate https://reviews.llvm.org/D18639, but I am
> not convinced that libc++'s isnan should have a path that returns
> true
> with -ffast-math, given that
> 
>  * libstdc++'s isnan will always return false with -ffast-math,
>  * it's at best complicated for us to make this work if you can
>  inline
> the body of isnan (as we can on the GPU),
>  * it's at best complicated for users to write "correct" C++ that
> calls isnan with -ffast-math, especially if they want their code to
> continue to work in the future in the face of changing compilers
> (-ffast-math is not specified anywhere, so who knows what it means),
> and
>  * there's a relatively simple workaround (use a separate TU) that
> sidesteps all these problems.
> 
> I'm not saying we should go in and change libc++'s CPU implementation
> of isnan to call the builtin.  I'll leave that up to people who care
> about CPU code.  But at least on the GPU, it still makes sense to me
> to fix the problem you originally identified by making
> std::/::isnan/isinf always return false/true with -ffast-math.  Which
> I think we should be able to do with the intrinsic upgrade I
> originally suggested.
> 
> On a separate note: Can we make __libcpp_isnan and __libcpp_isinf
> constexpr?  This will make them implicitly host+device functions,
> solving the problem on the GPU.  Otherwise I may have to reimplement
> these functions in a header, and that's lame.  Although I am clearly
> not above that.  :)

I think this makes sense ;) We should check with Eric or Marshall.

 -Hal

> 
> On Sat, Oct 8, 2016 at 6:50 PM, Hal Finkel  wrote:
> > - Original Message -
> >> From: "Justin Lebar" 
> >> To: "Hal Finkel" 
> >> Cc: "Clang Commits" 
> >> Sent: Saturday, October 8, 2016 6:16:12 PM
> >> Subject: Re: r283680 - [CUDA] Support  and std::min/max
> >> on the device.
> >>
> >> Hal,
> >>
> >> On NVPTX, these functions eventually get resolved to function
> >> calls
> >> in
> >> libdevice, e.g. __nv_isinff and __nv_isnanf.
> >>
> >> llvm does not do a good job understanding the body of e.g.
> >> __nvvm_isnanf, because it uses nvptx-specific intrinsic functions,
> >> notably @llvm.nvvm.fabs.f.  These are opaque to the LLVM
> >> optimizer.
> >>
> >> The fix is not as simple as simply changing our implementation of
> >> e.g.
> >> std::isnan to call __builtin_isnanf, because we also would want to
> >> fix
> >> ::isnanf,
> >
> > No, if I understand what you're saying, you specifically wouldn't.
> > We had a discu

[PATCH] D25225: Add an option to save the backend-produced YAML optimization record to a file

2016-10-11 Thread Hal Finkel via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL283834: Add an option to save the backend-produced YAML 
optimization record to a file (authored by hfinkel).

Changed prior to commit:
  https://reviews.llvm.org/D25225?vs=74001&id=74214#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D25225

Files:
  cfe/trunk/include/clang/Driver/CC1Options.td
  cfe/trunk/include/clang/Driver/Options.td
  cfe/trunk/include/clang/Frontend/CodeGenOptions.h
  cfe/trunk/lib/CodeGen/CodeGenAction.cpp
  cfe/trunk/lib/Driver/Tools.cpp
  cfe/trunk/lib/Frontend/CompilerInvocation.cpp
  cfe/trunk/test/CodeGen/Inputs/opt-record.proftext
  cfe/trunk/test/CodeGen/opt-record.c
  cfe/trunk/test/Driver/opt-record.c

Index: cfe/trunk/lib/CodeGen/CodeGenAction.cpp
===
--- cfe/trunk/lib/CodeGen/CodeGenAction.cpp
+++ cfe/trunk/lib/CodeGen/CodeGenAction.cpp
@@ -33,6 +33,8 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/Timer.h"
+#include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/YAMLTraits.h"
 #include 
 using namespace clang;
 using namespace llvm;
@@ -181,6 +183,24 @@
   Ctx.setDiagnosticHandler(DiagnosticHandler, this);
   Ctx.setDiagnosticHotnessRequested(CodeGenOpts.DiagnosticsWithHotness);
 
+  std::unique_ptr OptRecordFile;
+  if (!CodeGenOpts.OptRecordFile.empty()) {
+std::error_code EC;
+OptRecordFile =
+  llvm::make_unique(CodeGenOpts.OptRecordFile,
+EC, sys::fs::F_None);
+if (EC) {
+  Diags.Report(diag::err_cannot_open_file) <<
+CodeGenOpts.OptRecordFile << EC.message();
+  return;
+}
+
+Ctx.setDiagnosticsOutputFile(new yaml::Output(OptRecordFile->os()));
+
+if (CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
+  Ctx.setDiagnosticHotnessRequested(true);
+  }
+
   // Link LinkModule into this module if present, preserving its validity.
   for (auto &I : LinkModules) {
 unsigned LinkFlags = I.first;
@@ -198,6 +218,9 @@
   Ctx.setInlineAsmDiagnosticHandler(OldHandler, OldContext);
 
   Ctx.setDiagnosticHandler(OldDiagnosticHandler, OldDiagnosticContext);
+
+  if (OptRecordFile)
+OptRecordFile->keep();
 }
 
 void HandleTagDeclDefinition(TagDecl *D) override {
Index: cfe/trunk/lib/Driver/Tools.cpp
===
--- cfe/trunk/lib/Driver/Tools.cpp
+++ cfe/trunk/lib/Driver/Tools.cpp
@@ -6080,6 +6080,39 @@
 CmdArgs.push_back("-fno-math-builtin");
   }
 
+  if (Args.hasFlag(options::OPT_fsave_optimization_record,
+   options::OPT_fno_save_optimization_record, false)) {
+CmdArgs.push_back("-opt-record-file");
+
+const Arg *A = Args.getLastArg(options::OPT_foptimization_record_file_EQ);
+if (A) {
+  CmdArgs.push_back(A->getValue());
+} else {
+  SmallString<128> F;
+  if (Output.isFilename() && (Args.hasArg(options::OPT_c) ||
+  Args.hasArg(options::OPT_S))) {
+F = Output.getFilename();
+  } else {
+// Use the compilation directory.
+F = llvm::sys::path::stem(Input.getBaseInput());
+
+// If we're compiling for an offload architecture (i.e. a CUDA device),
+// we need to make the file name for the device compilation different
+// from the host compilation.
+if (!JA.isDeviceOffloading(Action::OFK_None) &&
+!JA.isDeviceOffloading(Action::OFK_Host)) {
+  llvm::sys::path::replace_extension(F, "");
+  F += JA.getOffloadingFileNamePrefix(Triple.normalize());
+  F += "-";
+  F += JA.getOffloadingArch();
+}
+  }
+
+  llvm::sys::path::replace_extension(F, "opt.yaml");
+  CmdArgs.push_back(Args.MakeArgString(F));
+}
+  }
+
 // Default to -fno-builtin-str{cat,cpy} on Darwin for ARM.
 //
 // FIXME: Now that PR4941 has been fixed this can be enabled.
Index: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
===
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp
@@ -826,6 +826,10 @@
   Opts.LinkerOptions = Args.getAllArgValues(OPT_linker_option);
   bool NeedLocTracking = false;
 
+  Opts.OptRecordFile = Args.getLastArgValue(OPT_opt_record_file);
+  if (!Opts.OptRecordFile.empty())
+NeedLocTracking = true;
+
   if (Arg *A = Args.getLastArg(OPT_Rpass_EQ)) {
 Opts.OptimizationRemarkPattern =
 GenerateOptimizationRemarkRegex(Diags, Args, A);
Index: cfe/trunk/include/clang/Driver/Options.td
===
--- cfe/trunk/include/clang/Driver/Options.td
+++ cfe/trunk/include/clang/Driver/Options.td
@@ -1192,6 +1192,15 @@
   

Re: r283685 - When optimizing for size, enable loop rerolling by default

2016-10-11 Thread Hal Finkel via cfe-commits
Hi Chris, 

Thanks! Can you (or someone else) revert this? I won't be able to look at it 
until tonight. 

-Hal 

- Original Message -

> From: "Chris Matthews" 
> To: "Hal Finkel" , cfe-commits@lists.llvm.org
> Sent: Tuesday, October 11, 2016 2:32:33 PM
> Subject: Re: r283685 - When optimizing for size, enable loop
> rerolling by default

> I noticed since this commit there is a test-suite failure:

> http://lab.llvm.org:8080/green/job/perf_darwin_x86_Osflto/64/

> SingleSource.Benchmarks.Adobe-C++.loop_unroll appears to be failing.

> Tailing the output of the program gets:

> …
> test 236 failed
> test 236 failed
> test 236 failed
> test 236 failed
> test 236 failed
> test 236 failed
> test 236 failed

> On October 8, 2016 at 8:15:40 PM, Hal Finkel via cfe-commits (
> cfe-commits@lists.llvm.org ) wrote:
> > Author: hfinkel
> 
> > Date: Sat Oct 8 22:06:31 2016
> 
> > New Revision: 283685
> 

> > URL: http://llvm.org/viewvc/llvm-project?rev=283685&view=rev
> 
> > Log:
> 
> > When optimizing for size, enable loop rerolling by default
> 

> > We have a loop-rerolling optimization which can be enabled by using
> 
> > -freroll-loops. While sometimes loops are hand-unrolled for
> > performance
> 
> > reasons, when optimizing for size, we should always undo this
> > manual
> 
> > optimization to produce smaller code (our optimizer's unroller will
> > still
> 
> > unroll the rerolled loops if it thinks that is a good idea).
> 

> > Modified:
> 
> > cfe/trunk/lib/Driver/Tools.cpp
> 
> > cfe/trunk/test/Driver/clang_f_opts.c
> 

> > Modified: cfe/trunk/lib/Driver/Tools.cpp
> 
> > URL:
> > http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Tools.cpp?rev=283685&r1=283684&r2=283685&view=diff
> 
> > ==
> 
> > --- cfe/trunk/lib/Driver/Tools.cpp (original)
> 
> > +++ cfe/trunk/lib/Driver/Tools.cpp Sat Oct 8 22:06:31 2016
> 
> > @@ -5227,9 +5227,18 @@ void Clang::ConstructJob(Compilation &C,
> 
> > }
> 

> > if (Arg *A = Args.getLastArg(options::OPT_freroll_loops,
> 
> > - options::OPT_fno_reroll_loops))
> 
> > + options::OPT_fno_reroll_loops)) {
> 
> > if (A->getOption().matches(options::OPT_freroll_loops))
> 
> > CmdArgs.push_back("-freroll-loops");
> 
> > + } else if (Arg *A = Args.getLastArg(options::OPT_O_Group)) {
> 
> > + // If rerolling is not explicitly enabled or disabled, then
> > enable
> > when
> 
> > + // optimizing for size.
> 
> > + if (A->getOption().matches(options::OPT_O)) {
> 
> > + StringRef S(A->getValue());
> 
> > + if (S == "s" || S == "z")
> 
> > + CmdArgs.push_back("-freroll-loops");
> 
> > + }
> 
> > + }
> 

> > Args.AddLastArg(CmdArgs, options::OPT_fwritable_strings);
> 
> > Args.AddLastArg(CmdArgs, options::OPT_funroll_loops,
> 

> > Modified: cfe/trunk/test/Driver/clang_f_opts.c
> 
> > URL:
> > http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/clang_f_opts.c?rev=283685&r1=283684&r2=283685&view=diff
> 
> > ==
> 
> > --- cfe/trunk/test/Driver/clang_f_opts.c (original)
> 
> > +++ cfe/trunk/test/Driver/clang_f_opts.c Sat Oct 8 22:06:31 2016
> 
> > @@ -47,7 +47,12 @@
> 
> > // CHECK-NO-UNROLL-LOOPS: "-fno-unroll-loops"
> 

> > // RUN: %clang -### -S -freroll-loops %s 2>&1 | FileCheck
> > -check-prefix=CHECK-REROLL-LOOPS %s
> 
> > +// RUN: %clang -### -S -Os %s 2>&1 | FileCheck
> > -check-prefix=CHECK-REROLL-LOOPS %s
> 
> > +// RUN: %clang -### -S -Oz %s 2>&1 | FileCheck
> > -check-prefix=CHECK-REROLL-LOOPS %s
> 
> > // RUN: %clang -### -S -fno-reroll-loops %s 2>&1 | FileCheck
> > -check-prefix=CHECK-NO-REROLL-LOOPS %s
> 
> > +// RUN: %clang -### -S -Os -fno-reroll-loops %s 2>&1 | FileCheck
> > -check-prefix=CHECK-NO-REROLL-LOOPS %s
> 
> > +// RUN: %clang -### -S -Oz -fno-reroll-loops %s 2>&1 | FileCheck
> > -check-prefix=CHECK-NO-REROLL-LOOPS %s
> 
> > +// RUN: %clang -### -S -O1 %s 2>&1 | FileCheck
> > -check-prefix=CHECK-NO-REROLL-LOOPS %s
> 
> > // RUN: %clang -### -S -fno-reroll-loops -freroll-loops %s 2>&1 |
> > FileCheck -check-prefix=CHECK-REROLL-LOOPS %s
> 
> > // RUN: %clang -### -S -freroll-loops -fno-reroll-loops %s 2>&1 |
> > FileCheck -check-prefix=CHECK-NO-REROLL-LOOPS %s
> 
> > // CHECK-REROLL-LOOPS: "-freroll-loops"
> 

> > ___
> 
> > cfe-commits mailing list
> 
> > cfe-commits@lists.llvm.org
> 
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> 

-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 


[PATCH] D25491: [libcxx] Use C++14 when building libc++ with musl

2016-10-12 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.



Comment at: CMakeLists.txt:327
 # Required flags ==
 set(LIBCXX_STANDARD_VER c++11 CACHE INTERNAL "internal option to change build 
dialect")
 add_compile_flags_if_supported(-std=${LIBCXX_STANDARD_VER})

EricWF wrote:
> phosek wrote:
> > EricWF wrote:
> > > Why not just set `LIBCXX_STANDARD_VER` differently instead of replacing 
> > > it after the fact?
> > I totally missed it; this change was a part of a downstream patch we were 
> > using for building Fuchsia toolchain and it predates this option. Using 
> > this option, I can override the dialect only for our build, which is 
> > perfectly fine for Fuchsia since we default to C++14. I'd be happy to 
> > abandon this patch unless you want to persist that setting for musl?
> Since we support MUSL it would be nice if libc++ built out of the box. Making 
> the option persistent for MUSL makes the most sense to me.
We should add a comment here, or wherever this logic ends up going, to 
explain why this is needed.


Repository:
  rL LLVM

https://reviews.llvm.org/D25491





[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2016-10-12 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

Ping.


https://reviews.llvm.org/D9403





[PATCH] D25491: [libcxx] Use C++14 when building libc++ with musl

2016-10-15 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D25491#571003, @phosek wrote:

> Ping, do you have any other comments?


Fine by me. Please wait for an okay from @EricWF.


Repository:
  rL LLVM

https://reviews.llvm.org/D25491





r319629 - Revert "[CodeGen] Add initial support for union members in TBAA"

2017-12-02 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Sat Dec  2 19:10:13 2017
New Revision: 319629

URL: http://llvm.org/viewvc/llvm-project?rev=319629&view=rev
Log:
Revert "[CodeGen] Add initial support for union members in TBAA"

This reverts commit r319413. See PR35503.

We can't use "union member" as the access type here like this.

Removed:
cfe/trunk/test/CodeGen/tbaa-union.cpp
Modified:
cfe/trunk/lib/CodeGen/CGExpr.cpp
cfe/trunk/lib/CodeGen/CodeGenModule.h
cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp
cfe/trunk/lib/CodeGen/CodeGenTBAA.h
cfe/trunk/test/CodeGen/union-tbaa1.c

Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=319629&r1=319628&r2=319629&view=diff
==
--- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGExpr.cpp Sat Dec  2 19:10:13 2017
@@ -3723,6 +3723,9 @@ LValue CodeGenFunction::EmitLValueForFie
   if (base.getTBAAInfo().isMayAlias() ||
   rec->hasAttr() || FieldType->isVectorType()) {
 FieldTBAAInfo = TBAAAccessInfo::getMayAliasInfo();
+  } else if (rec->isUnion()) {
+// TODO: Support TBAA for unions.
+FieldTBAAInfo = TBAAAccessInfo::getMayAliasInfo();
   } else {
 // If no base type been assigned for the base access, then try to generate
 // one for this base lvalue.
@@ -3733,26 +3736,16 @@ LValue CodeGenFunction::EmitLValueForFie
"Nonzero offset for an access with no base type!");
 }
 
-// All union members are encoded to be of the same special type.
-if (FieldTBAAInfo.BaseType && rec->isUnion())
-  FieldTBAAInfo = 
TBAAAccessInfo::getUnionMemberInfo(FieldTBAAInfo.BaseType,
- FieldTBAAInfo.Offset,
- FieldTBAAInfo.Size);
-
-// For now we describe accesses to direct and indirect union members as if
-// they were at the offset of their outermost enclosing union.
-if (!FieldTBAAInfo.isUnionMember()) {
-  // Adjust offset to be relative to the base type.
-  const ASTRecordLayout &Layout =
-  getContext().getASTRecordLayout(field->getParent());
-  unsigned CharWidth = getContext().getCharWidth();
-  if (FieldTBAAInfo.BaseType)
-FieldTBAAInfo.Offset +=
-Layout.getFieldOffset(field->getFieldIndex()) / CharWidth;
+// Adjust offset to be relative to the base type.
+const ASTRecordLayout &Layout =
+getContext().getASTRecordLayout(field->getParent());
+unsigned CharWidth = getContext().getCharWidth();
+if (FieldTBAAInfo.BaseType)
+  FieldTBAAInfo.Offset +=
+  Layout.getFieldOffset(field->getFieldIndex()) / CharWidth;
 
-  // Update the final access type.
-  FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType);
-}
+// Update the final access type.
+FieldTBAAInfo.AccessType = CGM.getTBAATypeInfo(FieldType);
   }
 
   Address addr = base.getAddress();

Modified: cfe/trunk/lib/CodeGen/CodeGenModule.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.h?rev=319629&r1=319628&r2=319629&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenModule.h (original)
+++ cfe/trunk/lib/CodeGen/CodeGenModule.h Sat Dec  2 19:10:13 2017
@@ -688,9 +688,8 @@ public:
   /// getTBAAInfoForSubobject - Get TBAA information for an access with a given
   /// base lvalue.
   TBAAAccessInfo getTBAAInfoForSubobject(LValue Base, QualType AccessType) {
-TBAAAccessInfo TBAAInfo = Base.getTBAAInfo();
-if (TBAAInfo.isMayAlias() || TBAAInfo.isUnionMember())
-  return TBAAInfo;
+if (Base.getTBAAInfo().isMayAlias())
+  return TBAAAccessInfo::getMayAliasInfo();
 return getTBAAAccessInfo(AccessType);
   }
 

Modified: cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp?rev=319629&r1=319628&r2=319629&view=diff
==
--- cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenTBAA.cpp Sat Dec  2 19:10:13 2017
@@ -74,10 +74,6 @@ llvm::MDNode *CodeGenTBAA::getChar() {
   return Char;
 }
 
-llvm::MDNode *CodeGenTBAA::getUnionMemberType(uint64_t Size) {
-  return createScalarTypeNode("union member", getChar(), Size);
-}
-
 static bool TypeHasMayAlias(QualType QTy) {
   // Tagged types have declarations, and therefore may have attributes.
   if (const TagType *TTy = dyn_cast<TagType>(QTy))
@@ -105,8 +101,9 @@ static bool isValidBaseType(QualType QTy
   return false;
 if (RD->hasFlexibleArrayMember())
   return false;
-// For now, we do not allow interface classes to be base access types.
-if (RD->isStruct() || RD->isClass() || RD->isUnion())
+// RD can be struct, union, class, interface or enum.
+// For now, 

Re: [PATCH] D24933: Enable configuration files in clang

2017-08-05 Thread Hal Finkel via cfe-commits


On 07/24/2017 10:18 AM, Serge Pavlov wrote:
I am thinking about reducing the patch further, leaving only the 
ability to include a config file when clang is called as 
`target-clang-drivermode`. It is still useful for cross-compilation 
tasks because:

- It is a convenient way to switch between supported targets,
- An SDK producer can ship the compiler with a set of appropriate 
options or prepare them during installation.

In this case, if clang is called as `target-clang-drivermode`, it first 
tries to find the file `target-drivermode.cfg` or `target.cfg` in a set 
of well-known directories, which in the minimal case includes the 
directory where the clang executable resides. If such a file is found, 
options are read from it; otherwise only the option --target is added, 
as clang does now.
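
The lookup order described above can be sketched as follows. This is only an illustration, not code from the patch: the helper name `configCandidates` and its parameters are hypothetical, and it assumes the target prefix and driver-mode suffix have already been parsed from the executable name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of D24933): list the config file names
// to try, in order, for an executable named `target-clang-drivermode`.
std::vector<std::string> configCandidates(const std::string &target,
                                          const std::string &driverMode) {
  std::vector<std::string> names;
  if (target.empty())
    return names; // no target prefix, so there is nothing to look for
  if (!driverMode.empty())
    names.push_back(target + "-" + driverMode + ".cfg"); // most specific first
  names.push_back(target + ".cfg");                      // then the plain target
  return names;
}
```

Each returned name would then be searched for in the well-known directories; if none is found, only --target is added, as described.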


This solution has obvious drawbacks:
- The user cannot specify a config file on the command line in the same 
way as a target can be chosen: `clang --target `,
- On Windows symlinks are implemented as file copies, so the solution 
looks awkward.
So a more or less complete solution needs to allow specifying the 
config file on the command line.


I'd rather not reduce the patch in this way, and you didn't describe why 
you're considering reducing the patch. Can you please elaborate?




Using `@file` has some problems. A config file is merely a set of 
options, just like a file included by `@file`. A different include-file 
search is only a convenience and could be sacrificed. Comments and 
unused-option warning suppression could be extended to all files 
included with `@file`. The real problem is the search path. To be 
useful, config files must be searched for in well-known directories, 
so that the meaning of `clang @config_file` does not depend on the 
current directory. So clang must have some rule to distinguish between 
a config file and the traditional use of `@file`. For instance, if the 
file name ends with `.cfg` and there is a file with this name in the 
config search directories, this is a config file and it is interpreted 
a bit differently. Of course, the file may be specified with a full 
path, but this is inconvenient.


I see no reason why we can't unify the processing but have different 
search-path rules for @file vs. --config file.




Another possible solution is to extend the meaning of `--target` so 
that it fully matches the use of `target-clang-drivermode`; that is, 
the option `--target=hexagon` causes clang to look first for the file 
`hexagon.cfg` in well-known directories and use it if found. In this 
case the treatment of `--target` differs depending on whether the 
option is specified on the command line or inside a config file (in 
the latter case it is processed as a target name only), which may be 
confusing. Besides, the use of config files is not restricted to the 
choice of target.


I think we should do this, so long as the implementation is reasonable, 
and the special case doesn't bother me in this regard. I don't view this 
as a replacement for '--config file', however, because, as you mention, 
the config files need not be restricted to target triples.


Thanks again,
Hal



Using a special option for config files carries no risk of 
compatibility breakage and does not change the meaning of existing 
options.



Thanks,
--Serge

2017-05-10 11:25 GMT+07:00 Serge Pavlov:


2017-05-10 3:46 GMT+07:00 Richard Smith <rich...@metafoo.co.uk>:

On 1 March 2017 at 02:50, Serge Pavlov via Phabricator
<revi...@reviews.llvm.org> wrote:


The format of a configuration file is similar to that of a file used
in the construct `@file`: it is a set of options.
A configuration file has advantages over this construct:

- it is searched for in well-known places rather than in
current directory,


This (and suppressing unused-argument warnings) might well be
sufficient to justify a different command-line syntax rather
than @file...


Construct `@file` in this implementation is used only to read
parts of config file inside containing file. Driver knows that it
processes config file and can adjust treatment of `@file`. On the
other hand, driver might parse config files in a more complicated
way, for instance, it could treat line `# include(file_name)` as a
command to include another file.

- it may contain comments, long options may be split
between lines using trailing backslashes,
- other files may be included by `@file` and they will be
resolved relative to the including file,


... but I think we should just add these extensions to our
@file handling, and then use the exact same syntax and code to
handle config files and @file files. That is, the difference
between @ and --config would be that the latter looks in a
different directory and suppresses "unused argument" warnings,
but they would otherwise be identical.


Changing treatm

Re: [PATCH] D24933: Enable configuration files in clang

2017-08-06 Thread Hal Finkel via cfe-commits


On 08/06/2017 01:15 PM, Serge Pavlov wrote:
2017-08-06 6:43 GMT+07:00 Hal Finkel:


On 07/24/2017 10:18 AM, Serge Pavlov wrote:


I am thinking about reducing the patch further to leave only the
ability to include config file when clang is called as
`target-clang-drivermode`. It is still useful for cross
compilation tasks because:
- It is a convenient way to switch between supported targets,
- SDK producer can ship compiler with a set of appropriate
options or prepare them during installation.
In this case if clang is called as `target-clang-drivermode`, it
first tries to find file `target-drivermode.cfg` or `target.cfg`
in a set of well-known directories, which in minimal case
includes the directory where clang executable resides. If such
file is found, options are  read from it, otherwise only option
--target is added as clang does it now.

This solution has obvious drawbacks:
- User cannot specify config file in command line in the same way
as he can choose a target: `clang --target `,
- On Windows symlinks are implemented as file copy, the solution
looks awkward.
So more or less complete solution needs to allow specifying
config file in command line.


I'd rather not reduce the patch in this way, and you didn't
describe why you're considering reducing the patch. Can you please
elaborate?


The only intent was to facilitate review process.


As someone who's worked on reviewing the patches, I don't think this 
makes things any easier or harder. Once we decide on what we want to do, 
the rest of the review process should be straightforward.




Using `@file` has some problems. Config file is merely a set of
options, just as file included by `@file`. Different include file
search is only a convenience and could be sacrificed. Comments
and unused option warning suppression could be extended for all
files included with `@file`. The real problem is the search path.
To be useful, config files must be searched for in well-known
directories, so that meaning of `clang @config_file` does not
depend on the current directory. So clang must have some rule to
distinguish between config file and traditional use of `@file`.
For instance, if file name ends with `.cfg` and there is a file
with this name in config search directories, this is a config
file and it is interpreted a bit differently. Of course, the file
may be specified with full path, but this way is inconvenient.


I see no reason why we can't unify the processing but have
different search-path rules for @file vs. --config file.


Now I think we can use @file without breaking compatibility.

libiberty always resolves `file` in `@file` relative to the current 
directory. If no such file is found, it tries to open a file named 
`@file`. We must keep this behavior for the sake of compatibility. If 
`file` is still not found after these steps and `file` does not contain 
a directory separator, clang could try to treat `file` as a config file 
and search for it using a special search path. If such a solution is 
acceptable, we can get rid of `--config`.
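
A minimal sketch of the three-step resolution proposed above, to make the precedence explicit. All names here (`resolveAtFile`, `Kind`, `Resolved`) are hypothetical and not taken from the patch or from libiberty; the `exists` callback stands in for a real file-system check so the order of the steps can be examined in isolation.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// How the argument `@file` ended up being interpreted (hypothetical names).
enum class Kind { ResponseFile, LiteralInput, ConfigFile };

struct Resolved {
  Kind kind;
  std::string path;
};

std::optional<Resolved> resolveAtFile(
    const std::string &file, const std::vector<std::string> &configDirs,
    const std::function<bool(const std::string &)> &exists) {
  // Step 1: existing behavior, `file` relative to the current directory.
  if (exists(file))
    return Resolved{Kind::ResponseFile, file};
  // Step 2: existing behavior, a file literally named `@file`.
  const std::string literal = "@" + file;
  if (exists(literal))
    return Resolved{Kind::LiteralInput, literal};
  // Step 3: proposed extension, only for names without a directory separator.
  if (file.find_first_of("/\\") == std::string::npos) {
    for (const std::string &dir : configDirs) {
      const std::string candidate = dir + "/" + file;
      if (exists(candidate))
        return Resolved{Kind::ConfigFile, candidate};
    }
  }
  return std::nullopt;
}
```

In this sketch a file named `foo` in the current directory always wins over a config file `foo` in the search path, because step 1 runs first.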


I think that I'd prefer --config to this scheme. For one thing, it means 
that if I have a wrapper script that adds --config foo, this will break 
if the user happens to have a file named foo in their directory. I think 
that unifying the implementation of @foo and --config foo is a good 
idea, but combining them all into the same interface is not obviously 
optimal.


Thanks again,
Hal




Another possible solution is to extend meaning of `--target` so
that it fully matches with the use of `target-clang-drivermode`,
that is the option `--target=hexagon` causes clang first to look
for the file `hexagon.cfg` in well-known directories and use it
if found. In this case treatment of `--target` is different if
the option is specified in command line or in the content of
config file (in the latter case it is processed as target name
only), it may be confusing. Besides, use of config files is not
restricted to the choice of target.


I think we should do this, so long as the implementation is
reasonable, and the special case doesn't bother me in this regard.
I don't view this as a replacement for '--config file', however,
because, as you mention, the config files need not be restricted
to target triples.


Different treatment of `--target` in a config file and on the command 
line is still a concern; whether to do this depends on which looks 
more intuitive. I would try implementing it as a separate patch.


Thanks,
--Serge


Thanks again,
Hal



Using special option for config files does not bring risk of
compatibility breakage and does not change meaning of existing
options.


Thanks,
--Serge

2017-05-10 11:25 GMT+07:00 Serge Pavlov <sepavl...@gmail.com>:

 

r311041 - Base optimization-record file names on the final output

2017-08-16 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Wed Aug 16 14:34:27 2017
New Revision: 311041

URL: http://llvm.org/viewvc/llvm-project?rev=311041&view=rev
Log:
Base optimization-record file names on the final output

Using Output.getFilename() to construct the file name used for optimization
recording in Clang::ConstructJob, when -c is provided, does not work correctly
if we're not using the integrated assembler. With -no-integrated-as (or
-save-temps) Output.getFilename() gives the name of the temporary assembly
file, not the final output file. Instead, use the final output (as provided by
-o). If this is not available, then fall back to using a name based on the
input file.

Fixes PR31532.
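
The selection logic described in this log message can be sketched in standalone C++ as follows. This is an illustration under stated assumptions, not the code in Clang.cpp: `recordBaseName` and `stem` are hypothetical names, `stem` is a simplified stand-in for `llvm::sys::path::stem`, and the sketch covers only the base name, not any suffix the driver appends afterwards.

```cpp
#include <optional>
#include <string>

// Simplified stand-in for llvm::sys::path::stem: drop the directory part
// and the last extension.
std::string stem(const std::string &path) {
  const std::size_t slash = path.find_last_of("/\\");
  const std::string name =
      slash == std::string::npos ? path : path.substr(slash + 1);
  const std::size_t dot = name.find_last_of('.');
  if (dot == std::string::npos || dot == 0)
    return name;
  return name.substr(0, dot);
}

// Hypothetical helper mirroring the new control flow: prefer the value of
// -o when -c or -S is given, otherwise fall back to the input file's stem.
std::string recordBaseName(bool hasCOrS,
                           const std::optional<std::string> &finalOutput,
                           const std::string &input) {
  std::string f;
  if (hasCOrS && finalOutput)
    f = *finalOutput;
  if (f.empty())
    f = stem(input);
  return f;
}
```

Because the name no longer comes from the job's own output, it does not change when that output is a temporary assembly file.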

Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/test/Driver/opt-record.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=311041&r1=311040&r2=311041&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Wed Aug 16 14:34:27 2017
@@ -4261,10 +4261,13 @@ void Clang::ConstructJob(Compilation &C,
   CmdArgs.push_back(A->getValue());
 } else {
   SmallString<128> F;
-  if (Output.isFilename() && (Args.hasArg(options::OPT_c) ||
-  Args.hasArg(options::OPT_S))) {
-F = Output.getFilename();
-  } else {
+
+  if (Args.hasArg(options::OPT_c) || Args.hasArg(options::OPT_S)) {
+if (Arg *FinalOutput = Args.getLastArg(options::OPT_o))
+  F = FinalOutput->getValue();
+  }
+
+  if (F.empty()) {
 // Use the input filename.
 F = llvm::sys::path::stem(Input.getBaseInput());
 

Modified: cfe/trunk/test/Driver/opt-record.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/opt-record.c?rev=311041&r1=311040&r2=311041&view=diff
==
--- cfe/trunk/test/Driver/opt-record.c (original)
+++ cfe/trunk/test/Driver/opt-record.c Wed Aug 16 14:34:27 2017
@@ -1,6 +1,13 @@
 // RUN: %clang -### -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -c -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -no-integrated-as -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -no-integrated-as -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -save-temps -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
+// RUN: %clang -### -save-temps -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -no-integrated-as -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
+// RUN: %clang -### -save-temps -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
 // RUN: %clang -### -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
 // RUN: %clang -### -S -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV
 // RUN: %clang -### -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r311043 - Don't use -no-integrated-as in test/Driver/opt-record.c

2017-08-16 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Wed Aug 16 14:51:31 2017
New Revision: 311043

URL: http://llvm.org/viewvc/llvm-project?rev=311043&view=rev
Log:
Don't use -no-integrated-as in test/Driver/opt-record.c

-no-integrated-as is not supported on some targets (e.g.,
x86_64-pc-windows-msvc). Testing using -save-temps is good enough to cover the
relevant logic, and that should work everywhere.

Modified:
cfe/trunk/test/Driver/opt-record.c

Modified: cfe/trunk/test/Driver/opt-record.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/opt-record.c?rev=311043&r1=311042&r2=311043&view=diff
==
--- cfe/trunk/test/Driver/opt-record.c (original)
+++ cfe/trunk/test/Driver/opt-record.c Wed Aug 16 14:51:31 2017
@@ -1,12 +1,9 @@
 // RUN: %clang -### -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -c -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
-// RUN: %clang -### -no-integrated-as -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
-// RUN: %clang -### -no-integrated-as -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -save-temps -S -o FOO -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -save-temps -c -o FOO.o -fsave-optimization-record %s 2>&1 | FileCheck %s
 // RUN: %clang -### -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
-// RUN: %clang -### -no-integrated-as -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
 // RUN: %clang -### -save-temps -c -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
 // RUN: %clang -### -fsave-optimization-record %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O
 // RUN: %clang -### -S -fsave-optimization-record -x cuda -nocudainc -nocudalib %s 2>&1 | FileCheck %s -check-prefix=CHECK-NO-O -check-prefix=CHECK-CUDA-DEV




Re: [PATCH] D36562: [Bitfield] Make the bitfield a separate location if it has width of legal integer type and its bit offset is naturally aligned for the type

2017-08-22 Thread Hal Finkel via cfe-commits


On 08/22/2017 09:18 PM, Xinliang David Li via llvm-commits wrote:



On Tue, Aug 22, 2017 at 7:10 PM, Chandler Carruth via llvm-commits 
<llvm-comm...@lists.llvm.org> wrote:


On Tue, Aug 22, 2017 at 7:03 PM Xinliang David Li via cfe-commits
<cfe-commits@lists.llvm.org> wrote:

On Tue, Aug 22, 2017 at 6:37 PM, Chandler Carruth via
Phabricator <revi...@reviews.llvm.org> wrote:

chandlerc added a comment.

I'm really not a fan of the degree of complexity and
subtlety that this introduces into the frontend, all to
allow particular backend optimizations.

I feel like this is Clang working around a fundamental
deficiency in LLVM and we should instead find a way to fix
this in LLVM itself.

As has been pointed out before, user code can synthesize
large integers that small bit sequences are extracted
from, and Clang and LLVM should handle those just as well
as actual bitfields.

Can we see how far we can push the LLVM side before we add
complexity to Clang here? I understand that there remain
challenges to LLVM's stuff, but I don't think those
challenges make *all* of the LLVM improvements off the
table, I don't think we've exhausted all ways of improving
the LLVM changes being proposed, and I think we should
still land all of those and re-evaluate how important
these issues are when all of that is in place.


The main challenge of doing this in LLVM is that
inter-procedural analysis (and possibly cross-module analysis) is
needed (for store-forwarding issues).

Wei, perhaps you can provide a concrete test case to illustrate
the issue so that reviewers have a good understanding.


It doesn't seem like all options for addressing that have been
exhausted. And even then, I feel like trying to fix this with
non-obvious (to the programmer) frontend heuristics isn't a good
solution. I actually *prefer* the source work around of "don't use
a bitfield if you *must* have narrow width access across modules
where the optimizer cannot see enough to narrow them and you
happen to know that there is a legal narrow access that works".
Because that way the programmer has *control* over this rather
than being at the whim of whichever side of the heuristic they end
up on.



The source workaround solution *does not* scale. Most importantly, 
the user may not even be aware of the problem (and the performance 
loss) unless they compile the code with another compiler and notice 
the performance difference.


I agree with this, but it's not clear that this has to scale in that 
sense. I don't like basing this on the bitfield widths because it makes 
users pick between expressing semantic information and expressing target 
tuning information using the same construct. What if the optimal answer 
here is different on different platforms? I don't want to encourage 
users to ifdef their aggregates to sometimes be bitfields and sometimes 
not for tuning reasons. If need be, please add an attribute. Any 
heuristic that you pick here is going to help some cases and hurt 
others. If we're at the level of needing IPA to look at store-to-load 
forwarding effects, then we've really already lost. Either you need to 
actually do the IPA, or even in the backend, any heuristic that you 
choose will help some things and hurt others. Hopefully, we're not 
really there yet. I'm looking forward to seeing more examples of the 
kinds of problems you're trying to solve.
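
To make the "source workaround" discussed in this thread concrete, here is a hedged sketch. The struct names and field widths are invented for illustration and do not come from D36562; the exact layout of bitfields is implementation-defined, so the sketch shows only the source-level choice and makes no claim about generated code.

```cpp
#include <cstdint>

// Both fields are bitfields: the compiler decides how wide the memory
// access used to update `flag` is.
struct AllBitfields {
  std::uint32_t flag : 8;
  std::uint32_t count : 24;
};

// Source workaround: `flag` is an ordinary byte-sized member, so the
// program itself asks for a byte-sized object rather than relying on a
// frontend heuristic.
struct NarrowMember {
  std::uint8_t flag;
  std::uint32_t count : 24;
};

// Works for either struct; returns a copy with `flag` updated.
template <typename T> T withFlag(T value, std::uint8_t flag) {
  value.flag = flag;
  return value;
}
```

Both types hold the same values; the difference is that the second expresses a tuning decision in the type definition, which is the mixing of semantic and tuning information objected to above.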


Thanks again,
Hal



David


David



Repository:
  rL LLVM

https://reviews.llvm.org/D36562







___
llvm-commits mailing list
llvm-comm...@lists.llvm.org 
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits







--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



r289752 - Include SmallSet.h in BackendUtil.cpp

2016-12-14 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Wed Dec 14 20:19:17 2016
New Revision: 289752

URL: http://llvm.org/viewvc/llvm-project?rev=289752&view=rev
Log:
Include SmallSet.h in BackendUtil.cpp

BackendUtil.cpp uses llvm::SmallSet but did not include the header. It was
included indirectly, but this will change once the AssumptionCache is removed.
NFC.

Modified:
cfe/trunk/lib/CodeGen/BackendUtil.cpp

Modified: cfe/trunk/lib/CodeGen/BackendUtil.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/BackendUtil.cpp?rev=289752&r1=289751&r2=289752&view=diff
==
--- cfe/trunk/lib/CodeGen/BackendUtil.cpp (original)
+++ cfe/trunk/lib/CodeGen/BackendUtil.cpp Wed Dec 14 20:19:17 2016
@@ -14,6 +14,7 @@
 #include "clang/Frontend/CodeGenOptions.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
 #include "clang/Frontend/Utils.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/ADT/Triple.h"




Re: r291123 - CodeGen: plumb header search down to the IAS

2017-01-06 Thread Hal Finkel via cfe-commits


On 01/05/2017 08:30 PM, Eric Christopher via cfe-commits wrote:
Ok, thanks. I agree that it's a problem. I'm definitely open for 
testing ideas here. There are a few other things in the 
TargetOptions/MCTargetOptions area that are already problematic to test.


I think that we need to add serialization for these structures, and a 
printing option for them, so that we can test these kinds of things. 
That having been said, a lot of these things need to end up in 
attributes so that they work correctly with LTO. Is this one of them?


 -Hal



-eric

On Thu, Jan 5, 2017 at 6:27 PM Saleem Abdulrasool 
<compn...@compnerd.org> wrote:


This was certainly the problem that I had.  The test really needs
a way to check that the field was set.  As you state, this is a
problematic area.  The backend already has a test to ensure that
the paths are honored, but, I didn't see any way to actually
ensure that it was getting sent to the backend otherwise.

The module itself doesn't encode the search path, nor is the
information in the command line. I can see the argument that the
test itself doesn't add much value, especially with the backend
side testing that the processing of the inclusion does occur
correctly. I'll go ahead and remove the test (which has already
ended up being a pain to test).

On Thu, Jan 5, 2017 at 6:11 PM, Eric Christopher
<echri...@gmail.com> wrote:

Hi Saleem,

Love that you wanted to add a test for it, but I'd really
prefer that you not engage the backend here in order to do it.
You can verify some of it from the backend and just that the
module is correct via the front end if you'd like. Ensuring
the paths are correct is a bit of a sticky problem, but this
is an API boundary that we just have problems with.

TL;DR: Would you mind splitting this test into front end and
back end tests and avoid using the backend in clang's test
harness?

Thanks!

-eric

On Thu, Jan 5, 2017 at 8:13 AM Saleem Abdulrasool via
cfe-commits <cfe-commits@lists.llvm.org> wrote:

Author: compnerd
Date: Thu Jan  5 10:02:32 2017
New Revision: 291123

URL: http://llvm.org/viewvc/llvm-project?rev=291123&view=rev
Log:
CodeGen: plumb header search down to the IAS

inline assembly may use the `.include` directive to include other
content into the file.  Without the integrated assembler, the `-I` group
gets passed to the assembler.  Emulate this by collecting the header
search paths and passing them to the IAS.

Resolves PR24811!

Added:
cfe/trunk/test/CodeGen/include/
cfe/trunk/test/CodeGen/include/function.x
cfe/trunk/test/CodeGen/include/module.x
cfe/trunk/test/CodeGen/inline-asm-inclusion.c
Modified:
cfe/trunk/include/clang/CodeGen/BackendUtil.h
cfe/trunk/lib/CodeGen/BackendUtil.cpp
cfe/trunk/lib/CodeGen/CodeGenAction.cpp
cfe/trunk/lib/CodeGen/ObjectFilePCHContainerOperations.cpp

Modified: cfe/trunk/include/clang/CodeGen/BackendUtil.h
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/CodeGen/BackendUtil.h?rev=291123&r1=291122&r2=291123&view=diff

==
--- cfe/trunk/include/clang/CodeGen/BackendUtil.h (original)
+++ cfe/trunk/include/clang/CodeGen/BackendUtil.h Thu Jan  5 10:02:32 2017

@@ -21,6 +21,7 @@ namespace llvm {

 namespace clang {
   class DiagnosticsEngine;
+  class HeaderSearchOptions;
   class CodeGenOptions;
   class TargetOptions;
   class LangOptions;
@@ -34,7 +35,8 @@ namespace clang {
 Backend_EmitObj///< Emit native object files
   };

-  void EmitBackendOutput(DiagnosticsEngine &Diags, const CodeGenOptions &CGOpts,
+  void EmitBackendOutput(DiagnosticsEngine &Diags, const HeaderSearchOptions &,
+                         const CodeGenOptions &CGOpts,
                          const TargetOptions &TOpts, const LangOptions &LOpts,
                          const llvm::DataLayout &TDesc, llvm::Module *M,
                          BackendAction Action,

Modified: cfe/trunk/lib/CodeGen/BackendUtil.cpp
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/BackendUtil.cpp?rev=291123&r1=291122&r2=291123&view=diff

=

Re: [PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

2016-08-16 Thread Hal Finkel via cfe-commits
hfinkel added inline comments.


Comment at: lib/CodeGen/CGStmt.cpp:537
@@ +536,3 @@
+  llvm::LLVMContext::MD_noalias),
+  NewScopeList));
+

rjmccall wrote:
> This is a very strange representation.  Every memory operation in the lexical 
> block is annotated with a list of all of the scopes that were entered within 
> the block, even if they were entered after the operation.  But for some 
> reason, not with nested scopes?
> 
> What's the right patch for me to read about this representation?
Perhaps unfortunately, this is an artifact of the way that restrict is defined 
in C. It applies to all accesses in the block in which the variable is 
declared, even those before the declaration of the restrict-qualified local 
itself.

It should work with nested scopes, in the sense that we add these things as we 
complete each scope. So we add things to the inner scope, and then when we 
complete the outer scope, we go back over the instructions (including those in 
the inner scope because the scope recording recurses up the scope hierarchy), 
and adds the outer scopes - it concatenates them to any added by the inner 
(nested) scopes.

The noalias intrinsic's LangRef updates are in D9375.


https://reviews.llvm.org/D9403





Re: [PATCH] D18639: Use __builtin_isnan/isinf/isfinite in complex

2016-08-27 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D18639#515000, @hfinkel wrote:

> Updated to use scheme suggested by Marshall.


Ping.


https://reviews.llvm.org/D18639





r280041 - [PowerPC] Add support for -mlongcall

2016-08-29 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Mon Aug 29 20:07:03 2016
New Revision: 280041

URL: http://llvm.org/viewvc/llvm-project?rev=280041&view=rev
Log:
[PowerPC] Add support for -mlongcall

Add support for GCC's PowerPC -mlongcall option; the backend supports the
corresponding target feature as of r280040.

Fixes PR19098.

Modified:
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/test/Driver/ppc-features.cpp

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=280041&r1=280040&r2=280041&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Mon Aug 29 20:07:03 2016
@@ -1598,6 +1598,10 @@ def mfloat128: Flag<["-"], "mfloat128">,
     Group<m_ppc_Features_Group>;
 def mno_float128 : Flag<["-"], "mno-float128">,
     Group<m_ppc_Features_Group>;
+def mlongcall: Flag<["-"], "mlongcall">,
+    Group<m_ppc_Features_Group>;
+def mno_longcall : Flag<["-"], "mno-longcall">,
+    Group<m_ppc_Features_Group>;
 
 def faltivec : Flag<["-"], "faltivec">, Group, Flags<[CC1Option]>,
   HelpText<"Enable AltiVec vector initializer syntax">;

Modified: cfe/trunk/test/Driver/ppc-features.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/ppc-features.cpp?rev=280041&r1=280040&r2=280041&view=diff
==
--- cfe/trunk/test/Driver/ppc-features.cpp (original)
+++ cfe/trunk/test/Driver/ppc-features.cpp Mon Aug 29 20:07:03 2016
@@ -163,6 +163,12 @@
 // RUN: %clang -target powerpc64-unknown-linux-gnu %s -mno-crbits -mcrbits -### -o %t.o 2>&1 | FileCheck -check-prefix=CHECK-CRBITS %s
 // CHECK-CRBITS: "-target-feature" "+crbits"
 
+// RUN: %clang -target powerpc64-unknown-linux-gnu %s -mno-longcall -### -o %t.o 2>&1 | FileCheck -check-prefix=CHECK-NOLONGCALL %s
+// CHECK-NOLONGCALL: "-target-feature" "-longcall"
+
+// RUN: %clang -target powerpc64-unknown-linux-gnu %s -mno-longcall -mlongcall -### -o %t.o 2>&1 | FileCheck -check-prefix=CHECK-LONGCALL %s
+// CHECK-LONGCALL: "-target-feature" "+longcall"
+
 // RUN: %clang -target powerpc64-unknown-linux-gnu %s -mno-invariant-function-descriptors -### -o %t.o 2>&1 | FileCheck -check-prefix=CHECK-NOINVFUNCDESC %s
 // CHECK-NOINVFUNCDESC: "-target-feature" "-invariant-function-descriptors"
 




r280053 - [PowerPC] Update the DWARF register-size table

2016-08-29 Thread Hal Finkel via cfe-commits
Author: hfinkel
Date: Mon Aug 29 21:38:34 2016
New Revision: 280053

URL: http://llvm.org/viewvc/llvm-project?rev=280053&view=rev
Log:
[PowerPC] Update the DWARF register-size table

The PPC64 DWARF register-size table did not match the ABI specification (or
GCC, for that matter). Fix that, and add a regression test.

Fixes PR27931.
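
As an illustration of the updated table, the following standalone sketch fills a plain array with the per-register sizes named in the diff below. It is not the Clang code: `assignRange` and `ppc64DwarfRegSizes` are hypothetical stand-ins for `AssignToArrayRange` and `PPC64_initDwarfEHRegSizeTable`, and the entry for registers 0-31 is assumed to be 8 bytes (the general-purpose registers), since that line lies outside the hunks shown.

```cpp
#include <array>
#include <cstdint>

using RegSizeTable = std::array<std::uint8_t, 117>;

// Stand-in for AssignToArrayRange: set entries first..last (inclusive).
void assignRange(RegSizeTable &table, unsigned first, unsigned last,
                 std::uint8_t size) {
  for (unsigned i = first; i <= last; ++i)
    table[i] = size;
}

RegSizeTable ppc64DwarfRegSizes() {
  RegSizeTable table{};
  assignRange(table, 0, 31, 8);    // r0-31 (assumed; not in the hunks shown)
  assignRange(table, 32, 63, 8);   // fp0-31
  assignRange(table, 64, 67, 8);   // mq, lr, ctr, ap: now 8 bytes
  assignRange(table, 68, 76, 4);   // cr0-7, xer: still 4 bytes
  assignRange(table, 77, 108, 16); // v0-31
  assignRange(table, 109, 116, 8); // 109-113 widened, plus tfhar, tfiar, texasr
  return table;
}
```

Compared with the old table, entries 64-67 and 109-113 change from 4 to 8 bytes, and entries 114-116 are new.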

Added:
cfe/trunk/test/CodeGen/ppc64-dwarf.c
Modified:
cfe/trunk/lib/CodeGen/TargetInfo.cpp

Modified: cfe/trunk/lib/CodeGen/TargetInfo.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/TargetInfo.cpp?rev=280053&r1=280052&r2=280053&view=diff
==
--- cfe/trunk/lib/CodeGen/TargetInfo.cpp (original)
+++ cfe/trunk/lib/CodeGen/TargetInfo.cpp Mon Aug 29 21:38:34 2016
@@ -4413,14 +4413,17 @@ PPC64_initDwarfEHRegSizeTable(CodeGen::C
   // 32-63: fp0-31, the 8-byte floating-point registers
   AssignToArrayRange(Builder, Address, Eight8, 32, 63);
 
-  // 64-76 are various 4-byte special-purpose registers:
+  // 64-67 are various 8-byte special-purpose registers:
   // 64: mq
   // 65: lr
   // 66: ctr
   // 67: ap
+  AssignToArrayRange(Builder, Address, Eight8, 64, 67);
+
+  // 68-76 are various 4-byte special-purpose registers:
   // 68-75 cr0-7
   // 76: xer
-  AssignToArrayRange(Builder, Address, Four8, 64, 76);
+  AssignToArrayRange(Builder, Address, Four8, 68, 76);
 
   // 77-108: v0-31, the 16-byte vector registers
   AssignToArrayRange(Builder, Address, Sixteen8, 77, 108);
@@ -4430,7 +4433,10 @@ PPC64_initDwarfEHRegSizeTable(CodeGen::C
   // 111: spe_acc
   // 112: spefscr
   // 113: sfp
-  AssignToArrayRange(Builder, Address, Four8, 109, 113);
+  // 114: tfhar
+  // 115: tfiar
+  // 116: texasr
+  AssignToArrayRange(Builder, Address, Eight8, 109, 116);
 
   return false;
 }
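
For reference, the register-size layout after this change can be sketched as a plain lookup table. This is a standalone illustration, not Clang code: `makePPC64DwarfRegSizes` is a hypothetical name, and the 8-byte size for columns 0-31 is inferred from the first CHECK lines of the test added by this commit rather than from the hunks above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: byte size of each PPC64 DWARF EH register column,
// mirroring the ranges passed to AssignToArrayRange in the patch.
std::array<std::uint8_t, 117> makePPC64DwarfRegSizes() {
  std::array<std::uint8_t, 117> sizes{};
  // Fill the inclusive range [first, last] with one size.
  auto fill = [&sizes](int first, int last, std::uint8_t size) {
    for (int i = first; i <= last; ++i)
      sizes[i] = size;
  };
  fill(0, 31, 8);     // r0-31 (assumed from the test's CHECK lines)
  fill(32, 63, 8);    // fp0-31
  fill(64, 67, 8);    // mq, lr, ctr, ap: 8 bytes after the fix
  fill(68, 76, 4);    // cr0-7, xer: still 4 bytes
  fill(77, 108, 16);  // v0-31
  fill(109, 116, 8);  // columns 109-116, assigned 8 bytes by the patch
  return sizes;
}
```

The point of the sketch is the split at column 68: before the fix, columns 64-67 were grouped with the 4-byte condition registers.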

Added: cfe/trunk/test/CodeGen/ppc64-dwarf.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/ppc64-dwarf.c?rev=280053&view=auto
==
--- cfe/trunk/test/CodeGen/ppc64-dwarf.c (added)
+++ cfe/trunk/test/CodeGen/ppc64-dwarf.c Mon Aug 29 21:38:34 2016
@@ -0,0 +1,129 @@
+// RUN: %clang_cc1 -triple powerpc64-unknown-unknown -emit-llvm %s -o - | 
FileCheck %s
+static unsigned char dwarf_reg_size_table[1024];
+
+int test() {
+  __builtin_init_dwarf_reg_size_table(dwarf_reg_size_table);
+
+  return __builtin_dwarf_sp_column();
+}
+
+// CHECK-LABEL: define signext i32 @test()
+// CHECK:  store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 0), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 1), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 2), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 3), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 4), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 5), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 6), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 7), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 8), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 9), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 10), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 11), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 12), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 13), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 14), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 15), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 16), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 17), align 1
+// CHECK-NEXT: store i8 8, i8* getelementptr inbounds ([1024 x i8], [1024 x 
i8]* @dwarf_reg_size_table, i32 0, i32 18), align 1
+// CHECK-NEXT: store i8 8, i8* getelementpt

Re: [PATCH] D18639: Use __builtin_isnan/isinf/isfinite in complex

2016-09-03 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D18639#527285, @hfinkel wrote:

> In https://reviews.llvm.org/D18639#515000, @hfinkel wrote:
>
> > Updated to use scheme suggested by Marshall.
>
>
> Ping.


Ping.


https://reviews.llvm.org/D18639





Re: r253269 - Make FP_CONTRACT ON the default.

2016-09-06 Thread Hal Finkel via cfe-commits
Hi Steve, et al.,

It looks like this crasher was fixed in r254573. Should we move forward with 
recommitting this now?

Thanks again,
Hal

- Original Message -
> From: "Manuel Klimek" 
> To: "Hal Finkel" , "Renato Golin" 
> Cc: "Clang Commits" 
> Sent: Tuesday, November 17, 2015 9:47:11 AM
> Subject: Re: r253269 - Make FP_CONTRACT ON the default.
> 
> 
> Reverted in r253337. Failing test case in commit message.
> 
> 
> 
> On Tue, Nov 17, 2015 at 4:39 PM Manuel Klimek < kli...@google.com >
> wrote:
> 
> Repro:
> float foo(float U, float base, float cell) { return (U = 2 * base) -
> cell; }
> Preparing rollback of the CL.
> 
> 
> On Tue, Nov 17, 2015 at 2:46 PM Manuel Klimek < kli...@google.com >
> wrote:
> 
> 
> 
> Note that due to this change we're hitting an assert at
> lib/CodeGen/CGExprScalar.cpp:2570 in llvm::Value
> *tryEmitFMulAdd(const (anonymous namespace)::BinOpInfo &, const
> clang::CodeGen::CodeGenFunction &, clang::CodeGen::CGBuilderTy &,
> bool): LHSBinOp->getNumUses(
> ) == 0 && "Operations with multiple uses shouldn't be contracted."
> 
> 
> Don't have a small repro yet :(
> 
> 
> On Tue, Nov 17, 2015 at 1:39 PM Hal Finkel via cfe-commits <
> cfe-commits@lists.llvm.org > wrote:
> 
> 
> - Original Message -
> > From: "Renato Golin via cfe-commits" < cfe-commits@lists.llvm.org >
> > To: "Stephen Canon" < sca...@apple.com >
> > Cc: "Clang Commits" < cfe-commits@lists.llvm.org >
> > Sent: Tuesday, November 17, 2015 3:51:23 AM
> > Subject: Re: r253269 - Make FP_CONTRACT ON the default.
> > 
> > On 16 November 2015 at 23:09, Stephen Canon via cfe-commits
> > < cfe-commits@lists.llvm.org > wrote:
> > > Author: scanon
> > > Date: Mon Nov 16 17:09:11 2015
> > > New Revision: 253269
> > > 
> > > URL: http://llvm.org/viewvc/llvm-project?rev=253269&view=rev
> > > Log:
> > > Make FP_CONTRACT ON the default.
> > > 
> > > Differential Revision: D14200
> > 
> > Hi Stephen,
> > 
> > It seems your commit in the blame list is the only one that affects
> > AArch64 directly:
> > 
> > http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/2388
> > 
> > I haven't bisected yet, but would be good if you could try those
> > tests
> > locally, just to make sure it wasn't your commit, and revert if it
> > was, to fix offline.
> 
> The test suite already has logic to add -ffp-contract=off on PowerPC
> so that we can compare to the binary outputs. We may need to do this
> now for all targets, at least until we come up with a better
> solution.
> 
> -Hal
> 
> > 
> > cheers,
> > --renato
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


Re: [PATCH] D18639: Use __builtin_isnan/isinf/isfinite in complex

2016-09-16 Thread Hal Finkel via cfe-commits
hfinkel added a comment.

In https://reviews.llvm.org/D18639#533807, @hfinkel wrote:

> In https://reviews.llvm.org/D18639#527285, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D18639#515000, @hfinkel wrote:
> >
> > > Updated to use scheme suggested by Marshall.
> >
> >
> > Ping.
>
>
> Ping.


Ping.


https://reviews.llvm.org/D18639





Re: [PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-16 Thread Hal Finkel via cfe-commits


On 03/16/2017 07:40 PM, Eric Christopher wrote:



On Thu, Mar 16, 2017 at 5:37 PM Hal Finkel via Phabricator <revi...@reviews.llvm.org> wrote:


hfinkel added a comment.

In https://reviews.llvm.org/D30415#703398, @echristo wrote:

> Different suggestion:
>
> Remove the faltivec option. Even gcc doesn't support it anymore
afaict.


What are you suggesting? Always having the language extensions on?
Or explicitly tying the language extensions to the underlying
target feature?


I was thinking the latter.


Is that what GCC now does?

 -Hal



-eric

> (Go ahead and commit the zvector part if you'd like).
>
> -eric




https://reviews.llvm.org/D30415





--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-16 Thread Hal Finkel via cfe-commits


On 03/16/2017 08:11 PM, Eric Christopher wrote:



On Thu, Mar 16, 2017 at 5:45 PM Hal Finkel wrote:



On 03/16/2017 07:40 PM, Eric Christopher wrote:



On Thu, Mar 16, 2017 at 5:37 PM Hal Finkel via Phabricator <revi...@reviews.llvm.org> wrote:

hfinkel added a comment.

In https://reviews.llvm.org/D30415#703398, @echristo wrote:

> Different suggestion:
>
> Remove the faltivec option. Even gcc doesn't support it
anymore afaict.


What are you suggesting? Always having the language
extensions on? Or explicitly tying the language extensions to
the underlying target feature?


I was thinking the latter.


Is that what GCC now does?


That would be my guess given the option isn't listed anymore, but what 
it does is this:


echristo@dzur ~/s/gcc-git> grep -r faltivec *
gcc/testsuite/gcc.target/powerpc/stabs-attrib-vect-darwin.c:/* { dg-options "-gstabs+ -fno-eliminate-unused-debug-types -faltivec" } */
gcc/testsuite/ChangeLog-1993-2007:* g++.dg/ext/altivec-8.C: Use '-maltivec' instead of '-faltivec';
gcc/config/rs6000/darwin.h:   the kernel or some such. The "-faltivec" option should have been
gcc/config/rs6000/darwin.h:  %{faltivec:-maltivec -include altivec.h} %{fno-altivec:-mno-altivec} \
gcc/config/rs6000/darwin.h:  %
gcc/ChangeLog-2010:* config/rs6000/darwin.h (CC1_SPEC): Handle -faltivec and -fno-altivec.
gcc/ChangeLog-2010:* config/rs6000/darwin.opt (Waltivec-long-deprecated, faltivec,


and only on darwin. I don't see anything that treats the faltivec 
alias as anything language specific anywhere. It basically just says 
"pass the include and turn on maltivec".


At this point I'm pretty sure that -faltivec can just be ignored.


It is certainly fair to say that having the ability to use -fno-altivec 
was much more important when -faltivec included altivec.h (which 
injected names like vec_add into the global namespace). I'm fine with 
enabling the vector syntax extensions when targeting altivec is enabled 
(they're extensions to extensions anyway).


 -Hal



-eric

 -Hal




-eric

> (Go ahead and commit the zvector part if you'd like).
>
> -eric




https://reviews.llvm.org/D30415





-- 
Hal Finkel

Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [PATCH] D30415: Fix -mno-altivec cannot overwrite -maltivec option

2017-03-21 Thread Hal Finkel via cfe-commits


On 03/21/2017 12:19 PM, Ulrich Weigand via Phabricator wrote:

uweigand added a comment.

In https://reviews.llvm.org/D30415#705889, @echristo wrote:


In https://reviews.llvm.org/D30415#705196, @uweigand wrote:


Well, mainline GCC doesn't have -faltivec at all and never had, I think this 
was only an Apple GCC extension ...  Not sure what exactly the semantics of 
that was.


Sure it does and has for years. Check out rs6000/darwin.h :)

FWIW: It turns on maltivec and adds a -include of altivec.h


Huh, I wasn't aware of that feature on Darwin, thanks for pointing it out ...


Nearly all of the code in lib/Driver/ToolChains/Clang.cpp and 
lib/Driver/ToolChains/Arch/PPC.cpp that deal with altivec. Simplifying the 
interface by getting rid of needing to check multiple options.

But why would that code no longer be necessary for -maltivec?  Well, I guess 
I'll wait for your patch ...

If we indeed get rid of -faltivec, I'm wondering whether it would also make 
sense to get rid of -fzvector.  This is just an alias for -mzvector, and it 
isn't supported by GCC either.  I added it only because Richard Smith 
specifically asked for it when I contributed the feature here:
https://reviews.llvm.org/D11001


Perhaps what consistency giveth, consistency shall taketh away.

 -Hal




This should be a -f flag, not a -m flag. (I think we only support -maltivec for 
GCC compatibility.)




https://reviews.llvm.org/D30415





--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [libcxxabi] r296940 - Fix PR25874 - Detect features required for cxa_thread_atexit_test.pass.cpp

2017-04-12 Thread Hal Finkel via cfe-commits

Hi Eric,

This does not seem to do the right thing because, at this point, we have 
a fall-back implementation of __cxa_thread_atexit_impl (in 
src/cxa_thread_atexit.cpp), and this will be compiled if libc does not 
provide an implementation. Thus, the test will always pass (unless 
LIBCXXABI_ENABLE_THREADS is not defined, but we already check for that). 
I'm seeing this test unexpectedly pass on older systems. PR25874 should 
be fixed just by having the fallback implementation.


As a result, I think that we can just revert this entirely.

Thanks again,
Hal

On 03/03/2017 07:26 PM, Eric Fiselier via cfe-commits wrote:

Author: ericwf
Date: Fri Mar  3 19:26:41 2017
New Revision: 296940

URL: http://llvm.org/viewvc/llvm-project?rev=296940&view=rev
Log:
Fix PR25874 - Detect features required for cxa_thread_atexit_test.pass.cpp

Modified:
 libcxxabi/trunk/test/CMakeLists.txt
 libcxxabi/trunk/test/cxa_thread_atexit_test.pass.cpp
 libcxxabi/trunk/test/libcxxabi/test/config.py
 libcxxabi/trunk/test/lit.site.cfg.in

Modified: libcxxabi/trunk/test/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/test/CMakeLists.txt?rev=296940&r1=296939&r2=296940&view=diff
==
--- libcxxabi/trunk/test/CMakeLists.txt (original)
+++ libcxxabi/trunk/test/CMakeLists.txt Fri Mar  3 19:26:41 2017
@@ -18,6 +18,7 @@ pythonize_bool(LIBCXXABI_ENABLE_THREADS)
  pythonize_bool(LIBCXXABI_ENABLE_EXCEPTIONS)
  pythonize_bool(LIBCXXABI_USE_LLVM_UNWINDER)
  pythonize_bool(LIBCXXABI_BUILD_EXTERNAL_THREAD_LIBRARY)
+pythonize_bool(LIBCXXABI_HAS_CXA_THREAD_ATEXIT_IMPL)
  set(LIBCXXABI_TARGET_INFO "libcxx.test.target_info.LocalTI" CACHE STRING
  "TargetInfo to use when setting up test environment.")
  set(LIBCXXABI_EXECUTOR "None" CACHE STRING

Modified: libcxxabi/trunk/test/cxa_thread_atexit_test.pass.cpp
URL: 
http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/test/cxa_thread_atexit_test.pass.cpp?rev=296940&r1=296939&r2=296940&view=diff
==
--- libcxxabi/trunk/test/cxa_thread_atexit_test.pass.cpp (original)
+++ libcxxabi/trunk/test/cxa_thread_atexit_test.pass.cpp Fri Mar  3 19:26:41 
2017
@@ -10,6 +10,11 @@
  // UNSUPPORTED: libcxxabi-no-threads
  // REQUIRES: linux
  
+// this test will only work if CMake detects a real __cxa_thread_atexit_impl

+// at configure time. This function, however, was added only in glibc 2.18,
+// and there are still plenty of systems only using 2.17 (Ex RHEL 7).
+// XFAIL: libcxxabi-no-cxa-thread-atexit-impl
+
  #include 
  #include 
  


Modified: libcxxabi/trunk/test/libcxxabi/test/config.py
URL: 
http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/test/libcxxabi/test/config.py?rev=296940&r1=296939&r2=296940&view=diff
==
--- libcxxabi/trunk/test/libcxxabi/test/config.py (original)
+++ libcxxabi/trunk/test/libcxxabi/test/config.py Fri Mar  3 19:26:41 2017
@@ -45,6 +45,9 @@ class Configuration(LibcxxConfiguration)
  # test_exception_storage_nodynmem.pass.cpp fails under this specific 
configuration
  if self.get_lit_bool('cxx_ext_threads', False) and 
self.get_lit_bool('libcxxabi_shared', False):
  
self.config.available_features.add('libcxxabi-shared-externally-threaded')
+if not self.get_lit_bool('has_cxa_thread_atexit_impl', True):
+self.config.available_features.add(
+'libcxxabi-no-cxa-thread-atexit-impl')
  
  def configure_compile_flags(self):

  self.cxx.compile_flags += ['-DLIBCXXABI_NO_TIMER']

Modified: libcxxabi/trunk/test/lit.site.cfg.in
URL: 
http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/test/lit.site.cfg.in?rev=296940&r1=296939&r2=296940&view=diff
==
--- libcxxabi/trunk/test/lit.site.cfg.in (original)
+++ libcxxabi/trunk/test/lit.site.cfg.in Fri Mar  3 19:26:41 2017
@@ -20,6 +20,7 @@ config.host_triple  = "@LLVM
  config.target_triple= "@TARGET_TRIPLE@"
  config.use_target   = len("@LIBCXXABI_TARGET_TRIPLE@") > 0
  config.cxx_ext_threads  = "@LIBCXXABI_BUILD_EXTERNAL_THREAD_LIBRARY@"
+config.has_cxa_thread_atexit_impl = "@LIBCXXABI_HAS_CXA_THREAD_ATEXIT_IMPL@"
  
  # Let the main config do the real work.

  lit_config.load_config(config, "@LIBCXXABI_SOURCE_DIR@/test/lit.cfg")




--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [cfe-commits] r164177 - in /cfe/trunk: lib/Driver/ test/Driver/ test/Driver/Inputs/freescale_ppc_tree/ test/Driver/Inputs/freescale_ppc_tree/lib/ test/Driver/Inputs/freescale_ppc_tree/usr/ test/Dr

2017-02-09 Thread Hal Finkel via cfe-commits


On 02/08/2017 07:21 PM, Chandler Carruth wrote:

It's blast from the past time!

On Tue, Sep 18, 2012 at 3:28 PM Hal Finkel wrote:


Author: hfinkel
Date: Tue Sep 18 17:25:07 2012
New Revision: 164177

URL: http://llvm.org/viewvc/llvm-project?rev=164177&view=rev
Log:
Add C/C++ header locations for the Freescale SDK.

The Freescale SDK is based on OpenEmbedded, and this might be useful
for other OpenEmbedded-based configurations as well.

With minor modifications, patch by Tobias von Koch!

Added:
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/lib/
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/lib/.keep
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crt1.o
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crti.o
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crtn.o
cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/powerpc-fsl-linux/

cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/powerpc-fsl-linux/4.6.2/

cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/powerpc-fsl-linux/4.6.2/crtbegin.o

cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/powerpc-fsl-linux/4.6.2/crtend.o
Modified:
cfe/trunk/lib/Driver/ToolChains.cpp
cfe/trunk/test/Driver/linux-ld.c

Modified: cfe/trunk/lib/Driver/ToolChains.cpp
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains.cpp?rev=164177&r1=164176&r2=164177&view=diff

==
--- cfe/trunk/lib/Driver/ToolChains.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains.cpp Tue Sep 18 17:25:07 2012
@@ -1291,6 +1291,10 @@
 "/gcc/" + CandidateTriple.str(),
 "/" + CandidateTriple.str() + "/gcc/" + CandidateTriple.str(),

+// The Freescale PPC SDK has the gcc libraries in
+// /usr/lib/<triple>/x.y.z so have a look there as well.
+"/" + CandidateTriple.str(),


So, this is really bad it turns out.

We use this directory to walk every installed GCC version. But because 
this is just a normal lib directory on many systems (every Debian and 
Ubuntu system for example) this goes very badly. It goes even more 
badly because of the (questionable) design of LLVM's directory iterator:


It ends up stat'ing *every single file* in /usr/lib/  =[ 
For the current Ubuntu LTS for example, this causes roughly 3900 
spurious stat syscalls for every invocation of the Clang driver.


Can we do something different here?


Wow. Hrmm, okay. Why are we stating every file? In any case, are we just 
searching for a directory with the right triple? Or are we searching for 
the version-number directory and doing that by looking at every entry?


 -Hal


+
 // Ubuntu has a strange mis-matched pair of triples that this
happens to
 // match.
 // FIXME: It may be worthwhile to generalize this and look
for a second
@@ -1300,6 +1304,7 @@
   const std::string InstallSuffixes[] = {
 "/../../..",
 "/../../../..",
+"/../..",
 "/../../../.."
   };
   // Only look at the final, weird Ubuntu suffix for i386-linux-gnu.
@@ -2374,6 +2379,9 @@
 InstallDir.str() + "/include/g++-v4",
 // Android standalone toolchain has C++ headers in yet
another place.
 LibDir.str() + "/../" + TripleStr.str() + "/include/c++/" +
Version.str(),
+// Freescale SDK C++ headers are directly in
/usr/include/c++,
+// without a subdirectory corresponding to the gcc version.
+LibDir.str() + "/../include/c++",
   };

   for (unsigned i = 0; i <
llvm::array_lengthof(IncludePathCandidates); ++i) {

Added: cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/lib/.keep
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/lib/.keep?rev=164177&view=auto

==
(empty)

Added: cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crt1.o
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crt1.o?rev=164177&view=auto

==
(empty)

Added: cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crti.o
URL:

http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crti.o?rev=164177&view=auto

==
(empty)

Added: cfe/trunk/test/Driver/Inputs/freescale_ppc_tree/usr/lib/crtn.o
URL:

http://llvm.org/v

Re: [cfe-commits] r164177 - in /cfe/trunk: lib/Driver/ test/Driver/ test/Driver/Inputs/freescale_ppc_tree/ test/Driver/Inputs/freescale_ppc_tree/lib/ test/Driver/Inputs/freescale_ppc_tree/usr/ test/Dr

2017-02-09 Thread Hal Finkel via cfe-commits


On 02/09/2017 04:58 PM, Chandler Carruth wrote:
On Thu, Feb 9, 2017 at 2:46 PM Tobias von Koch <tobias.von.k...@gmail.com> wrote:


On Wed, Feb 8, 2017 at 7:21 PM, Chandler Carruth <chandl...@gmail.com> wrote:


+// The Freescale PPC SDK has the gcc libraries in
+// /usr/lib/<triple>/x.y.z so have a look
there as well.
+"/" + CandidateTriple.str(),


So, this is really bad it turns out.

We use this directory to walk every installed GCC version. But
because this is just a normal lib directory on many systems
(every Debian and Ubuntu system for example) this goes very
badly. It goes even more badly because of the (questionable)
design of LLVM's directory iterator:


Wow, this is pretty bad, but it really sounds like the iterator
should be fixed rather than trying to hack around it.


I mean, we should.

But even then, walking the entire directory seems bad too... See below.


Agreed. FWIW, it looks like LLVM's directory iterators stat lazily 
(although doing an equality comparison will cause them to stat). Is 
going through Clang's VFS layer causing the eager stating somehow?



Doesn't this happen for the other directories as well (which,
admittedly, will have fewer entries)?


The *only* entries in the other directories are the actual installed 
GCC toolchains though, so walking them makes a lot of sense. The 
tricky thing is that this isn't a gcc-specific directory.


I suspect the fix should be to not use this base path as part of the 
walk to discover GCC toolchains, and instead to hard code the specific 
toolchain patterns on this specific platform.


Or we could do the walk, but only when actually on the NXP/Freescale 
Power platform where this is necessary to find GCC installations.


Given that we don't have a platform on which to test right now, I think 
that this second option sounds best. Only add those directories to the 
search path when -fsl- is in the triple (or something like that).
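
A minimal sketch of that second option, assuming a simple substring test on the triple is good enough. `gccLibSuffixes` is a hypothetical name and this is not the actual driver code; the three suffix patterns are the ones quoted from the ToolChains.cpp hunk.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: candidate library suffixes to scan for GCC
// installations under a lib directory, for a given target triple.
std::vector<std::string> gccLibSuffixes(const std::string &triple) {
  std::vector<std::string> suffixes = {
      "/gcc/" + triple,
      "/" + triple + "/gcc/" + triple,
  };
  // Only Freescale SDK layouts keep the GCC libraries directly under
  // /usr/lib/<triple>, so add that broad directory only for -fsl- triples.
  if (triple.find("-fsl-") != std::string::npos)
    suffixes.push_back("/" + triple);
  return suffixes;
}
```

With this shape, a triple such as x86_64-linux-gnu never causes the plain multiarch lib directory to be walked, while powerpc-fsl-linux keeps the extra candidate.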


 -Hal



Both of those would seem reasonable. Fixing the directory iterator 
would be icing on the cake IMO. =D


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [PATCH v3] [PPC64]: Add support for Swift calling convention

2017-07-22 Thread Hal Finkel via cfe-commits


On 07/19/2017 10:26 AM, Adrian Prantl wrote:

On Jun 21, 2017, at 11:32 PM, Andrew Jeffery  wrote:

For the tests I've extracted the int5 and int8 cases to cater for
different alignments for different platform ABIs. For Linux on POWER the
5 and 8 element vectors must be naturally aligned with respect to the
total "soft" vector size, despite being represented as an aggregate.
Specifically, the patch caters for the following differences in
supporting powerpc64le-unknown-linux:

   $ diff -u test/CodeGen/64bit-swiftcall.c test/CodeGen/ppc64-swiftcall.c
   --- test/CodeGen/64bit-swiftcall.c   2017-04-20 17:14:59.797963820 +0930
   +++ test/CodeGen/ppc64-swiftcall.c   2017-04-20 17:15:11.621965118 +0930
   @@ -1,7 +1,6 @@
   -// RUN: %clang_cc1 -triple x86_64-apple-darwin10 -target-cpu core2 
-emit-llvm -o - %s | FileCheck %s
   -// RUN: %clang_cc1 -triple arm64-apple-ios9 -target-cpu cyclone -emit-llvm 
-o - %s | FileCheck %s
   +// RUN: %clang_cc1 -triple powerpc64le-unknown-linux -emit-llvm -o - %s | 
FileCheck %s

   -// REQUIRES: aarch64-registered-target,x86-registered-target
   +// REQUIRES: powerpc-registered-target

#define SWIFTCALL __attribute__((swiftcall))
#define OUT __attribute__((swift_indirect_result))
   @@ -370,8 +369,8 @@

TEST(int8)
// CHECK-LABEL: define {{.*}} @return_int8()
   -// CHECK:   [[RET:%.*]] = alloca [[REC:<8 x i32>]], align 16
   +// CHECK:   [[RET:%.*]] = alloca [[REC:<8 x i32>]], align 32
// CHECK:   [[VAR:%.*]] = alloca [[REC]], align
// CHECK:   store
// CHECK:   load
// CHECK:   store
   @@ -414,8 +413,8 @@

TEST(int5)
// CHECK-LABEL: define {{.*}} @return_int5()
   -// CHECK:   [[RET:%.*]] = alloca [[REC:<5 x i32>]], align 16
   +// CHECK:   [[RET:%.*]] = alloca [[REC:<5 x i32>]], align 32
// CHECK:   [[VAR:%.*]] = alloca [[REC]], align
// CHECK:   store
// CHECK:   load
// CHECK:   store

Despite some duplication, the advantage of this approach over using
pattern matching for alignment in 64bit-swiftcall.c is that we ensure
each platform is using the expected alignment but without duplicating
the entirety of 64bit-swiftcall.c.

You could also write it all in one file and invoke FileCheck with 
--check-prefix=CHECK-PPC64 to have a second set of CHECK lines in the same 
input file.

-- adrian

Signed-off-by: Andrew Jeffery 
---

Hello,

The only change in v3 is rebasing it on top of upstream HEAD, fixing a conflict in
one of the lit REQUIRES lines.

Ulrich, Hal, Bill: I've Cc'ed you as you were fingered by the blame output. As
some background I sent the patch several months ago but it hasn't got much
traction aside from a LGTM from Adrian (thanks!). I'm hoping it gets a bit more
attention as without it we get build failures for Swift on POWER, which is
in-turn blocking some CI efforts.

Cheers,

Andrew

lib/Basic/Targets.cpp |  11 ++
lib/CodeGen/TargetInfo.cpp|  14 ++-
test/CodeGen/64bit-swiftcall-extvec-agg-align16.c | 117 ++
test/CodeGen/64bit-swiftcall-extvec-agg-align32.c | 116 +
test/CodeGen/64bit-swiftcall.c|  93 +
5 files changed, 258 insertions(+), 93 deletions(-)
create mode 100644 test/CodeGen/64bit-swiftcall-extvec-agg-align16.c
create mode 100644 test/CodeGen/64bit-swiftcall-extvec-agg-align32.c

diff --git a/lib/Basic/Targets.cpp b/lib/Basic/Targets.cpp
index e23a93e..54b5911 100644
--- a/lib/Basic/Targets.cpp
+++ b/lib/Basic/Targets.cpp
@@ -1753,6 +1753,17 @@ public:
 }
 return false;
   }
+
+  CallingConvCheckResult checkCallingConvention(CallingConv CC) const override 
{
+switch (CC) {
+case CC_C:
+case CC_Swift:
+return CCCR_OK;
+default:
+break;
+}
+return CCCR_Warning;
+  }
};

class DarwinPPC32TargetInfo : public DarwinTargetInfo {
diff --git a/lib/CodeGen/TargetInfo.cpp b/lib/CodeGen/TargetInfo.cpp
index 8d00e05..a82cd24 100644
--- a/lib/CodeGen/TargetInfo.cpp
+++ b/lib/CodeGen/TargetInfo.cpp
@@ -4179,7 +4179,7 @@ 
PPC32TargetCodeGenInfo::initDwarfEHRegSizeTable(CodeGen::CodeGenFunction &CGF,

namespace {
/// PPC64_SVR4_ABIInfo - The 64-bit PowerPC ELF (SVR4) ABI information.
-class PPC64_SVR4_ABIInfo : public ABIInfo {
+class PPC64_SVR4_ABIInfo : public SwiftABIInfo {
public:
   enum ABIKind {
 ELFv1 = 0,
@@ -4223,7 +4223,7 @@ private:
public:
   PPC64_SVR4_ABIInfo(CodeGen::CodeGenTypes &CGT, ABIKind Kind, bool HasQPX,
  bool SoftFloatABI)
-  : ABIInfo(CGT), Kind(Kind), HasQPX(HasQPX),
+  : SwiftABIInfo(CGT), Kind(Kind), HasQPX(HasQPX),
 IsSoftFloatABI(SoftFloatABI) {}

   bool isPromotableTypeForABI(QualType Ty) const;
@@ -4266,6 +4266,16 @@ public:

   Address EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
 QualType Ty) const override;
+
+  bool shouldPassIndirectlyForSwift(CharUnits totalSize,
+ArrayRef scalars,
+   

Re: [PATCH] Warning for main returning a bool.

2016-11-15 Thread Hal Finkel via cfe-commits
- Original Message -
> From: "Aaron Ballman via cfe-commits" 
> To: "Joshua Hurwitz" 
> Cc: "cfe-commits" 
> Sent: Tuesday, November 15, 2016 12:17:28 PM
> Subject: Re: [PATCH] Warning for main returning a bool.
> 
> On Fri, Oct 14, 2016 at 1:17 PM, Joshua Hurwitz via cfe-commits
>  wrote:
> > See attached.
> >
> > Returning a bool from main is a special case of return type
> > mismatch. The
> > common convention when returning a bool is that 'true' (== 1)
> > indicates
> > success and 'false' (== 0) failure. But since main expects a return
> > value of
> > 0 on success, returning a bool is usually unintended.
> 
> I am not convinced that this is a high-value diagnostic. Returning a
> Boolean from main() may or may not be a bug (the returned value is
> generally a convention more than anything else). Also, why Boolean
> and
> not, say, long long or float?

I've seen this error often enough, but I think we need to be careful about 
false positives here. I recommend that we check only for explicit uses of 
boolean immediates (i.e. return true; or return false;) -- these are often bugs.

Aaron, I disagree that the return value is just some kind of convention. It has 
a clear meaning. Furthermore, the behavior of the system can be quite different 
for a non-zero exit code than otherwise, and users who don't understand what's 
going on can find it very difficult to understand what's going wrong.
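
To make the failure mode concrete, here is a small sketch of what happens when a bool is returned where an exit status is expected. The function names are hypothetical stand-ins for the body of main; the only language rule relied on is that true converts to the integer 1.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the program's real work; true means success.
bool doWork() { return true; }

// What `int main() { return doWork(); }` hands back to the system:
// the bool converts to int, so success becomes a non-zero exit status.
int buggyExitStatus() { return doWork(); }

// The intended mapping from a success flag to an exit status.
int fixedExitStatus() { return doWork() ? EXIT_SUCCESS : EXIT_FAILURE; }
```

A shell or build system that checks the exit status would treat the first variant as a failed run even though the work succeeded, which is the confusing behavior described above.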

Thanks again,
Hal

> 
> ~Aaron
> 
> >
> >
> 

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

