from:"Finkel, Hal J. via cfe\-commits"

Re: Default -fomit-frame-pointer for PowerPC

2019-04-02 Thread Finkel, Hal J. via cfe-commits

On 4/2/19 12:31 PM, Roman Lebedev via cfe-commits wrote:
> Best to submit patches to phabricator https://reviews.llvm.org/


Please follow the instructions here:
http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
- This will make it much easier for us to track this.

Thanks!

 -Hal

>
> On Tue, Apr 2, 2019 at 8:30 PM George Koehler via cfe-commits
>  wrote:
>> Hello cfe-commits,
>>
>> The attached patch is for clang to use -fomit-frame-pointer by default
>> for all PowerPC targets when optimizing code.  Right now, clang uses
>> -fomit-frame-pointer for PowerPC Linux and NetBSD but not for other
>> targets.  I have been running `clang -target powerpc-openbsd`.
>>
>> The patch is for llvm-project.git master.  I previously posted this
>> patch to https://bugs.llvm.org/show_bug.cgi?id=41094 , but the patch
>> in this email is for a newer revision of master.
>>
>> In most functions, the frame pointer in r31 is an unnecessary extra
>> copy of the stack pointer in r1.  GCC is using -fomit-frame-pointer by
>> default (in my PowerPC machine running OpenBSD/macppc); I want Clang
>> to be at least as good as GCC.  Also, this patch helps me to compare
>> the output of `clang -target powerpc-openbsd -O2 -S` with the output
>> for Linux or NetBSD.  In bug 41094, I showed how -fomit-frame-pointer
>> simplifies the C function `void nothing(void) {}`.
>>
>> This is my first mail to cfe-commits; I'm using the instructions at
>> http://llvm.org/docs/DeveloperPolicy.html#making-and-submitting-a-patch
>>
>> --
>> George Koehler 
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Finkel, Hal J. via cfe-commits

Thanks, Justin. It sees like we have the standard set of options: We can 
disallow the mismatch. We can allow it with a warning. We can allow it without 
a warning. We can say that if the mismatch contributes to the type of a kernel 
function, that's illformed (NDR).

In any case, it seems like your examples argue for disallowing a return-type 
mismatch between host and device overloads, not disallowing observing the type? 
Or maybe disallowing observing the type only when there's a mismatch?

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

From: Justin Lebar 
Sent: Thursday, May 2, 2019 9:16 PM
To: reviews+d61458+public+f6ea501465ad5...@reviews.llvm.org
Cc: michael.hl...@gmail.com; Artem Belevich; John McCall; Liu, Yaxun (Sam); 
Finkel, Hal J.; Richard Smith; Clang Commits; mlek...@skidmore.edu; 
blitzrak...@gmail.com; Han Shen
Subject: Re: [PATCH] D61458: [hip] Relax CUDA call restriction within 
`decltype` context.

> So, actually, I wonder if that's not the right answer. We generally allow 
> different overloads to have different return types. What if, for example, the 
> return type on the host is __float128 and on the device it's `MyLongFloatTy`?

The problem is that conceptually compiling for host/device does not create a 
new set of overloads.

When we compile for (say) host, we build a full AST for all functions, 
including device functions, and that AST must pass sema checks.  This is 
significant for example because when compiling for device we need to know which 
kernel templates were instantiated on the host side, so we know which kernels 
to emit.

Here's a contrived example.

```
 __host__ int8 bar();
__device__ int16 bar();
__host__ __device__ auto foo() -> decltype(bar()) {}

template  __global__ kernel();

void launch_kernel() {
  kernel<<<...>>>();
}
```

This template instantiation had better be the same when compiling for host and 
device.

That's contrived, but consider this much simpler case:

```
void host_fn() {
  static_assert(sizeof(decltype(foo())) == sizeof(int8));
}
```

If we let foo return int16 in device mode, this static_assert will fail when 
compiling in *device* mode even though host_fn is never called on the device.  
https://gcc.godbolt.org/z/gYq901

Why are we doing sema checks on the host code when compiling for device?  See 
contrived example above, we need quite a bit of info about the host code to 
infer those templates.

On Thu, May 2, 2019 at 7:05 PM Hal Finkel via Phabricator 
mailto:revi...@reviews.llvm.org>> wrote:
hfinkel added a comment.

In D61458#1488970 , @jlebar wrote:

> Here's one for you:
>
>   __host__ float bar();
>   __device__ int bar();
>   __host__ __device__ auto foo() -> decltype(bar()) {}
>
>
> What is the return type of `foo`?  :)
>
> I don't believe the right answer is, "float when compiling for host, int when 
> compiling for device."

So, actually, I wonder if that's not the right answer. We generally allow 
different overloads to have different return types. What if, for example, the 
return type on the host is __float128 and on the device it's `MyLongFloatTy`?

> I'd be happy if we said this was an error, so long as it's well-defined what 
> exactly we're disallowing.  But I bet @rsmith can come up with substantially 
> more evil testcases than this.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61458/new/

https://reviews.llvm.org/D61458

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: r374288 - Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"

2019-10-21 Thread Finkel, Hal J. via cfe-commits

On 10/21/19 4:00 PM, Jordan Rupprecht via cfe-commits wrote:
There's also a curious failure caused by this patch (confirmed passing 
@r374287, failing @r374288):

$ cat /tmp/vectorize.cc
void a() {
#pragma clang loop vectorize(disable)
  for (;;)
;
}

$ clang++ -Werror -O3 -c /tmp/vectorize.cc
/tmp/vectorize.cc:1:6: error: loop not interleaved: the optimizer was unable to 
perform the requested transformation; the transformation might be disabled or 
specified as part of an unsupported transformation ordering 
[-Werror,-Wpass-failed=transform-warning]
void a() {

I don't understand this warning -- there is no requested transformation; in 
fact, we've explicitly specified that vectorization *should* be disabled.


Oh. That's funny.

So the problem is that the warning is just responding to metadata that the 
vectorizer didn't remove, it doesn't understand anything about what the 
metadata is requesting. WarnMissedTransforms.cpp should probably be updated to 
avoid this issue.


On Mon, Oct 21, 2019 at 9:18 AM Hans Wennborg via cfe-commits 
mailto:cfe-commits@lists.llvm.org>> wrote:
As expected, this broke the Chromium build again (but it seems only at
-Oz this time).


To be clear, is the desired behavior that the vectorization is performed even 
at -Oz? I suspect that it is, but I just want to confirm.

 -Hal


I'm still not a big fan of the warning: the #pragma tells the compiler
to vectorize, and then vectorization doesn't happen -- that sounds
like a compiler bug to me, and instead of pushing the problem on the
developer, I wish that could have been fixed before this landed.

We'll probably fix our build by removing the use of the pragma, but
for others who will be broken by this, I think it would be good to
have a release note about the change in behaviour (even though it was
always the intended behaviour), and especially how developers are
supposed to deal with it.

Thanks,
hans

On Thu, Oct 10, 2019 at 1:24 AM Sjoerd Meijer via cfe-commits
mailto:cfe-commits@lists.llvm.org>> wrote:
>
> Author: sjoerdmeijer
> Date: Thu Oct 10 01:27:14 2019
> New Revision: 374288
>
> URL: http://llvm.org/viewvc/llvm-project?rev=374288&view=rev
> Log:
> Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
>
> This was further discussed at the llvm dev list:
>
> http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html
>
> I think the brief summary of that is that this change is an improvement,
> this is the behaviour that we expect and promise in ours docs, and also
> as a result there are cases where we now emit diagnostics whereas before
> pragmas were silently ignored. Two areas where we can improve: 1) the
> diagnostic message itself, and 2) and in some cases (e.g. -Os and -Oz)
> the vectoriser is (quite understandably) not triggering.
>
> Original commit message:
>
> Specifying the vectorization width was supposed to implicitly enable
> vectorization, except that it wasn't really doing this. It was only
> setting the vectorize.width metadata, but not vectorize.enable.
>
> This should fix PR27643.
>
> Modified:
> cfe/trunk/lib/CodeGen/CGLoopInfo.cpp
> cfe/trunk/test/CodeGenCXX/pragma-loop-predicate.cpp
> cfe/trunk/test/CodeGenCXX/pragma-loop.cpp
>
> Modified: cfe/trunk/lib/CodeGen/CGLoopInfo.cpp
> URL: 
> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGLoopInfo.cpp?rev=374288&r1=374287&r2=374288&view=diff
> ==
> --- cfe/trunk/lib/CodeGen/CGLoopInfo.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGLoopInfo.cpp Thu Oct 10 01:27:14 2019
> @@ -270,6 +270,14 @@ LoopInfo::createLoopVectorizeMetadata(co
>
>// Setting vectorize.width
>if (Attrs.VectorizeWidth > 0) {
> +// This implies vectorize.enable = true, but only add it when it is not
> +// already enabled.
> +if (Attrs.VectorizeEnable != LoopAttributes::Enable)
> +  Args.push_back(
> +  MDNode::get(Ctx, {MDString::get(Ctx, "llvm.loop.vectorize.enable"),
> +ConstantAsMetadata::get(ConstantInt::get(
> +llvm::Type::getInt1Ty(Ctx), 1))}));
> +
>  Metadata *Vals[] = {
>  MDString::get(Ctx, "llvm.loop.vectorize.width"),
>  ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt32Ty(Ctx),
>
> Modified: cfe/trunk/test/CodeGenCXX/pragma-loop-predicate.cpp
> URL: 
> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/pragma-loop-predicate.cpp?rev=374288&r1=374287&r2=374288&view=diff
> ==
> --- cfe/trunk/test/CodeGenCXX/pragma-loop-predicate.cpp (original)
> +++ cfe/trunk/test/CodeGenCXX/pragma-loop-predicate.cpp Thu Oct 10 01:27:14 
> 2019
> @@ -58,7 +58,6 @@ void test5(int *List, int Length) {
>  List[i] = i * 2;
>  }
>
> -
>  // CHECK:  ![[LOOP0]] = distinct !{![[LOOP0]], !3}
>  // CHECK-NEXT: !3 = !{!"llvm.loop.vectorize.enable", i1 true}
>
> @@ -

Re: Default -fomit-frame-pointer for PowerPC

Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

Re: r374288 - Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"

3 matches

Site Navigation

Mail list logo

Footer information