[clang] [NVPTX] Add a clang builtin for the `warpsize` intrinsic (PR #110316)

2024-09-27 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. sgtm https://github.com/llvm/llvm-project/pull/110316 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [NVPTX] Remove nvvm.bitcast.* intrinsics (PR #107936)

2024-09-23 Thread Justin Lebar via cfe-commits
@@ -599,14 +599,6 @@ TARGET_BUILTIN(__nvvm_e4m3x2_to_f16x2_rn_relu, "V2hs", "", AND(SM_89,PTX81)) TARGET_BUILTIN(__nvvm_e5m2x2_to_f16x2_rn, "V2hs", "", AND(SM_89,PTX81)) TARGET_BUILTIN(__nvvm_e5m2x2_to_f16x2_rn_relu, "V2hs", "", AND(SM_89,PTX81)) -// Bitcast

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Justin Lebar via cfe-commits
jlebar wrote: @Artem-B https://github.com/llvm/llvm-project/pull/96561

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Justin Lebar via cfe-commits
jlebar wrote: @Artem-B https://github.com/llvm/llvm-project/pull/96015

[clang] [llvm] [NVPTX] Support inline asm with 128-bit operand in NVPTX backend (PR #97113)

2024-06-28 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/97113

[clang] [llvm] [NVPTX] Support inline asm with 128-bit operand in NVPTX backend (PR #97113)

2024-06-28 Thread Justin Lebar via cfe-commits
jlebar wrote: > Which file should I modify? Use `git grep` to find where the text from that section of the langref lives? https://github.com/llvm/llvm-project/pull/97113

[clang] [llvm] [NVPTX] Support inline asm with 128-bit operand in NVPTX backend (PR #97113)

2024-06-28 Thread Justin Lebar via cfe-commits
https://github.com/jlebar commented: LGTM other than the previous comment. https://github.com/llvm/llvm-project/pull/97113

[clang] [llvm] [NVPTX] Support inline asm with 128-bit operand in NVPTX backend (PR #97113)

2024-06-28 Thread Justin Lebar via cfe-commits
https://github.com/jlebar requested changes to this pull request. This needs to be documented in the langref in this section, right? https://llvm.org/docs/LangRef.html#supported-constraint-code-list https://github.com/llvm/llvm-project/pull/97113

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-06-24 Thread Justin Lebar via cfe-commits
jlebar wrote: @Artem-B asked me to review nvptx patches while he's OOO, but this one is pretty far outside my depth. Are you OK waiting until he's back? I don't know exactly when that will be, but based on his IMs to me, he should be back early July. https://github.com/llvm/llvm-project/pul

[clang] [clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (PR #68515)

2024-02-03 Thread Justin Lebar via cfe-commits
jlebar wrote: It looks reasonable to me, although I'm not really an AMDGPU person. /me summons @arsenm ? https://github.com/llvm/llvm-project/pull/68515

[clang] [CUDA] Change '__activemask' to use '__nvvm_activemask()' (PR #79892)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79892

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
jlebar wrote: > I was planning on updating this to use the new intrinsic for the newer > version. Alternatively we could make __activemask the builtin which expands > to both versions, but I'm somewhat averse since we should target the > instruction directly I feel. Yes, I agree that the bui

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79873

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
jlebar wrote: Unlike the other PRs, this one has a CUDA function, `__activemask()`. Presumably we should make that one work by hacking our headers? https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add builtin for 'exit' handling (PR #79777)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79777

[llvm] [clang] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79765

[llvm] [clang] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar edited https://github.com/llvm/llvm-project/pull/79765

[flang] [libc] [compiler-rt] [clang] [clang-tools-extra] [llvm] [libcxx] [lld] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Justin Lebar via cfe-commits
jlebar wrote: Got it, okay, thanks. Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me. Thanks for putting up with our scrutiny. :) https://github.com/llvm/llvm-project/pull/79373

[lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Justin Lebar via cfe-commits
jlebar wrote: I...think I understand. Is the output of this compilation step a cubin, then? https://github.com/llvm/llvm-project/pull/79373

[lld] [lldb] [clang-tools-extra] [clang] [libcxx] [libc] [flang] [llvm] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Justin Lebar via cfe-commits
jlebar wrote: > This method of compilation is not like CUDA, so we can't target all the GPUs > at the same time. Can you clarify for me -- what are you compiling where it's impossible to target multiple GPUs in the binary? I'm confused because Art is understanding that it's not CUDA, but we

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Justin Lebar via cfe-commits
jlebar wrote: I think I'm with Art on this one. >> Problem #2 [...] The arch=native will create a working configuration, but >> would build more than necessary. > > It will target the first GPU it finds. We could maybe change the behavior to > detect the newest, but the idea is just to target

[clang] c2f501f - [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-24 Thread Justin Lebar via cfe-commits
Author: Shangwu Yao Date: 2022-02-24T20:51:43-08:00 New Revision: c2f501f39589a59db9cebc839d0a63dcdc3c5c81 URL: https://github.com/llvm/llvm-project/commit/c2f501f39589a59db9cebc839d0a63dcdc3c5c81 DIFF: https://github.com/llvm/llvm-project/commit/c2f501f39589a59db9cebc839d0a63dcdc3c5c81.diff L

[clang] 9de4fc0 - [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-17 Thread Justin Lebar via cfe-commits
Author: Shangwu Yao Date: 2022-02-17T09:38:06-08:00 New Revision: 9de4fc0f2d3b60542956f7e5254951d049edeb1f URL: https://github.com/llvm/llvm-project/commit/9de4fc0f2d3b60542956f7e5254951d049edeb1f DIFF: https://github.com/llvm/llvm-project/commit/9de4fc0f2d3b60542956f7e5254951d049edeb1f.diff L

[clang] 6eb8265 - [Driver] Add CUDA support for --offload param

2022-01-28 Thread Justin Lebar via cfe-commits
Author: Daniele Castagna Date: 2022-01-28T14:50:39-08:00 New Revision: 6eb826567af03c2b43cda78836b1065e12df84e4 URL: https://github.com/llvm/llvm-project/commit/6eb826567af03c2b43cda78836b1065e12df84e4 DIFF: https://github.com/llvm/llvm-project/commit/6eb826567af03c2b43cda78836b1065e12df84e4.di

[clang] 7dd6068 - [clang-rename] Handle designated initializers.

2021-04-12 Thread Justin Lebar via cfe-commits
Author: Daniele Castagna Date: 2021-04-12T13:15:14-07:00 New Revision: 7dd60688992526bb7ee0c7846e9abd591fc3e297 URL: https://github.com/llvm/llvm-project/commit/7dd60688992526bb7ee0c7846e9abd591fc3e297 DIFF: https://github.com/llvm/llvm-project/commit/7dd60688992526bb7ee0c7846e9abd591fc3e297.di

Re: [PATCH] D100310: Add field designated initializers logic in Tooling/Rename

2021-04-12 Thread Justin Lebar via cfe-commits
I guess you need me or Michael to push this. Happy to do so once you're happy with it. On Mon, Apr 12, 2021 at 11:33 AM Daniele Castagna via Phabricator < revi...@reviews.llvm.org> wrote: > dcastagna updated this revision to Diff 336912. > dcastagna added a comment. > > clang-format again > > >

[clang] e890fff - Fix signed-compare warning.

2021-02-25 Thread Justin Lebar via cfe-commits
Author: Justin Lebar Date: 2021-02-25T18:14:40-08:00 New Revision: e890fffcab8b7e95deba4269c14db9fab003a737 URL: https://github.com/llvm/llvm-project/commit/e890fffcab8b7e95deba4269c14db9fab003a737 DIFF: https://github.com/llvm/llvm-project/commit/e890fffcab8b7e95deba4269c14db9fab003a737.diff

[clang] c90dac2 - [clang] Print 32 candidates on the first failure, with -fshow-overloads=best.

2021-02-25 Thread Justin Lebar via cfe-commits
Author: Justin Lebar Date: 2021-02-25T17:45:19-08:00 New Revision: c90dac27e94ec354a3e8919556ac5bc89b62c731 URL: https://github.com/llvm/llvm-project/commit/c90dac27e94ec354a3e8919556ac5bc89b62c731 DIFF: https://github.com/llvm/llvm-project/commit/c90dac27e94ec354a3e8919556ac5bc89b62c731.diff

Re: [PATCH] D75811: [CUDA] Choose default architecture based on CUDA installation

2020-03-09 Thread Justin Lebar via cfe-commits
From the peanut gallery: Perhaps something like --cuda_arch=min_supported would solve your problem while still meeting tra's request not to change behavior of the compiler based on something external. On Mon, Mar 9, 2020 at 12:58 PM Raul Tambre via Phabricator wrote: > > tambre added a comment. >

[clang] ac66c61 - Use C++14-style return type deduction in clang.

2020-02-11 Thread Justin Lebar via cfe-commits
Author: Justin Lebar Date: 2020-02-11T14:41:22-08:00 New Revision: ac66c61bf9463bf419102ad8b6565dcbc495b0ab URL: https://github.com/llvm/llvm-project/commit/ac66c61bf9463bf419102ad8b6565dcbc495b0ab DIFF: https://github.com/llvm/llvm-project/commit/ac66c61bf9463bf419102ad8b6565dcbc495b0ab.diff

[clang] f0fd852 - Fix SFINAE in CFG.cpp.

2020-02-11 Thread Justin Lebar via cfe-commits
Author: Justin Lebar Date: 2020-02-11T10:37:08-08:00 New Revision: f0fd852fcd054297f2b07e2ca87551de9b2a39c0 URL: https://github.com/llvm/llvm-project/commit/f0fd852fcd054297f2b07e2ca87551de9b2a39c0 DIFF: https://github.com/llvm/llvm-project/commit/f0fd852fcd054297f2b07e2ca87551de9b2a39c0.diff

[clang] 027eb71 - Use std::foo_t rather than std::foo in clang.

2020-02-11 Thread Justin Lebar via cfe-commits
Author: Justin Lebar Date: 2020-02-11T10:37:08-08:00 New Revision: 027eb71696f6ce4fdeb63f68c8c6b66e147ad407 URL: https://github.com/llvm/llvm-project/commit/027eb71696f6ce4fdeb63f68c8c6b66e147ad407 DIFF: https://github.com/llvm/llvm-project/commit/027eb71696f6ce4fdeb63f68c8c6b66e147ad407.diff

Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits
> In any case, it seems like your examples argue for disallowing a return-type mismatch between host and device overloads, not disallowing observing the type? Oh no, we have to allow return-type mismatches between host and device overloads, that is a common thing in CUDA code I've seen. You can s

Re: [PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via cfe-commits
> So, actually, I wonder if that's not the right answer. We generally allow different overloads to have different return types. What if, for example, the return type on the host is __float128 and on the device it's `MyLongFloatTy`? The problem is that conceptually compiling for host/device does no

Re: [PATCH] D55456: [CUDA] added missing 'inline' for the functions defined in the header.

2018-12-07 Thread Justin Lebar via cfe-commits
Lgtm On Fri, Dec 7, 2018, 1:12 PM Artem Belevich via Phabricator < revi...@reviews.llvm.org> wrote: > tra created this revision. > tra added a reviewer: jlebar. > Herald added subscribers: bixia, sanjoy. > > https://reviews.llvm.org/D55456 > > Files: > clang/lib/Headers/cuda_wrappers/new > > >

r336026 - [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-29 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Jun 29 15:28:09 2018 New Revision: 336026 URL: http://llvm.org/viewvc/llvm-project?rev=336026&view=rev Log: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14. Summary: Tests in a separate change to the test-suite. Reviewers: rsmith, tra Subscribers: l
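
For context, here is a host-only sketch of the C++14 property this commit title refers to: `std::min` and `std::max` are `constexpr` from C++14 onward, so they can appear in constant expressions. This is an illustration, not code from the commit. The helper names `smaller` and `larger` are made up for the example, and the CUDA `__host__`/`__device__` qualifiers are left out so that it builds with any C++14 compiler.

```cpp
#include <algorithm>

// Hypothetical helpers for illustration only (not from the commit).
// In C++14 and later, std::min/std::max are constexpr, so these wrappers
// can be evaluated at compile time.
constexpr int smaller(int a, int b) { return std::min(a, b); }
constexpr int larger(int a, int b) { return std::max(a, b); }

// Compile-time checks: these fail to build in C++11 mode, where
// std::min/std::max are not constexpr.
static_assert(smaller(3, 7) == 3, "std::min usable in a constant expression");
static_assert(larger(3, 7) == 7, "std::max usable in a constant expression");
```

The same helpers can also be called at run time, e.g. `smaller(x, y)` with non-constant arguments.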

r336025 - [CUDA] Make min/max shims host+device.

2018-06-29 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Jun 29 15:27:56 2018 New Revision: 336025 URL: http://llvm.org/viewvc/llvm-project?rev=336025&view=rev Log: [CUDA] Make min/max shims host+device. Summary: Fixes PR37753: min/max can't be called from __host__ __device__ functions in C++14 mode. Testcase in a separate tes

r332621 - [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.

2018-05-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu May 17 09:15:07 2018 New Revision: 332621 URL: http://llvm.org/viewvc/llvm-project?rev=332621&view=rev Log: [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces. Summary: Previously this triggered a -Wundefined-internal warning. But it's not an undefined va

r332619 - [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

2018-05-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu May 17 09:12:42 2018 New Revision: 332619 URL: http://llvm.org/viewvc/llvm-project?rev=332619&view=rev Log: [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib. Reviewers: rsmith Subscribers: sanjoy, cfe-commits, tra Differential Revision: htt

r318494 - [CUDA] Remove implementations of nexttoward.

2017-11-16 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 16 17:15:43 2017 New Revision: 318494 URL: http://llvm.org/viewvc/llvm-project?rev=318494&view=rev Log: [CUDA] Remove implementations of nexttoward. Summary: __builtin_nexttoward lowers to a libcall, e.g. nexttowardf(), that CUDA does not have. Rather than try to imp

r317961 - [CUDA] Fix std::min on device side to return the min, not the max.

2017-11-10 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Nov 10 17:25:44 2017 New Revision: 317961 URL: http://llvm.org/viewvc/llvm-project?rev=317961&view=rev Log: [CUDA] Fix std::min on device side to return the min, not the max. Summary: How embarrassing. This is tested in the test-suite -- fix to come there in a separate p

r317623 - [NVPTX] Implement __nvvm_atom_add_gen_d builtin.

2017-11-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Nov 7 14:10:54 2017 New Revision: 317623 URL: http://llvm.org/viewvc/llvm-project?rev=317623&view=rev Log: [NVPTX] Implement __nvvm_atom_add_gen_d builtin. Summary: This just seems to have been an oversight. We already supported the f64 atomic add with an explicit scope

r317297 - [CUDA] Mark CUDA as a no-errno platform.

2017-11-02 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 2 19:30:00 2017 New Revision: 317297 URL: http://llvm.org/viewvc/llvm-project?rev=317297&view=rev Log: [CUDA] Mark CUDA as a no-errno platform. Summary: CUDA doesn't support errno at all, so this is the right thing -- or at least, in the right direction. But also, t

r316611 - [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9.

2017-10-25 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Oct 25 14:32:06 2017 New Revision: 316611 URL: http://llvm.org/viewvc/llvm-project?rev=316611&view=rev Log: [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9. Summary: CUDA 9's minimum sm is sm_30. Ideally we should also make sm_30 the default when compi

r314142 - Revert "[NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.", rL314135.

2017-09-25 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Mon Sep 25 12:41:56 2017 New Revision: 314142 URL: http://llvm.org/viewvc/llvm-project?rev=314142&view=rev Log: Revert "[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.", rL314135. Causing assertion failures on macos: > Assertion failed: (Num < NumOpe

r312736 - [CUDA] When compilation fails, print the compilation mode.

2017-09-07 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Sep 7 11:37:16 2017 New Revision: 312736 URL: http://llvm.org/viewvc/llvm-project?rev=312736&view=rev Log: [CUDA] When compilation fails, print the compilation mode. Summary: That is, instead of "1 error generated", we now say "1 error generated when compiling for sm_35"

r312681 - [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Sep 6 17:37:20 2017 New Revision: 312681 URL: http://llvm.org/viewvc/llvm-project?rev=312681&view=rev Log: [CUDA] Add device overloads for non-placement new/delete. Summary: Tests have to live in the test-suite, and so will come in a separate patch. Fixes PR34360. Revi

Re: r261774 - Bail on compilation as soon as a job fails.

2017-05-15 Thread Justin Lebar via cfe-commits
> >>> >>> We had a test, but this commit changed that as well (I suppose it could >>> have been better documented). >>> >>> How easily could this be restricted to only affect CUDA jobs? >> >> >> If this gets reverted, the clang-cl PCH code

[libcxx] r301132 - Add missing acquire_load to call_once overload.

2017-04-23 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Sun Apr 23 11:58:48 2017 New Revision: 301132 URL: http://llvm.org/viewvc/llvm-project?rev=301132&view=rev Log: Add missing acquire_load to call_once overload. Summary: Seemed to have been overlooked in D24028. This bug was found and brought to my attention by Paul Wankadia.
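
To illustrate why an acquire load matters on the fast path of a call-once primitive, here is a simplified sketch. It is not libc++'s implementation: `OnceFlagSketch` and `call_once_sketch` are hypothetical names, and the real `once_flag` has a different layout and a different slow path.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical, simplified once-flag; NOT libc++'s actual once_flag.
struct OnceFlagSketch {
  std::atomic<bool> done{false};
  std::mutex mu;
};

template <class F>
void call_once_sketch(OnceFlagSketch& flag, F&& fn) {
  // Fast path. The acquire load pairs with the release store below, so a
  // thread that observes done == true also observes everything fn() wrote.
  // With a weaker load here, that visibility would not be guaranteed.
  if (flag.done.load(std::memory_order_acquire)) return;

  // Slow path: serialize callers and re-check under the lock.
  std::lock_guard<std::mutex> lock(flag.mu);
  if (!flag.done.load(std::memory_order_relaxed)) {
    fn();
    flag.done.store(true, std::memory_order_release);
  }
}
```

Calling `call_once_sketch(flag, f)` repeatedly on the same flag runs `f` only once. If `f` throws, `done` stays false and a later call retries.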

r295609 - [CUDA] Don't pass -stack-protector to NVPTX compilations.

2017-02-19 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Sun Feb 19 13:05:32 2017 New Revision: 295609 URL: http://llvm.org/viewvc/llvm-project?rev=295609&view=rev Log: [CUDA] Don't pass -stack-protector to NVPTX compilations. We can't support stack-protector on NVPTX because NVPTX doesn't expose a stack to the compiler! Fixes PR3

r293097 - [CodeGen] [CUDA] Add the ability set default attrs on functions in linked modules.

2017-01-25 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Jan 25 15:29:48 2017 New Revision: 293097 URL: http://llvm.org/viewvc/llvm-project?rev=293097&view=rev Log: [CodeGen] [CUDA] Add the ability set default attrs on functions in linked modules. Summary: Now when you ask clang to link in a bitcode module, you can tell it to

r292694 - [NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code.

2017-01-20 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Jan 20 19:00:32 2017 New Revision: 292694 URL: http://llvm.org/viewvc/llvm-project?rev=292694&view=rev Log: [NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code. Summary: Specifically, we upgrade llvm.nvvm.: * brev{32,64} * clz.{i,ll} * popc.{i,ll}

Re: r291131 - [Driver] Driver changes to support CUDA compilation on Windows.

2017-01-06 Thread Justin Lebar via cfe-commits
e intended match here > clang.EXE: error: cannot find libdevice for sm_60. Provide path to different > CUDA installation via --cuda-path, or pass -nocudalib to build without > linking with libdevice. >^ > > On Thu, Jan 5, 2017 at 8:52 AM, Justin Lebar via cfe-commi

Re: [PATCH] D28320: [Driver] Driver changes to support CUDA compilation on Windows.

2017-01-06 Thread Justin Lebar via cfe-commits
That test should be updated to explicitly specify the triple, that should also fix the problem. I'll spin that change as soon as I can. I agree that using the triple to determine the expected directory layout is kind of bogus. I have no idea if cross-compiling CUDA is even going to work... Do w

r291138 - [CUDA] Rename keywords used in macro so they don't conflict with MSVC.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:54:11 2017 New Revision: 291138 URL: http://llvm.org/viewvc/llvm-project?rev=291138&view=rev Log: [CUDA] Rename keywords used in macro so they don't conflict with MSVC. Summary: MSVC seems to use "__in" and "__out" for its own purposes, so we have to pick differ

r291137 - [CUDA] Don't define functions that the CUDA headers themselves define on Windows.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:53:55 2017 New Revision: 291137 URL: http://llvm.org/viewvc/llvm-project?rev=291137&view=rev Log: [CUDA] Don't define functions that the CUDA headers themselves define on Windows. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.

r291136 - [CUDA] Let NVPTX inherit the host's calling conventions.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:53:38 2017 New Revision: 291136 URL: http://llvm.org/viewvc/llvm-project?rev=291136&view=rev Log: [CUDA] Let NVPTX inherit the host's calling conventions. Summary: When compiling device code, we may still see host code with explicit calling conventions. NVPTX n

r291135 - [CUDA] More correctly inherit primitive types from the host during device compilation.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:53:21 2017 New Revision: 291135 URL: http://llvm.org/viewvc/llvm-project?rev=291135&view=rev Log: [CUDA] More correctly inherit primitive types from the host during device compilation. Summary: CUDA lets users share structs between the host and device, so for t

r291134 - [CUDA] Add __declspec spellings for CUDA attributes.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:53:04 2017 New Revision: 291134 URL: http://llvm.org/viewvc/llvm-project?rev=291134&view=rev Log: [CUDA] Add __declspec spellings for CUDA attributes. Summary: CUDA attributes are spelled __declspec(__foo__) on Windows. Reviewers: tra Subscribers: cfe-commits,

r291133 - [ToolChains] Use "static" instead of an anonymous namespace for a function. NFC

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:52:47 2017 New Revision: 291133 URL: http://llvm.org/viewvc/llvm-project?rev=291133&view=rev Log: [ToolChains] Use "static" instead of an anonymous namespace for a function. NFC Modified: cfe/trunk/lib/Driver/MinGWToolChain.cpp Modified: cfe/trunk/lib/Driv

r291131 - [Driver] Driver changes to support CUDA compilation on Windows.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:52:29 2017 New Revision: 291131 URL: http://llvm.org/viewvc/llvm-project?rev=291131&view=rev Log: [Driver] Driver changes to support CUDA compilation on Windows. Summary: For the most part this is straightforward: Just add a CudaInstallation object to the MSVC a

r291130 - [CUDA] Make CUDAInstallationDetector take the host triple in its constructor.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:52:11 2017 New Revision: 291130 URL: http://llvm.org/viewvc/llvm-project?rev=291130&view=rev Log: [CUDA] Make CUDAInstallationDetector take the host triple in its constructor. Summary: Previously it was taking the true target triple, which is not really what it

r291129 - [TableGen] Only normalize the spelling of GNU-style attributes.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:51:54 2017 New Revision: 291129 URL: http://llvm.org/viewvc/llvm-project?rev=291129&view=rev Log: [TableGen] Only normalize the spelling of GNU-style attributes. Summary: When Sema looks up an attribute name, it strips off leading and trailing "__" if the attrib
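
As a rough illustration of the normalization the summary describes (stripping a leading and trailing "__" from an attribute name), here is a minimal sketch. `normalizeAttrNameSketch` is a hypothetical helper, not clang's actual function. The choice to leave a bare "____" untouched is an assumption made for the example, and the GNU-style-only restriction from the commit title is not modeled.

```cpp
#include <string>

// Hypothetical helper, not clang's real code. Strips one leading and one
// trailing "__" when both are present, e.g. "__host__" -> "host".
// Assumption for this sketch: names of length 4 or less are returned
// unchanged so the result is never empty.
std::string normalizeAttrNameSketch(const std::string& name) {
  const std::string underscores = "__";
  if (name.size() > 4 &&
      name.compare(0, 2, underscores) == 0 &&
      name.compare(name.size() - 2, 2, underscores) == 0) {
    return name.substr(2, name.size() - 4);
  }
  return name;
}
```

For example, `normalizeAttrNameSketch("__host__")` yields `"host"`, while `"__aligned"` and `"noreturn"` come back unchanged.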

r291128 - [Windows] Remove functions in intrin.h that are defined in Builtin.def.

2017-01-05 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Jan 5 10:51:37 2017 New Revision: 291128 URL: http://llvm.org/viewvc/llvm-project?rev=291128&view=rev Log: [Windows] Remove functions in intrin.h that are defined in Builtin.def. Summary: These duplicate declarations cause a problem for CUDA compiles on Windows. All imp

r290717 - [ADT] Delete RefCountedBaseVPTR.

2016-12-29 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Dec 29 13:59:26 2016 New Revision: 290717 URL: http://llvm.org/viewvc/llvm-project?rev=290717&view=rev Log: [ADT] Delete RefCountedBaseVPTR. Summary: This class is unnecessary. Its comment indicated that it was a compile error to allocate an instance of a class that inhe

r289847 - [CUDA] Add --ptxas-path= flag.

2016-12-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Dec 15 12:44:57 2016 New Revision: 289847 URL: http://llvm.org/viewvc/llvm-project?rev=289847&view=rev Log: [CUDA] Add --ptxas-path= flag. Summary: This lets you build with one CUDA installation but use ptxas from another install. This is useful e.g. if you want to avoid

[clang-tools-extra] r289637 - [clang-tidy] Suggest including <cmath> if necessary in type-promotion-in-math-fn-check.

2016-12-13 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Wed Dec 14 00:52:23 2016 New Revision: 289637 URL: http://llvm.org/viewvc/llvm-project?rev=289637&view=rev Log: [clang-tidy] Suggest including <cmath> if necessary in type-promotion-in-math-fn-check. Reviewers: alexfh Subscribers: JDevlieghere, cfe-commits Differential Revision:

[clang-tools-extra] r289627 - [ClangTidy] Add new performance-type-promotion-in-math-fn check.

2016-12-13 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Dec 13 21:15:01 2016 New Revision: 289627 URL: http://llvm.org/viewvc/llvm-project?rev=289627&view=rev Log: [ClangTidy] Add new performance-type-promotion-in-math-fn check. Summary: This checks for calls to double-precision math.h with single-precision arguments. For exa
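
The pattern this check looks for can be sketched as follows. The snippet is an illustration rather than the check's own test case, and `sin_promoted` and `sin_single` are made-up names. It spells out the float-to-double promotion that a double-precision math.h call implies, next to the `std::` overload that stays in single precision.

```cpp
#include <cmath>
#include <type_traits>

// std::sin has a float overload, so a float argument stays single precision.
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "float overload selected");
// A double argument selects the double overload.
static_assert(std::is_same<decltype(std::sin(1.0)), double>::value,
              "double overload selected");

// Hypothetical helper: the promotion written out explicitly. This is what
// happens implicitly when a float is passed to a double-only function.
inline float sin_promoted(float x) {
  return static_cast<float>(std::sin(static_cast<double>(x)));
}

// Hypothetical helper: stays in single precision throughout.
inline float sin_single(float x) { return std::sin(x); }
```

Both helpers return nearly the same value. The difference is that `sin_promoted` does its work in double precision and converts back, which is the kind of hidden cost a performance check like this is meant to surface.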

r287292 - [CUDA] Attempt to fix test failures in cuda-macos-includes.cu.

2016-11-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 17 19:11:32 2016 New Revision: 287292 URL: http://llvm.org/viewvc/llvm-project?rev=287292&view=rev Log: [CUDA] Attempt to fix test failures in cuda-macos-includes.cu. Run clang -cc1 -E instead of -S, in an attempt to make this test work cross-platform. Modified:

[PATCH] D26780: [CUDA] Wrapper header changes necessary to support MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL287288: [CUDA] Wrapper header changes necessary to support MacOS. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D26780?vs=78298&id=78438#toc Repository: rL LLVM https://rev

[PATCH] D26776: [CUDA] Initialize our header search using the host triple.

2016-11-17 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL287286: [CUDA] Initialize our header search using the host triple. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D26776?vs=78290&id=78436#toc Repository: rL LLVM https://re

[PATCH] D26777: [CUDA] Use the right section and constant names for fatbins when compiling for macos.

2016-11-17 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL287287: [CUDA] Use the right section and constant names for fatbins when compiling for… (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D26777?vs=78291&id=78437#toc Repository:

r287288 - [CUDA] Wrapper header changes necessary to support MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 17 18:41:35 2016 New Revision: 287288 URL: http://llvm.org/viewvc/llvm-project?rev=287288&view=rev Log: [CUDA] Wrapper header changes necessary to support MacOS. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D26780 Modified

r287285 - [CUDA] Driver changes to support CUDA compilation on MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 17 18:41:22 2016 New Revision: 287285 URL: http://llvm.org/viewvc/llvm-project?rev=287285&view=rev Log: [CUDA] Driver changes to support CUDA compilation on MacOS. Summary: Compiling CUDA device code requires us to know the host toolchain, because CUDA device-side com

r287287 - [CUDA] Use the right section and constant names for fatbins when compiling for macos.

2016-11-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 17 18:41:31 2016 New Revision: 287287 URL: http://llvm.org/viewvc/llvm-project?rev=287287&view=rev Log: [CUDA] Use the right section and constant names for fatbins when compiling for macos. Reviewers: tra Subscribers: cfe-commits Differential Revision: https://revi

[PATCH] D26774: [CUDA] Driver changes to support CUDA compilation on MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. jlebar marked 2 inline comments as done. Closed by commit rL287285: [CUDA] Driver changes to support CUDA compilation on MacOS. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D26774?vs=78286&id=78

r287286 - [CUDA] Initialize our header search using the host triple.

2016-11-17 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Thu Nov 17 18:41:27 2016 New Revision: 287286 URL: http://llvm.org/viewvc/llvm-project?rev=287286&view=rev Log: [CUDA] Initialize our header search using the host triple. Summary: This used to work because system headers are found in a (somewhat) predictable set of locations

[PATCH] D26774: [CUDA] Driver changes to support CUDA compilation on MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done. jlebar added inline comments. Comment at: clang/lib/Driver/Driver.cpp:3650-3654 + + // Intentionally omitted from the switch above: llvm::Triple::CUDA. CUDA + // compiles always need two toolchains, the CUDA toolchain and the host + //

[PATCH] D26774: [CUDA] Driver changes to support CUDA compilation on MacOS.

2016-11-17 Thread Justin Lebar via cfe-commits
jlebar added inline comments. Comment at: clang/lib/Driver/Driver.cpp:479 +// the device toolchain we create depends on both. +ToolChain *&CudaTC = ToolChains[CudaTriple.str() + "/" + HostTriple.str()]; +if (!CudaTC) { sfantao wrote: > I am not sure I

[PATCH] D26780: [CUDA] Wrapper header changes necessary to support MacOS.

2016-11-16 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. https://reviews.llvm.org/D26780 Files: clang/lib/Headers/__clang_cuda_cmath.h clang/lib/Headers/__clang_cuda_runtime_wrapper.h Index: clang/lib/Headers/__clang_cuda_runtime_wrapper.h

[PATCH] D26777: [CUDA] Use the right section and constant names for fatbins when compiling for macos.

2016-11-16 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. https://reviews.llvm.org/D26777 Files: clang/lib/CodeGen/CGCUDANV.cpp Index: clang/lib/CodeGen/CGCUDANV.cpp === --- clang/lib/Cod

[PATCH] D26776: [CUDA] Initialize our header search using the host triple.

2016-11-16 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added a subscriber: cfe-commits. This used to work because system headers are found in a (somewhat) predictable set of locations on Linux. But this is not the case on MacOS; without this change, we don't look in the right places f

[libcxx] r287041 - [libcxx] Mark xonstexpr-fns.pass.cpp as XFAIL: gcc.

2016-11-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Nov 15 16:03:29 2016 New Revision: 287041 URL: http://llvm.org/viewvc/llvm-project?rev=287041&view=rev Log: [libcxx] Mark xonstexpr-fns.pass.cpp as XFAIL: gcc. This fails with gcc because __builtin_isnan and friends, which libcpp_isnan and friends call, are not themselves

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-15 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL287012: [CUDA] Mark __libcpp_{isnan,isinf,isfinite} as constexpr. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D25403?vs=77271&id=78041#toc Repository: rL LLVM https://rev

[libcxx] r287012 - [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-15 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Nov 15 13:15:57 2016 New Revision: 287012 URL: http://llvm.org/viewvc/llvm-project?rev=287012&view=rev Log: [CUDA] Mark __libcpp_{isnan,isinf,isfinite} as constexpr. Summary: This makes these functions available on host and device, which is necessary to compile for the d

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-15 Thread Justin Lebar via cfe-commits
jlebar marked 6 inline comments as done. jlebar added a comment. Capturing an IRC conversation: > **EricWF** jlebar: Did you test this patch with older Clangs w/o constexpr > builtins? > **jlebar** EricWF, Do you mean, did I test the test, or did I test that the > non-test change does what I n

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-15 Thread Justin Lebar via cfe-commits
jlebar added a comment. Thanks for the review. Comment at: libcxx/test/libcxx/numerics/c.math/constexpr-fns.pass.cpp:17 +// true constexpr-ness. + +#include EricWF wrote: > Does GCC offer these as contexpr? If not this needs a `// XFAIL: gcc` Looks like the re

r286313 - [CUDA] Use only the GVALinkage on function definitions.

2016-11-08 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Tue Nov 8 17:45:51 2016 New Revision: 286313 URL: http://llvm.org/viewvc/llvm-project?rev=286313&view=rev Log: [CUDA] Use only the GVALinkage on function definitions. Summary: Previously we'd look at the GVALinkage of whatever FunctionDecl you happened to be calling. This i

[PATCH] D26268: [CUDA] Use only the GVALinkage on function definitions.

2016-11-08 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. jlebar marked 2 inline comments as done. Closed by commit rL286313: [CUDA] Use only the GVALinkage on function definitions. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D26268?vs=76808&id=77280#

[PATCH] D26268: [CUDA] Use only the GVALinkage on function definitions.

2016-11-08 Thread Justin Lebar via cfe-commits
jlebar marked 2 inline comments as done. jlebar added a comment. Thank you for the review! Submitting... Comment at: clang/test/SemaCUDA/add-inline-in-definition.cu:13-14 +// +// The trickiness here comes from the fact that the FunctionDecl bar() sees for +// foo() does not ha

[PATCH] D26268: [CUDA] Use only the GVALinkage on function definitions.

2016-11-08 Thread Justin Lebar via cfe-commits
jlebar added a comment. Friendly ping https://reviews.llvm.org/D26268

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-08 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 77271. jlebar added a comment. Use TEST_STD_VER macro. https://reviews.llvm.org/D25403 Files: libcxx/include/cmath libcxx/test/libcxx/numerics/c.math/constexpr-fns.pass.cpp Index: libcxx/test/libcxx/numerics/c.math/constexpr-fns.pass.cpp ==

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-07 Thread Justin Lebar via cfe-commits
jlebar updated this revision to Diff 77105. jlebar added a comment. Add a test. https://reviews.llvm.org/D25403 Files: libcxx/include/cmath libcxx/test/libcxx/numerics/c.math/constexpr-fns.pass.cpp Index: libcxx/test/libcxx/numerics/c.math/constexpr-fns.pass.cpp ==

[PATCH] D25403: [CUDA] Mark __libcpp_{isnan, isinf, isfinite} as constexpr.

2016-11-07 Thread Justin Lebar via cfe-commits
jlebar added a comment. Hal, whadya think? https://reviews.llvm.org/D25403

[PATCH] D26268: [CUDA] Use only the GVALinkage on function definitions.

2016-11-02 Thread Justin Lebar via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. jlebar added subscribers: rsmith, cfe-commits. Previously we'd look at the GVALinkage of whatever FunctionDecl you happened to be calling. This is not right. In the absence of the gnu_inline attribute, to be handled separately, the func

r285412 - Relax assertion in FunctionDecl::doesDeclarationForceExternallyVisibleDefinition.

2016-10-28 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Oct 28 11:46:39 2016 New Revision: 285412 URL: http://llvm.org/viewvc/llvm-project?rev=285412&view=rev Log: Relax assertion in FunctionDecl::doesDeclarationForceExternallyVisibleDefinition. Previously we were asserting that this declaration doesn't have a body *and* won'

[PATCH] D25640: [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies.

2016-10-28 Thread Justin Lebar via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL285410: [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on… (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D25640?vs=76118&id=76208#toc Repository: rL LLV

r285410 - [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies.

2016-10-28 Thread Justin Lebar via cfe-commits
Author: jlebar Date: Fri Oct 28 11:26:26 2016 New Revision: 285410 URL: http://llvm.org/viewvc/llvm-project?rev=285410&view=rev Log: [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies. Summary: In CUDA compilation, we call isInlineDefinitionExternally

[PATCH] D25640: [CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies.

2016-10-27 Thread Justin Lebar via cfe-commits
jlebar added a comment. OK, I can add new flags with the best of 'em. I got rid of a super ugly hack I found that was working around the same problem I was trying to work around here. (And I verified that if I don't call setWillHaveBody, a testcase fails.) I can split this out into two patche
