[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-09 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. +tra in the hopes that perhaps he's comfortable reviewing this (sorry that I'm not). Repository: rC Clang https://reviews.llvm.org/D47757 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-b

[PATCH] D46993: [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

2018-05-16 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. Herald added a subscriber: sanjoy. https://reviews.llvm.org/D46993 Files: clang/lib/Headers/cuda_wrappers/algorithm Index: clang/lib/Headers/cuda_wrappers/algorithm =

[PATCH] D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs.

2018-05-16 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. Herald added subscribers: llvm-commits, mgorny, sanjoy. Previously (https://reviews.llvm.org/D46993) std::min/max didn't work in C++14 mode with a C++11 stdlib; we'd assumed that compiler std=c++14 implied stdlib in C++14 mode. Reposit
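The mismatch described above can be observed directly: `__cplusplus` reports the language mode the compiler was asked for, while the standard library identifies itself through its own macros. A minimal sketch, assuming libc++ defines `_LIBCPP_VERSION` and libstdc++ defines `__GLIBCXX__` once any standard header has been included; the helper names `language_mode_year` and `stdlib_name` are hypothetical and not taken from the patch:

```cpp
#include <cstddef>  // any standard header pulls in the library's identification macros

// Language mode requested from the compiler (e.g. via -std=...), as a year.
// Modes newer than C++17 are reported as 2017 in this sketch.
inline int language_mode_year() {
#if __cplusplus >= 201703L
  return 2017;
#elif __cplusplus >= 201402L
  return 2014;
#elif __cplusplus >= 201103L
  return 2011;
#else
  return 1998;
#endif
}

// Which standard library the included headers belong to. This is independent
// of the language mode above, which is the point of the sketch.
inline const char *stdlib_name() {
#if defined(_LIBCPP_VERSION)
  return "libc++";
#elif defined(__GLIBCXX__)
  return "libstdc++";
#else
  return "unknown";
#endif
}
```

Neither value implies the other: a compiler in C++14 mode can still be paired with headers that only provide C++11 library features.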

[PATCH] D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved.

2018-05-16 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added a reviewer: tra. Herald added subscribers: llvm-commits, sanjoy. Herald added a reviewer: EricWF. Repository: rT test-suite https://reviews.llvm.org/D46995 Files: External/CUDA/complex.cu Index: External/CUDA/complex.cu ===

[PATCH] D46993: [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Thank you for the review! https://reviews.llvm.org/D46993

[PATCH] D46993: [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC332619: [CUDA] Make std::min/max work when compiling in C++14 mode with a C++11 stdlib. (authored by jlebar, committed by ). Changed prior to commit: https://reviews.llvm.org/D46993?vs=147224&id=147330#

[PATCH] D46782: [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC332621: [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces. (authored by jlebar, committed by ). Herald added a subscriber: cfe-commits. Changed prior to commit: https://reviews.llvm.org

[PATCH] D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar marked an inline comment as done. jlebar added inline comments. Comment at: External/CUDA/complex.cu:24 // libstdc++ (compile errors in ). -#if __cplusplus >= 201103L && !defined(_LIBCPP_VERSION) && \ -(__cplusplus < 201402L || STDLIB_VERSION >= 2014) +#if __cplusplus

[PATCH] D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar marked an inline comment as done. jlebar added a comment. Thanks for the reviews, Art. Submitting with this change... Repository: rT test-suite https://reviews.llvm.org/D46994

[PATCH] D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL332659: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs. (authored by jlebar, committed by ). Changed prior to commit: https://reviews.llvm.org/D46994?vs=147225&id=147383#toc Repository: rL L

[PATCH] D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved.

2018-05-17 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. jlebar marked an inline comment as done. Closed by commit rL332660: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved. (authored by jlebar, committed by ). Repository: rL LLVM https://reviews

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-18 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. I defer to Art on this one. Repository: rC Clang https://reviews.llvm.org/D47070

[PATCH] D38188: [CUDA] Fix names of __nvvm_vote* intrinsics.

2017-09-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Should we add tests to the test-suite? Or, are these already caught by the existing tests we have? https://reviews.llvm.org/D38188

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:419 +TARGET_BUILTIN(__nvvm_match_any_sync_i64, "WiUiWi", "", "ptx60") +// These return a pair {value, predicate} which requires custom lowering. +TARGET_BUILTIN(__nvvm_match_all_sync_i32p, "UiUi

[PATCH] D38468: [CUDA] Fix name of __activemask()

2017-10-02 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Thank you for the fix! https://reviews.llvm.org/D38468

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9726 + case NVPTX::BI__hmma_m16n16k16_ld_c_f16: +case NVPTX::BI__hmma_m16n16k16_ld_c_f32:{ +Address Dst = EmitPointer

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9733 + return nullptr; +bool isColMajor = isColMajorArg.getZExtValue(); +unsigned IID; tra wrote: > jlebar wrote: > > Urg, this isn't a bool? Do we want it to be? > There are

[PATCH] D38816: Convert clang::LangAS to a strongly typed enum

2017-10-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. My only regret is that I have but one +1 to give to this patch. Comment at: include/clang/Basic/AddressSpaces.h:51 +namespace LanguageAS { /// The type of a lookup table which maps from language-specific address spaces I wonder if you

[PATCH] D38816: Convert clang::LangAS to a strongly typed enum

2017-10-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > The only reason I added this namespace is that I wasn't sure whether having > those functions in the clang namespace is acceptable. Maybe someone else will object, or suggest an existing namespace they should be in. FWIW I think it's fine. > Not quite sure what to ca

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. This has been tried twice before, see https://reviews.llvm.org/D29883 and https://reviews.llvm.org/D17738. I'm as unhappy about this as anyone, and personally I don't have any preference about how we try to solve it. But I think we shouldn't check this in without heari

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > I'd be interested to get the ball rolling in regard to coming up with a fix > for this. I see some suggestions in past patches. Some help/clarification > would be much appreciated. Happy to help, but I'm not sure what to offer beyond the link in Art's previous comment

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-18 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > The first question that comes to mind is what is the link between data layout > and name mangling conventions? I pulled up http://llvm.org/doxygen/classllvm_1_1DataLayout.html and searched for "mangling" -- presumably this is what they were referring to. We also don'

[PATCH] D49763: [CUDA] Call atexit() for CUDA destructor early on.

2018-07-24 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:379 + // Create destructor and register it with atexit() the way NVCC does it. Doing + // it during regular destructor phase

[PATCH] D43602: [CUDA] Added missing functions.

2018-02-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. For my information, how are we verifying that we've caught everything? https://reviews.llvm.org/D43602

[PATCH] D41521: [CUDA] fixes for __shfl_* intrinsics.

2017-12-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Since this is tricky and we've seen it affecting user code, do you think it's a bad idea to add tests to the test-suite? https://reviews.llvm.org/D41521 ___

[PATCH] D41788: [DeclPrinter] Fix two cases that crash clang -ast-print.

2018-01-08 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. I strongly approve of fixing these crashes, but I don't think I can say with confidence whether this change is correct. https://reviews.llvm.org/D41788

[PATCH] D47804: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'.

2018-06-05 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. What could possibly go wrong. https://reviews.llvm.org/D47804

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. Herald added a subscriber: sanjoy. Fixes PR37753: min/max can't be called from __host__ __device__ functions in C++14 mode. Testcase in a separate test-suite commit. https://reviews.llvm.org/D48036 Files: clang/lib/Headers/cuda_w

[PATCH] D48037: [CUDA] Add tests to ensure that std::min/max can be called from __host__ __device__ functions.

2018-06-11 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added a reviewer: rsmith. Herald added subscribers: llvm-commits, sanjoy. Tests for https://reviews.llvm.org/D48036 / PR37753. Repository: rT test-suite https://reviews.llvm.org/D48037 Files: External/CUDA/algorithm.cu Index: External/CUDA/algorithm.c

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > Last comment in the bug pointed out that those overloads should be constexpr > in c++14. Maybe in a separate patch, though. Yeah, would prefer to do it in a separate patch. It's possible that having constexpr min/max in C++14 mode *without a C++14 standard library* wi

[PATCH] D48151: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added reviewers: rsmith, tra. Herald added a subscriber: sanjoy. Tests in a separate change to the test-suite. https://reviews.llvm.org/D48151 Files: clang/lib/Headers/cuda_wrappers/algorithm Index: clang/lib/Headers/cuda_wrappers/algorithm =
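The shape of a host+device min/max overload that becomes constexpr in C++14 can be sketched as follows. This is an illustration only, not the contents of `cuda_wrappers/algorithm`: the `shim` namespace and the `SHIM_HD` and `SHIM_CONSTEXPR` macros are hypothetical, and the attributes expand to nothing outside a CUDA compilation so the sketch also builds as ordinary C++.

```cpp
// Expands to the CUDA attributes under a CUDA compiler and to nothing
// otherwise (hypothetical macro, used only in this sketch).
#if defined(__CUDACC__) || defined(__CUDA__)
#define SHIM_HD __host__ __device__
#else
#define SHIM_HD
#endif

// constexpr only from C++14 on, mirroring when std::min/max became constexpr.
#if __cplusplus >= 201402L
#define SHIM_CONSTEXPR constexpr
#else
#define SHIM_CONSTEXPR
#endif

namespace shim {

// Returns the smaller argument; returns `a` when the two compare equal,
// matching the behavior of std::min.
template <class T>
SHIM_HD inline SHIM_CONSTEXPR const T &min(const T &a, const T &b) {
  return b < a ? b : a;
}

// Returns the larger argument; returns `a` when the two compare equal,
// matching the behavior of std::max.
template <class T>
SHIM_HD inline SHIM_CONSTEXPR const T &max(const T &a, const T &b) {
  return a < b ? b : a;
}

}  // namespace shim
```

Because the functions carry both attributes, a `__host__ __device__` caller can use them on either side of the compilation, which is the property the test-suite change checks for `std::min`/`std::max`.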

[PATCH] D48152: [CUDA] Add tests that, in C++14 mode, min/max are constexpr.

2018-06-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar added reviewers: rsmith, tra. Herald added a subscriber: llvm-commits. Repository: rT test-suite https://reviews.llvm.org/D48152 Files: External/CUDA/algorithm.cu Index: External/CUDA/algorithm.cu

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. In https://reviews.llvm.org/D48036#1131279, @tra wrote: > Ack. Patches sent (see dependency chain in phab). https://reviews.llvm.org/D48036 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. @rsmith friendly ping on this one. https://reviews.llvm.org/D48036

[PATCH] D48151: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. @rsmith friendly ping on this review. https://reviews.llvm.org/D48151

[PATCH] D48151: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. In https://reviews.llvm.org/D48151#1133954, @rsmith wrote: > LGTM Thank you for the review, Richard. Will check this in once the whole stack is ready -- just need https://reviews.llvm.org/D48036. https://reviews.llvm.org/D48151

[PATCH] D57487: [CUDA] Propagate detected version of CUDA to cc1

2019-01-30 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/include/clang/Basic/Cuda.h:108 +enum class CudaFeature { + CUDA_USES_NEW_LAUNCH, +}; Should this enum be documented? ===

[PATCH] D57488: [CUDA] add support for the new kernel launch API in CUDA-9.2+.

2019-01-30 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. LGTM, mostly nits. Comment at: clang/include/clang/Sema/Sema.h:10316 + /// Returns the name of the launch configuration function. + std::string getCudaConfigureFuncName()

[PATCH] D59647: [CUDA][HIP] Warn shared var initialization

2019-03-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar requested changes to this revision. jlebar added a comment. This revision now requires changes to proceed. I agree with Art. The fact that nvcc allows this is broken. If you want a flag that makes this error a warning, that might work for me. The flag should probably say "unsafe" or "I

[PATCH] D59647: [CUDA][HIP] Warn shared var initialization

2019-03-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > By default it is still treated as error, therefore no behavior change of > clang. Oh, I see, you already did what I'd suggested. :) That's better. I think this needs to be made *much scarier* though. "Maybe race condition" doesn't capture the danger here -- you can

[PATCH] D59900: [Sema] Fix a crash when nonnull checking

2019-03-27 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. I uh... I also think this is an @rsmith question, I have no idea. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59900/new/ https://reviews.llvm.org/D59900

[PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-02 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a subscriber: rsmith. jlebar added a comment. Here's one for you: __host__ float bar(); __device__ int bar(); __host__ __device__ auto foo() -> decltype(bar()) {} What is the return type of `foo`? :) I don't believe the right answer is, "float when compiling for host, int wh

[PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

2019-05-03 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > At [nvcc] from CUDA 10, that's not acceptable as we are declaring two > functions only differ from the return type. It seems CUDA attributes do not > contribute to the function signature. clang is quite different here. Yes, this is an intentional and more relaxed seman

[PATCH] D51809: [CUDA][HIP] Fix assertion in LookupSpecialMember

2018-09-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar requested changes to this revision. jlebar added subscribers: timshen, rsmith. jlebar added a comment. This revision now requires changes to proceed. Sorry for missing tra's ping earlier, I get a lot of HIP email traffic that's 99% inactionable by me, so I didn't notice my username in tra'

[PATCH] D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors

2018-10-06 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: lib/Sema/SemaDeclCXX.cpp:7231 +if (ICI) + CSM = getSpecialMember(MD); + LGTM, but perhaps we should use a new variable instead of mo

[PATCH] D57771: [CUDA] Add basic support for CUDA-10.1

2019-02-05 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:620 + +// CUDA version requires calling __cudaRegisterFatBinaryEnd(Handle); +if (CudaFeatureEnabled(CGM.getTarget().get
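A documented feature enum plus a version check, in the spirit of the `CudaFeature` enum raised in D57487 and the feature test quoted here, might look like the sketch below. It is a simplified stand-in rather than clang's code: `ToolkitVersion`, `at_least`, `feature_enabled`, and the second enumerator name are hypothetical, and the 10.1 threshold for the fatbin-end call is an assumption drawn from this patch's title.

```cpp
// Simplified stand-in for a toolkit version; not clang's actual type.
struct ToolkitVersion {
  int major_ver;
  int minor_ver;
};

/// Behaviors that depend on the CUDA toolkit version in use.
enum class CudaFeature {
  /// The toolkit uses the new kernel launch API (CUDA 9.2 and later).
  CUDA_USES_NEW_LAUNCH,
  /// Registration must end with a __cudaRegisterFatBinaryEnd(Handle) call.
  /// Hypothetical enumerator name; the 10.1 threshold is an assumption.
  USES_FATBIN_REGISTER_END,
};

// True when `v` is at least major_ver.minor_ver.
inline bool at_least(ToolkitVersion v, int major_ver, int minor_ver) {
  return v.major_ver > major_ver ||
         (v.major_ver == major_ver && v.minor_ver >= minor_ver);
}

// Maps each feature to the first toolkit version that enables it.
inline bool feature_enabled(ToolkitVersion v, CudaFeature feature) {
  switch (feature) {
  case CudaFeature::CUDA_USES_NEW_LAUNCH:
    return at_least(v, 9, 2);
  case CudaFeature::USES_FATBIN_REGISTER_END:
    return at_least(v, 10, 1);
  }
  return false;
}
```

Keeping the version thresholds in one function means code generation can ask about a named feature instead of comparing version numbers at each call site.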

[PATCH] D37539: [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Justin Lebar via Phabricator via cfe-commits
jlebar marked an inline comment as done. jlebar added inline comments. Comment at: clang/lib/Headers/cuda_wrappers/new:79 +} +__device__ void operator delete[](void *ptr, std::size_t sz) CUDA_NOEXCEPT { + ::operator delete(ptr); tra wrote: > Is std::size_t inten
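The pattern in the quoted hunk, a sized delete that ignores its size argument and forwards to a single release path, can be shown with a host-only analogue. This sketch uses class-specific operators on a hypothetical `Tracked` type with a hypothetical `live_count` counter; it does not reproduce the `__device__` overloads in `cuda_wrappers/new`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Number of Tracked objects currently allocated through the operators below.
inline int &live_count() {
  static int count = 0;
  return count;
}

struct Tracked {
  int payload;

  // Non-placement new backed by malloc.
  static void *operator new(std::size_t size) {
    void *ptr = std::malloc(size);
    if (!ptr)
      throw std::bad_alloc();
    ++live_count();
    return ptr;
  }

  // Sized delete: the size is not needed, so forward to the shared path.
  static void operator delete(void *ptr, std::size_t /*size*/) noexcept {
    release(ptr);
  }

  // Single place where memory is actually returned.
  static void release(void *ptr) noexcept {
    if (!ptr)
      return;
    --live_count();
    std::free(ptr);
  }
};
```

Routing every delete form through one function keeps the allocation and deallocation paths paired, which is why the sized overload does nothing with `size` beyond accepting it.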

[PATCH] D37539: [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Justin Lebar via Phabricator via cfe-commits
jlebar updated this revision to Diff 114104. jlebar marked an inline comment as done. jlebar added a comment. Address review comments. https://reviews.llvm.org/D37539 Files: clang/lib/Headers/cuda_wrappers/new Index: clang/lib/Headers/cuda_wrappers/new ==

[PATCH] D37540: [CUDA] Tests for device-side overloads of non-placement new/delete.

2017-09-06 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL312682: [CUDA] Tests for device-side overloads of non-placement new/delete. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D37540?vs=114094&id=114109#toc Repository: rL LLVM

[PATCH] D37539: [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL312681: [CUDA] Add device overloads for non-placement new/delete. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D37539?vs=114104&id=114108#toc Repository: rL LLVM https://r

[PATCH] D37548: [CUDA] When compilation fails, print the compilation mode.

2017-09-06 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. Herald added a subscriber: sanjoy. That is, instead of "1 error generated", we now say "1 error generated when compiling for sm_35". This (partially) solves a usability footgun wherein e.g. users call a function that's only defined on sm_60 when compiling for sm_35,
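The message change described above amounts to appending the compilation mode to the error summary. A minimal sketch of that formatting, assuming a hypothetical helper `error_summary` (clang's diagnostic machinery is not shown, and an empty mode string stands for a compilation with no mode to report):

```cpp
#include <string>

// Builds the summary line, e.g. "1 error generated when compiling for sm_35".
// An empty compile_mode yields the plain "N error(s) generated" form.
inline std::string error_summary(unsigned error_count,
                                 const std::string &compile_mode) {
  std::string text = std::to_string(error_count) +
                     (error_count == 1 ? " error" : " errors") + " generated";
  if (!compile_mode.empty())
    text += " when compiling for " + compile_mode;
  return text;
}
```

For example, `error_summary(1, "sm_35")` produces the wording quoted in the revision summary, so a user can tell which compilation pass rejected the code.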

[PATCH] D37576: [CUDA] Added rudimentary support for CUDA-9 and sm_70.

2017-09-07 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Looks great. https://reviews.llvm.org/D37576

[PATCH] D37548: [CUDA] When compilation fails, print the compilation mode.

2017-09-07 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL312736: [CUDA] When compilation fails, print the compilation mode. (authored by jlebar). Changed prior to commit: https://reviews.llvm.org/D37548?vs=114112&id=114222#toc Repository: rL LLVM https://

[PATCH] D37906: [CUDA] Work around a new quirk in CUDA9 headers.

2017-09-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. This is a bit of a Chesterton's Fence -- do we know why they're doing this? I guess it's probably going to be OK because our overriding semantics will make it OK, and our test-suite tests (should) exercise all of math.h. But I'm still a little worried about it. https:

[PATCH] D37906: [CUDA] Work around a new quirk in CUDA9 headers.

2017-09-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. > BTW, this change essentially augments the job that the "#undef GNUC" above > used to do in older CUDA versions. CUDA9 replaced GNUC with _GLIBCXX_MATH_H > in CUDA-9 in some places. Ah, that

[PATCH] D38090: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.

2017-09-20 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added inline comments. Comment at: clang/lib/Headers/__clang_cuda_intrinsics.h:161 +#endif // __CUDA_VERSION >= 9000 && (!defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= + // 300) + Nit, better linebreaking in the comment? Comment at:

[PATCH] D38113: OpenCL: Assume functions are convergent

2017-09-20 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. LGTM for the changes other than the test (I don't read opencl). https://reviews.llvm.org/D38113

[PATCH] D38147: [CUDA] Fixed order of words in the names of shfl builtins.

2017-09-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Naturally they're different orders in the PTX and CUDA. :) https://reviews.llvm.org/D38147

[PATCH] D38113: OpenCL: Assume functions are convergent

2017-09-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > The problem of adding this attribute conservatively for all functions is that > it prevents some optimizations to happen. function-attrs removes the convergent attribute from anything it can prove does not call a convergent function. I agree this is a nonoptimal solut
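The idea in the comment above, mark everything convergent and then drop the mark wherever it can be proven unnecessary, can be modeled on a toy call graph. This is a simplified illustration, not LLVM's function-attrs implementation: the `CallGraph` alias and the function `functions_keeping_convergent` are hypothetical, and callees with no known body are treated conservatively as convergent.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each function with a known body to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the functions that must keep the convergent mark: the intrinsically
// convergent ones, plus every function that (transitively) calls one of them
// or calls something whose body is unknown.
inline std::set<std::string> functions_keeping_convergent(
    const CallGraph &calls,
    const std::set<std::string> &intrinsically_convergent) {
  std::set<std::string> convergent = intrinsically_convergent;
  bool changed = true;
  while (changed) {  // iterate to a fixpoint
    changed = false;
    for (const auto &entry : calls) {
      if (convergent.count(entry.first))
        continue;
      for (const auto &callee : entry.second) {
        // No body and not a known intrinsic: nothing can be proven about it.
        bool unknown = !calls.count(callee) &&
                       !intrinsically_convergent.count(callee);
        if (unknown || convergent.count(callee)) {
          convergent.insert(entry.first);
          changed = true;
          break;
        }
      }
    }
  }
  return convergent;
}
```

Every function not in the returned set provably never reaches a convergent operation, so in this model the mark can be removed from it and later optimizations are no longer blocked.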

[PATCH] D38113: OpenCL: Assume functions are convergent

2017-09-22 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > Yes, that's why if it would be responsibility of the kernel developer to > specify this explicitly we could avoid this complications in the compiler. > But if we add it into the language now we still need to support the > correctness for the code written with the earli

[PATCH] D56033: [CUDA] Treat extern global variable shadows same as regular extern vars.

2018-12-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/test/CodeGenCUDA/device-stub.cu:51 +// external device-side variables with definitiions should generate +// definitions for the shadows. -

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-08 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Without reading the patch in detail (sorry) but looking mainly at the testcase: It looks like we're not checking how overloading and `__host__ __device__` functions play into this. Maybe there are some additional edge-cases to explore/check. Just some examples: Will w

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-08 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. __host__ void bar() {} __device__ int bar() { return 0; } __host__ __device__ void foo() { int x = bar(); } template __global__ void kernel() { devF();} kernel(); > we DTRT for this case. Here __host__ bar needs to return int since foo() > expects that. wi

[PATCH] D45827: [CUDA] Enable CUDA compilation with CUDA-9.2

2018-04-19 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Well that was unusually easy... https://reviews.llvm.org/D45827

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. @rsmith friendly ping on this one. https://reviews.llvm.org/D48036

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-06-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > @jlebar, is the change I made to call-host-fn-from-device.cu correct? I don't think so -- that's a change in overloading behavior afaict. Repository: rC Clang https://reviews.llvm.org/D47757

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-06-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. In https://reviews.llvm.org/D47757#1142886, @ahatanak wrote: > I mean ToT clang (without my patch applied) seems to select the non-sized > host version 'T::operator delete(void*)'. OK, if this is just making an error out of something which previously silently didn't wo

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-29 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > Looks right to me (other than the missing constexpr in C++14 onwards). Though > this is subtle enough that I suspect the only way to know for sure is to try > it. Thanks a lot, Richard. FTR the missing constexpr is in https://reviews.llvm.org/D48151. https://review

[PATCH] D48036: [CUDA] Make min/max shims host+device.

2018-06-29 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL336025: [CUDA] Make min/max shims host+device. (authored by jlebar, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D48036?vs=150790&id=153593#

[PATCH] D48151: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14.

2018-06-29 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL336026: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14. (authored by jlebar, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.

[PATCH] D48152: [CUDA] Add tests that, in C++14 mode, min/max are constexpr.

2018-06-29 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL336030: [CUDA] Add tests that, in C++14 mode, min/max are constexpr. (authored by jlebar, committed by ). Repository: rL LLVM https://reviews.llvm.org/D48152 Files: test-suite/trunk/External/CUDA/al

[PATCH] D48037: [CUDA] Add tests to ensure that std::min/max can be called from __host__ __device__ functions.

2018-06-29 Thread Justin Lebar via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL336029: [CUDA] Add tests to ensure that std::min/max can be called from __host__… (authored by jlebar, committed by ). Repository: rL LLVM https://reviews.llvm.org/D48037 Files: test-suite/trunk/Ext

[PATCH] D83893: [CUDA][HIP] Always defer diagnostics for wrong-sided reference

2020-07-15 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. tra and I talked offline and I...think this makes sense. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D83893/new/ https://reviews.llvm.org/D83893

[PATCH] D85236: [CUDA] Work around a bug in rint() caused by a broken implementation provided by CUDA.

2020-08-04 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. LGTM, and can we write a test in the test-suite? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85236/new/ https://reviews.llvm.org/D85236 _

[PATCH] D40453: Add the nvidia-cuda-toolkit Debian package path to search path

2017-11-28 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. I defer to tra on this. https://reviews.llvm.org/D40453

[PATCH] D40673: Add _Float128 as alias to __float128 to enable compilations on Fedora27/glibc2-26

2017-11-30 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. LGTM for the CUDA test changes. https://reviews.llvm.org/D40673 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi

[PATCH] D121259: [clang] Fix CodeGenAction for LLVM IR MemBuffers

2022-03-08 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. Congrats on your first patch! Can Daniele or Shangwu land this for you, or do you need me to? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D1212

[PATCH] D119207: [CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels

2022-02-07 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > [CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels Rephrase this? This patch is about kernel *arguments*, right? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119207/new/ https://reviews.llvm.org/D119207 _

[PATCH] D119207: [CUDA][SPIRV] Convert CUDA kernels to SPIR-V kernels

2022-02-08 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:10322 ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const { - if (getContext().getLangOpts().HIP) { + if (getContext().getLangOpts().CUDAIsDevice) { // Coerce pointer arguments w

[PATCH] D119207: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. In D119207#3327476 , @shangwuyao wrote: > Thanks for the review, if it looks good, can we get this to land now? > Otherwise more comments are welcome! I'll land this for you! At some point you should get commit access yourself,

[PATCH] D119207: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-17 Thread Justin Lebar via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rG9de4fc0f2d3b: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments (authored by shangwuyao, committed by jle

[PATCH] D119207: [CUDA][SPIRV] Assign global address space to CUDA kernel arguments

2022-02-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f (HEAD -> main, origin/main, origin/HEAD) Author: Shangwu Yao Date: Thu Feb 17 09:38:06 2022 -0800 [CUDA][SPIRV] Assign global address space to CUDA kernel arguments This patch converts CUDA pointer

[PATCH] D88250: [CUDA] Added dim3/uint3 conversion functions to builtin vars.

2020-09-24 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. I know it comes in a separate change, but can we add a check to the test-suite? Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:381 +__device__ inline __cuda_bui

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-09-25 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. wha... As you know, `const` doesn't mean anything, since it can be const-casted away. And then you'll be able to observe that this nominally-static variable is just a normal variable. Since this doesn't make sense and contradicts their documentation, I'm tempted to say thi

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-09-28 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. OK, backing up, what are the semantics of `static` on `__constant__`, `__device__`, and `__shared__`? - My understanding is that `__shared__` behaves the same whether or not it's static. It's not equivalent to `namespace a { __shared__ int c = 4; }`, because that's ill

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-09-28 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. OK, now I'm starting to understand this change. Before, in function scope, we allow static const/non-const `__shared__`, and allow static const so long as it's not `__device__` or `__constant__`. - `static` -> error? (I understood us saying above that it is, but now

[PATCH] D90409: [HIP] Math Headers to use type promotion

2020-11-03 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > LGTM. I think the change would make sense for CUDA, too. @jlebar - WDYT? I agree that the C and C++ standard libraries should behave the same in CUDA mode and host mode! But if doing so would make our behavior different than nvcc's, maybe we could emit a warning or so

[PATCH] D91590: [NVPTX] Efficently support dynamic index on CUDA kernel aggregate parameters.

2020-11-17 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. I am legit excited about this if we could figure out how to make it work, but I don't have anything to add beyond what tra said. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D91590/new/ https://reviews.llvm.org/D91590

[PATCH] D91807: [CUDA] Unbreak CUDA compilation with -std=c++20

2020-11-19 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. This revision is now accepted and ready to land. How fun. :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D91807/new/ https://reviews.llvm.org/D91807 _

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-09-28 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > It should. I did mention in a previous comment that > Looks like the > const-ness check should not be there, either. I need to revise the patch. Heh, okay. Sorry I missed that, somehow this patch was confusing to me. > Except that NVCC allows non-const __constant__, t

[PATCH] D88668: [CUDA] Add support for 11.1

2020-10-01 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > It looks like 11.1 doesn't have a version.txt file Yikes, this is a problem if we can't tell the difference between CUDA versions! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D88668/new/ https://reviews.llvm.org/D88668

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-10-02 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Hey, I'm leaving on a vacation tomorrow and didn't have a chance to get to this review today. Is that ok? I'm not bringing my work laptop, but I could look at it on my personal laptop. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.ll

[PATCH] D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables.

2020-10-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added inline comments. This revision is now accepted and ready to land. Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:8163 "%select{__device__|__global__|__host__|__host__ __device__}0 functions">; -def err_cuda_nonglobal_

[PATCH] D89832: [CUDA] Extract CUDA version from cuda.h if version.txt is not found

2020-10-22 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. LGTM modulo emankov's comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D89832/new/ https://reviews.llvm.org/D89832 ___ cfe-commits mailing list

[PATCH] D95754: [clang] Print 32 candidates on the first failure, with -fshow-overloads=best.

2021-01-30 Thread Justin Lebar via Phabricator via cfe-commits
jlebar created this revision. jlebar requested review of this revision. Herald added a project: clang. Previously, -fshow-overloads=best always showed 4 candidates. The problem is, when this isn't enough, you're kind of up a creek; the only option available is to recompile with different flags.

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-21 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > One alternative would be to use run-time dispatch, but, given that texture > lookup is a single instruction, the overhead would be > substantial-to-prohibitive. I guess I'm confused... Is the parameter value that we're "overloading" on usually/always a constant? In

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. > Depending on which particular operation is used, the arguments vary, too. So something like T __nv_tex_surf_handler(name, arg1) { switch (name) { ... default: panic(); } } T __nv_tex_surf_handler(name, arg1, arg2) { switch(...) {

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-24 Thread Justin Lebar via Phabricator via cfe-commits
jlebar accepted this revision. jlebar added a comment. Okay, I give up on the phab interface. It's unreadable with all the existing comments and lint errors. Hope you don't mind comments this way. I'm just going to put it all in a giant code block so it doesn't get wrapped or whatever. +// _

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-24 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Presumably as a separate commit we should add tests to the test-suite repository to ensure that this at least still compiles with different versions of CUDA? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https:

[PATCH] D95754: [clang] Print 32 candidates on the first failure, with -fshow-overloads=best.

2021-02-13 Thread Justin Lebar via Phabricator via cfe-commits
jlebar added a comment. Not sure who can review this, but looking through blame it seems like maybe @aaronpuchert? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95754/new/ https://reviews.llvm.org/D95754 __
