[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-26 Thread Alexander Kornienko via Phabricator via cfe-commits
alexfh added a comment.
In D140158#4082804, @jhuber6 wrote:
> In D140158#4082789, @alexfh wrote:
>
>> This patch breaks our cuda compilations. The output file isn't created after
>> it:
>>
>> $ echo 'extern "

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.
In D140158#4082789, @alexfh wrote:
> This patch breaks our cuda compilations. The output file isn't created after
> it:
>
> $ echo 'extern "C" __attribute__((global)) void q() {}' >q.cc
> $ good-clang \
>     -nocudainc -x

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-26 Thread Alexander Kornienko via Phabricator via cfe-commits
alexfh added a comment.
This patch breaks our cuda compilations. The output file isn't created after it:
  $ echo 'extern "C" __attribute__((global)) void q() {}' >q.cc
  $ good-clang \
      -nocudainc -x cuda \
      --cuda-path=somepath/cuda/ \
      -Wno-unknown-cuda-version --cuda-device-onl

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jhuber6 marked 3 inline comments as done.
Closed by commit rG0660397e6809: [CUDA] Allow targeting NVPTX directly without a host toolchain (authored by jhuber6).
Change

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked 3 inline comments as done.
jhuber6 added a comment.
In D140158#4063720, @tra wrote:
> LGTM with few minor nits and questions.
>
> In D140158#4063689, @jhuber6 wrote:
>
>> Addressing some comments.

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.
LGTM with few minor nits and questions.
In D140158#4063689, @jhuber6 wrote:
> Addressing some comments. I don't know if there's a cleaner way to mess
> ar

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 490317.
jhuber6 added a comment.
Addressing some comments. I don't know if there's a cleaner way to mess around with the `.cubin` nonsense. I liked symbolic links but that doesn't work on Windows.
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST A

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done.
jhuber6 added inline comments.
Comment at: clang/lib/Driver/ToolChains/Cuda.h:196-197
+
+  void AddCudaIncludeArgs(const llvm::opt::ArgList &DriverArgs, llvm::opt::ArgStringList &CC1Args) const override;
---

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.
Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:448-450
+  // If we are invoking `nvlink` internally we need to output a `.cubin` file.
+  // Checking if the output is a temporary is the cleanest way to determine
+  // this. Putting this logic in `
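The rule described in the comment above can be sketched roughly as follows. This is only an illustration of the idea (pick `.cubin` when the output is an intermediate file that `nvlink` will consume, otherwise keep `.o`), not the code from the patch. The function name `deviceObjectExtension` and both parameters are hypothetical, and combining the two conditions with a logical AND is an assumption drawn from the thread.

```cpp
#include <string>

// Hypothetical helper for illustration only; it is NOT part of D140158.
// usesNvlink:        assumed to mean the driver will invoke `nvlink` itself
//                    in this compilation.
// outputIsTemporary: assumed to mean the object file is an intermediate
//                    result rather than a user-requested output.
// Assumption: `.cubin` is only chosen when both conditions hold.
std::string deviceObjectExtension(bool usesNvlink, bool outputIsTemporary) {
  if (usesNvlink && outputIsTemporary)
    return ".cubin";
  return ".o";
}
```

With this sketch, an object the user asked for explicitly keeps the conventional `.o` suffix, and only an intermediate file handed straight to `nvlink` gets `.cubin`.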

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.
LGTM overall, with few nits.
Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:448-450
+  // If we are invoking `nvlink` internally we need to output a `.cubin` file.
+  // Checking if the output is a temporary is the cleanest way to determine
+  // this. Pu

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-16 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.
FWIW, creating CUBIN from C/C++ directly would be really useful when debugging (and in combination with our soon to be available JIT object loader). @tra Can we get this in somehow?
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.ll

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2023-01-16 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 489554. jhuber6 added a comment. Updating. Used a different method to determine if we need to use `.cubin` or `.o`. It's a little ugly but I don't think there's a better way to do it. Also I just realized that if this goes through I could probably heavily s

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-16 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 483576. jhuber6 added a comment. Accidentally deleted the old `getInputFilename` routine which we need. The symlink worked fine but would break on Windows, so I ended up writing a hack that would only use `.cubin` if we have the `nvlink` linker active and the

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-16 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 483507.
jhuber6 added a comment.
Fix format
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140158/new/
https://reviews.llvm.org/D140158
Files:
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Cuda.cp

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 483389.
jhuber6 added a comment.
Addressing comments, I did the symbolic link method. It's a stupid hack that's only necessary because of nvlink's poor handling but I think this works around it well enough.
Repository: rG LLVM Github Monorepo
CHANGES SI

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.
In D140158#3999810, @JonChesterfield wrote:
> I don't think we should assume they want implicit behaviour from other
> programming models thrown in.
Agreed. Also, removing things is often surprisingly hard. Let's keep things simlp

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If we do magic header including, we should check for the freestanding argument and not include them with that set. I would prefer we not include cuda headers into C++ source that isn't being compiled as cuda, and also not link in misc cuda library files. Anyone

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.
In D140158#3999783, @tra wrote:
> In D140158#3999716, @jhuber6 wrote:
>
>> I just realized the method of copying the `.o` to a `.cubin` doesn't work if
>> the link step is done in the s

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.
In D140158#3999716, @jhuber6 wrote:
> I just realized the method of copying the `.o` to a `.cubin` doesn't work if
> the link step is done in the same compilation because it doesn't exist yet.
> To fix this I could either make the

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.
In D140158#3999716, @jhuber6 wrote:
> Also do you think I should include the CUDA headers in with this? We can
> always get rid of them with `nogpuinc` or similar if they're not needed. The
> AMDGPU compilation still links in

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added a comment. I just realized the method of copying the `.o` to a `.cubin` doesn't work if the link step is done in the same compilation because it doesn't exist yet. To fix this I could either make the tool chain emit `.cubin` if we're going

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.
LGTM overall. So, essentially the patch refactors what we've already been doing for OpenMP and made it usable manually, which will be useful for things like GPU-side libc tests.
Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:631-632
+
+  const char *Ex

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: tra, yaxunl, JonChesterfield.
Herald added subscribers: mattd, gchakrabarti, carlosgalvezp, asavonic.
Herald added a project: All.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1, MaskRay.
Herald adde