[PATCH] D123810: [Cuda] Add initial support for wrapping CUDA images in the new driver.

Artem Belevich via Phabricator via cfe-commits Tue, 10 May 2022 14:52:50 -0700

tra added inline comments.


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:188
+  static inline OffloadKind getEmptyKey() {
+    return static_cast<OffloadKind>(0xFFFF);
+  }
----------------
Extend `enum OffloadKind` to include these special kinds. 


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:201
   static DeviceFile getEmptyKey() {
-    return {DenseMapInfo<StringRef>::getEmptyKey(),
+    return {static_cast<OffloadKind>(DenseMapInfo<uint16_t>::getEmptyKey()),
             DenseMapInfo<StringRef>::getEmptyKey(),
----------------
Why do we limit ourselves to uint16_t here? Can't we just use `OffloadKind` 
itself and get rid of these casts?


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:726
+
+  if (Error Err = executeCommands(*FatBinaryPath, CmdArgs))
+    return std::move(Err);
----------------
We should have a way to pass extra options to fatbinary, too. E.g. we may want 
to use `--compress-all`. Also, we may need to pass through `-g` for debug 
builds.

Oh. Debug builds. Makes me wonder if cuda-gdb will be able to find GPU binaries 
packaged by the new driver. If it does not, it will be a rather serious 
problem. It would likely affect various profiling tools the same way, too. Can 
you give it a try?


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1192
     // Run LTO on any bitcode files and replace the input with the result.
-    if (Error Err = linkBitcodeFiles(LinkerInput.getSecond(), TheTriple,
-                                     File.Arch, WholeProgram))
+    if (Error Err = linkBitcodeFiles(LinkerInputFiles, TheTriple, File.Arch,
+                                     WholeProgram))
----------------
Nit. We should think of changing `linkBitcodeFiles` to return 
`llvm::ErrorOr<bool>` so we can return `WholeArchive` value, instead of 
modifying it as an argument.


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1200
+      LinkedImages.emplace_back(OFK_OpenMP, TheTriple.getTriple(), File.Arch,
+                                LinkerInputFiles.front());
       continue;
----------------
What's expected to happen if we have more than one input?
If only one is ever expected, I'd add an assert.


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1206
     // CUDA in non-RDC mode.
     if (WholeProgram && TheTriple.isNVPTX()) {
+      assert(!LinkerInputFiles.empty() && "No non-RDC image to embed");
----------------
Should `EmbedBitcode` be mutually exclusive vs `WholeArchive`? 

Can we ever end up with both unset? 


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1210
+        LinkedImages.emplace_back(Kind, TheTriple.getTriple(), File.Arch,
+                                  LinkerInputFiles.front());
       continue;
----------------
ditto about the assert on the number of inputs.


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1347
+
+    auto FileOrErr = compileModule(M);
+    if (!FileOrErr)
----------------
The `M` contains generated registration glue at this point, right? It may be 
worth a comment to explain what is it that we're compiling here.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:271
+llvm::Error wrapCudaBinary(llvm::Module &M, llvm::ArrayRef<char> Images) {
+  return Error::success();
+}
----------------
A "not-implemented-yet/TODO" comment would be appropriate here.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.h:20
+
+/// Wrap the input fatbinary image into the module \p M as global symbols and
+/// registers the images with the CUDA runtime.
----------------
It should be either "Wraps/registers" or "Wrap/register".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123810/new/

https://reviews.llvm.org/D123810

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D123810: [Cuda] Add initial support for wrapping CUDA images in the new driver.

Reply via email to