tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

LGTM for CUDA.

@yaxunl Sam, does the change make sense for HIP?



================
Comment at: clang/test/Driver/cuda-options.cu:190-192
+// RUN: | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
+// RUN:    -check-prefix HOST -check-prefix INCLUDES-DEVICE \
+// RUN:    -check-prefix NOLINK -check-prefix THINLTOWPD %s
----------------
Nit: This could be combined into `--check-prefixes=DEVICE,DEVICE-NOSAVE,....`


================
Comment at: clang/test/Driver/hip-options.hip:63
 // RUN: %clang -### -target x86_64-unknown-linux-gnu -nogpuinc -nogpulib \
-// RUN:   --cuda-gpu-arch=gfx906 -foffload-lto=thin %s 2>&1 \
-// RUN:   | FileCheck -check-prefix=THINLTO %s
+// RUN:   --cuda-gpu-arch=gfx906 -foffload-lto=thin -fwhole-program-vtables %s 
2>&1 \
+// RUN:   | FileCheck -check-prefix=HIPTHINLTO %s
----------------
tejohnson wrote:
> tra wrote:
> > tejohnson wrote:
> > > tra wrote:
> > > > > caused a regression when compiling cuda code with -flto=thin 
> > > > > -fwhole-program-vtables.
> > > > 
> > > > We should add a CUDA test for that. This test only covers HIP 
> > > > compilation.
> > > AFAICT there are no existing Cuda lto tests in clang/test/Driver that I 
> > > could add -fwhole-program-vtables to.
> > > 
> > > However, for the purposes of this bug fix, I think adding the below 
> > > testing here should be sufficient - it triggers exactly the same way as 
> > > when I saw this in an internal cuda build.
> > > AFAICT there are no existing Cuda lto tests in clang/test/Driver that I 
> > > could add -fwhole-program-vtables to.
> > 
> > `cuda-options.cu` would be the right place.
> > 
> > > However, for the purposes of this bug fix, I think adding the below 
> > > testing here should be sufficient - it triggers exactly the same way as 
> > > when I saw this in an internal cuda build.
> > 
> > The key difference is that HIP does support LTO on the GPU side, but CUDA's 
> > does not, which suggests that their handling of lto-related flags is likely 
> > different and worth testing that we do see the expected behavior. E.g. HIP 
> > compilation with `-foffload-lto=thin` does propagate LTO flags to device 
> > compilation, but CUDA compilation should not (but should still compile with 
> > LTO enabled on the host.
> > 
> > While this patch does fix the issue for both CUDA and HIP, it's good to 
> > have a test which demonstrates how we expect HIP/CUDA to behave. Right now 
> > we only have that for HIP.
> > 
> Added the cuda test, and confirmed it fails with the error without my patch.
Thank you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103579/new/

https://reviews.llvm.org/D103579

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to