Hahnfeld created this revision.
Hahnfeld added reviewers: tra, jlebar, rsmith.
Herald added a subscriber: cfe-commits.
Hahnfeld added a dependency: D42922: [CUDA] Register relocatable GPU binaries.

- Finding installations via ptxas binary
- Relocatable device code

Repository:
  rC Clang

https://reviews.llvm.org/D45449

Files:
  docs/ReleaseNotes.rst
  include/clang/Driver/Options.td

Index: include/clang/Driver/Options.td
===================================================================
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -569,7 +569,7 @@
 def fcuda_approx_transcendentals : Flag<["-"], "fcuda-approx-transcendentals">,
   Flags<[CC1Option]>, HelpText<"Use approximate transcendental functions">;
 def fno_cuda_approx_transcendentals : Flag<["-"], "fno-cuda-approx-transcendentals">;
-def fcuda_rdc : Flag<["-"], "fcuda-rdc">, Flags<[CC1Option, HelpHidden]>,
+def fcuda_rdc : Flag<["-"], "fcuda-rdc">, Flags<[CC1Option]>,
   HelpText<"Generate relocatable device code, also known as separate compilation mode.">;
 def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">;
 def dA : Flag<["-"], "dA">, Group<d_Group>;
Index: docs/ReleaseNotes.rst
===================================================================
--- docs/ReleaseNotes.rst
+++ docs/ReleaseNotes.rst
@@ -163,6 +163,18 @@
 
 - ...
 
+CUDA Support in Clang
+---------------------
+
+- Clang will now try to locate the CUDA installation next to :program:`ptxas`
+  in the `PATH` environment variable. This behavior can be turned off by passing
+  the new flag `--cuda-path-ignore-env`.
+
+- Clang now supports generating object files with relocatable device code. This
+  feature needs to be enabled with `-fcuda-rdc` and may result in performance
+  penalties compared to whole program compilation. Please note that NVIDIA's
+  :program:`nvcc` must be used for linking.
+
 Internal API Changes
 --------------------
 
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits