Hi Thomas,

I apologise that it might complicate things, but one potential benefit of --with-cuda-driver (i.e. linking the compiler against proprietary libraries) is that it would allow support for -march=native on nvptx, i.e. the gcc driver can figure out which sm_xx is available on the GPU(s) of the current machine and pass that to cc1. As with the many microarchitectures on other platforms (e.g. x86_64), figuring this out is not a trivial task for many end-users.
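To make the idea concrete, here is a rough sketch of the kind of probe I have in mind, written for GNU/Linux and loading the driver library at runtime instead of linking against it. Everything in it is an assumption on my part, not existing GCC code: the function names detect_compute_capability and pick_march are hypothetical, the attribute numbers 75 and 76 are what I understand cuda.h to use for the compute capability major/minor attributes, and the "highest supported value not exceeding the detected one" rule is only my reading of how -march-map behaves.

```c
#include <dlfcn.h>
#include <stddef.h>

/* -march values listed in the GCC 12 notes quoted below (sm_30 ... sm_80). */
static const int supported_sm[] = { 30, 35, 53, 70, 75, 80 };

/* Hypothetical helper: highest supported sm_XX not exceeding the detected
   compute capability, or -1 if the device is older than sm_30.  */
int pick_march(int major, int minor)
{
    int cc = major * 10 + minor;
    int best = -1;
    for (size_t i = 0; i < sizeof supported_sm / sizeof supported_sm[0]; i++)
        if (supported_sm[i] <= cc)
            best = supported_sm[i];
    return best;
}

/* Minimal hand-written view of the CUDA Driver entry points used here.
   CUresult and CUdevice are plain ints in the driver ABI (assumption
   based on cuda.h); 0 means success.  */
typedef int (*cu_init_fn)(unsigned int flags);
typedef int (*cu_device_get_fn)(int *device, int ordinal);
typedef int (*cu_device_get_attribute_fn)(int *value, int attrib, int device);

/* Assumed values of the compute capability major/minor attributes.  */
enum { ATTR_CC_MAJOR = 75, ATTR_CC_MINOR = 76 };

/* Hypothetical probe: query device 0 through a dlopen'ed libcuda.
   Returns 0 on success, -1 if the library or a usable GPU is missing.  */
int detect_compute_capability(int *major, int *minor)
{
    void *lib = dlopen("libcuda.so.1", RTLD_LAZY);
    if (lib == NULL)
        return -1;

    cu_init_fn init;
    cu_device_get_fn get;
    cu_device_get_attribute_fn attr;
    /* POSIX idiom for storing a dlsym result in a function pointer.  */
    *(void **) &init = dlsym(lib, "cuInit");
    *(void **) &get = dlsym(lib, "cuDeviceGet");
    *(void **) &attr = dlsym(lib, "cuDeviceGetAttribute");

    int rc = -1;
    int dev = 0;
    if (init != NULL && get != NULL && attr != NULL
        && init(0) == 0
        && get(&dev, 0) == 0
        && attr(major, ATTR_CC_MAJOR, dev) == 0
        && attr(minor, ATTR_CC_MINOR, dev) == 0)
        rc = 0;

    dlclose(lib);
    return rc;
}
```

A driver-side -march=native could then call detect_compute_capability and, on success, pass "-march=sm_NN" with NN taken from pick_march; on failure it would have to fall back to the configured default. On older glibc this needs -ldl at link time.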
Of course, ideally I'd love to be able to figure out the PTX hardware specifications and driver versions without using a third-party library, but I've no idea how this could be done (portably across the platforms that support libcuda). Perhaps dlopen at runtime? Or calling out to (executing) nvptx-tools?

Cheers,
Roger
--

> -----Original Message-----
> From: Thomas Schwinge <tho...@codesourcery.com>
> Sent: 05 April 2022 16:14
> To: Tom de Vries <tdevr...@suse.de>; Jakub Jelinek <ja...@redhat.com>
> Cc: gcc-patches@gcc.gnu.org; Tobias Burnus <tob...@codesourcery.com>;
> Roger Sayle <ro...@nextmovesoftware.com>
> Subject: Proposal to remove '--with-cuda-driver' (was: [wwwdocs][patch]
> gcc-12: Nvptx updates)
>
> Hi!
>
> Still catching up with GCC/nvptx back end changes... %-)
>
>
> In the following I'm not discussing the patch to document
> "gcc-12: Nvptx updates", but rather one aspect of the
> "gcc-12: Nvptx updates" themselves. ;-)
>
> On 2022-03-30T14:27:41+0200, Tom de Vries <tdevr...@suse.de> wrote:
> > +  <li>The <code>-march</code> flag has been added.  The <code>-misa</code>
> > +    flag is now considered an alias of the <code>-march</code>
> > +    flag.</li>
> > +  <li>Support for PTX ISA target architectures <code>sm_53</code>,
> > +    <code>sm_70</code>, <code>sm_75</code> and <code>sm_80</code> has been
> > +    added.  These can be specified using the <code>-march</code>
> > +    flag.</li>
> > +  <li>The default PTX ISA target architecture has been set back
> > +    to <code>sm_30</code>, to fix support for <code>sm_30</code>
> > +    boards.</li>
> > +  <li>The <code>-march-map</code> flag has been added.  The
> > +    <code>-march-map</code> value will be mapped to an valid
> > +    <code>-march</code> flag value.  For instance,
> > +    <code>-march-map=sm_50</code> maps to <code>-march=sm_35</code>.
> > +    This can be used to specify that generated code is to be executed on a
> > +    board with at least some specific compute capability, without having to
> > +    know the valid values for the <code>-march</code> flag.</li>
>
> Regarding the following:
>
> >    <li>The <code>-mptx</code> flag has been added to specify the PTX ISA version
> >      for the generated code; permitted values are <code>3.1</code>
> > -    (default, matches previous GCC versions) and <code>6.3</code>.
> > +    (matches previous GCC versions), <code>6.0</code>, <code>6.3</code>,
> > +    and <code>7.0</code>. If not specified, the used version is the minimal
> > +    version required for <code>-march</code> but at least <code>6.0</code>.
> >    </li>
>
> For "the PTX ISA version [used is] at least '6.0'", per
> <https://docs.nvidia.com/cuda/parallel-thread-execution/#release-notes>,
> this means we now require "CUDA 9.0, driver r384" (or more recent).
> Per <https://developer.nvidia.com/cuda-toolkit-archive>:
> "CUDA Toolkit 9.0 (Sept 2017)", so ~4.5 years old.
> Per <https://download.nvidia.com/XFree86/Linux-x86_64/>, I'm guessing a
> similar timeframe for the imprecise "r384" Driver version stated in that
> table.
> That should all be fine (re not mandating use of all-too-recent versions).
>
> Now, consider doing a GCC/nvptx offloading build with '--with-cuda-driver'
> pointing to CUDA 9.0 (or more recent).  This means that the libgomp nvptx
> plugin may now use CUDA Driver features of the CUDA 9.0 distribution
> ("driver r384", etc.) -- because that's what it is being 'configure'd and
> linked against.  (I say "may now use", because we're currently not making a
> lot of effort to use "modern" CUDA Driver features -- but we could, and
> probably should.  That's a separate discussion, of course.)  It then follows
> that the libgomp nvptx plugin has a hard dependency on CUDA Driver features
> of the CUDA 9.0 distribution ("driver r384", etc.).
> (That's dependency as in ABI: via '*.so' symbol versions as well as
> internal CUDA interface configuration; see <cuda.h> doing different
> '#define's for different '__CUDA_API_VERSION' etc.)
>
> Now assume one such dependency on "modern" CUDA Driver were not
> implemented by:
>
> > +  <li>An <code>mptx-3.1</code> multilib was added.  This allows using older
> > +    drivers which do not support PTX ISA version 6.0.</li>
>
> ... this "old" CUDA Driver.  Then you do have the '-mptx-3.1' multilib to
> use with "old" CUDA Driver -- but you cannot actually use the libgomp nvptx
> plugin, because that's been built against "modern" CUDA Driver.
>
> Same problem, generally, for 'nvptx-run' of the nvptx-tools, which has
> similar CUDA Driver dependencies.
>
> Now, that may currently be a latent problem only, because we're not actually
> making use of "modern" CUDA Driver features.  But, I'd like to resolve this
> "impedance mismatch", before we actually run into such problems.
>
> Already long ago Jakub put in changes to use '--without-cuda-driver' to
> "Allow building GCC with PTX offloading even without CUDA being installed
> (gcc and nvptx-tools patches)": "Especially for distributions it is
> undesirable to need to have proprietary CUDA libraries and headers installed
> when building GCC.", and I understand GNU/Linux distributions all use that.
> That configuration uses the GCC-provided 'libgomp/plugin/cuda/cuda.h',
> 'libgomp/plugin/cuda-lib.def' to manually define the CUDA Driver ABI to use,
> and then 'dlopen("libcuda.so.1")'.  (Similar to what the libgomp GCN (and
> before: HSA) plugin is doing, for example.)  Quite likely that our group (at
> work) are the only ones to actually use '--with-cuda-driver'?
>
> My proposal now is: we remove '--with-cuda-driver' (make its use a no-op,
> per standard GNU Autoconf behavior), and offer '--without-cuda-driver'
> only.  This shouldn't cause any user-visible change in behavior, so safe
> without a prior deprecation phase.
>
> Before I prepare the patches (GCC, nvptx-tools): any comments or objections?
>
>
> Regards
>  Thomas
>
>
> >    <li>The new <code>__PTX_SM__</code> predefined macro allows code to check the
> > -    compute model being targeted by the compiler.</li>
> > +    PTX ISA target architecture being targeted by the
> > +    compiler.</li>
> > +  <li>The new <code>__PTX_ISA_VERSION_MAJOR__</code>
> > +    and <code>__PTX_ISA_VERSION_MINOR__</code> predefined macros allows code
> > +    to check the PTX ISA version being targeted by the
> > +    compiler.</li>
> >  </ul>
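As a footnote to the predefined macros mentioned at the end of the quoted notes, here is a small sketch of how target code might test them. The helper names ptx_sm_target and ptx_isa_at_least are hypothetical, and I am deliberately not relying on how __PTX_SM__ encodes its value (check the compiler documentation for that); the only assumption is that the macros are defined when compiling for nvptx and undefined elsewhere.

```c
/* Hypothetical helper: the value of __PTX_SM__ when compiling for nvptx,
   or -1 when the macro is not defined (e.g. when compiling for the host).  */
int ptx_sm_target(void)
{
#ifdef __PTX_SM__
    return __PTX_SM__;
#else
    return -1;
#endif
}

/* Hypothetical helper: nonzero if the targeted PTX ISA version is at least
   MAJOR.MINOR; always 0 when the version macros are not defined.  */
int ptx_isa_at_least(int major, int minor)
{
#if defined(__PTX_ISA_VERSION_MAJOR__) && defined(__PTX_ISA_VERSION_MINOR__)
    return __PTX_ISA_VERSION_MAJOR__ > major
           || (__PTX_ISA_VERSION_MAJOR__ == major
               && __PTX_ISA_VERSION_MINOR__ >= minor);
#else
    (void) major;
    (void) minor;
    return 0;
#endif
}
```

Compiled for the host, both helpers take the fallback branch, so the same source builds unchanged on either side of an offloading build.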