On 19.11.20 12:03, Thomas Schwinge wrote:
As far as I can tell, we (GCC) don't currently state the minimum CUDA version supported: for nvptx target, and especially then for OpenACC/OpenMP nvptx offloading.
The nvptx target generates ".version 3.1" (= PTX ISA version) code with ".target sm_30" or ".target sm_35". According to https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes__ptx-release-history that means we need at least CUDA 5, which introduced PTX ISA 3.1. CUDA 5 also added sm_30/sm_35 support, i.e. support for the sm_30/sm_35 GPUs. [Hence, without bumping the .version, we cannot support newer sm_XX, and I do not see us adding support for sm_20 or other older hardware.]

In theory, newer CUDA versions still support the oldest GPUs down to sm_10. There is an (unofficial) list which maps sm_* to GPU names at https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/ Contrary to the official CUDA documentation, that page states that CUDA 10 dropped sm_20 support. And Tom's testing showed that CUDA 11.1 no longer supports sm_30, or at least its "ptxas" does not (which we use in JIT mode at program startup). That is why GCC 11 now defaults to sm_35. (Use -misa=sm_30 if you have an older card.)

* * *

In terms of supported features, moving to PTX ISA 6.3 (= CUDA 10) makes the most sense, as it adds a lot of new features, in particular '.alias'. There are a few new features in the ISA versions between 3.1 and 6.3, but they are not really worthwhile. (The .alias issue mostly pops up for C++ code with constructors or in special cases. There are some PRs about this.)
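To make the ".version" to CUDA mapping concrete, here is a minimal Python sketch (not part of GCC). It reads the ".version"/".target" directives from a PTX header and looks up the CUDA release that introduced that PTX ISA version. The table holds only the two pairs mentioned in this mail. The function names and the example header are my own assumptions for illustration.

```python
import re

# PTX ISA version -> CUDA release that introduced it.
# Only the two pairs named in this thread are listed; anything else
# would have to be added from Nvidia's PTX release history.
PTX_ISA_TO_MIN_CUDA = {
    (3, 1): "5.0",
    (6, 3): "10.0",
}


def parse_ptx_header(ptx_text):
    """Return ((major, minor), target) from the .version/.target directives."""
    version = re.search(r"^\s*\.version\s+(\d+)\.(\d+)", ptx_text, re.MULTILINE)
    target = re.search(r"^\s*\.target\s+(sm_\d+)", ptx_text, re.MULTILINE)
    if version is None or target is None:
        raise ValueError("missing .version or .target directive")
    return (int(version.group(1)), int(version.group(2))), target.group(1)


def min_cuda_for(ptx_text):
    """Return the minimum CUDA release for this PTX, or None if unknown."""
    isa, _target = parse_ptx_header(ptx_text)
    return PTX_ISA_TO_MIN_CUDA.get(isa)


# Hypothetical header, in the shape described above for the nvptx target.
example = """\
// illustrative PTX header
.version 3.1
.target sm_35
.address_size 64
"""

if __name__ == "__main__":
    print(parse_ptx_header(example), min_cuda_for(example))
```

A PTX ISA version missing from the table yields None rather than a guess, since this mail only establishes the 3.1 and 6.3 entries.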
Do distribution maintainers have an opinion on this maybe, what they need to support (for upcoming GCC 11, especially)? (Matthias, others?)
I think for license reasons, CUDA itself is not shipped by the distributions, but I might be wrong. In any case, the question is which CUDA versions are installed in practice and whether requiring an update (e.g. to CUDA 10+) is sensible and possible. CUDA 10 was released in September 2018 – thus, it is rather new.

On 19.11.20 12:18, Jakub Jelinek via Gcc wrote:
I think it would be nice not to kill support for older devices unnecessarily. If there is a strong reason, sure. And ditto for older CUDA versions. But I'm afraid many people have older HW and don't have anything newer. I'm testing offloading only from time to time, but the only remaining hw I have has 3.5 capability (Quadro K6000, GK110GL core). And I use that for both upstream development and distro offloading testing.
Going by the list above, compute capability 3.5 is still supported by CUDA 11.1; hence, CUDA 10's PTX ISA 6.3, including .alias, should still work with sm_30 (with CUDA < 11.1) and sm_35.

If we add support for PTX ISA 6.3 (i.e. CUDA >= 10) to use newer PTX features (both generic ones and, possibly, those for newer GPUs/sm_XX), the question is whether we still want to support the older PTX ISA/CUDA or not. If so, we need an additional -m* flag. In terms of user-friendliness, adding -misaversion={3.1,6.3} makes sense, followed by adding .alias support (with a warning/error for version 3.1) and choosing between shfl and shfl.sync depending on the version.

IMHO: Especially due to the .alias issue, I am very much in favor of supporting .version 6.3 in GCC 11. I am a bit unsure about adding -misaversion, but it probably makes sense given that CUDA 10 is still relatively new.

Continued: On 19.11.20 12:03, Thomas Schwinge wrote:
"CUDA" here primarily means libcuda (CUDA Driver API and corresponding Nvidia Linux kernel driver). A related topic then is which Nvidia GPU devices are supported by GCC. Due to the generic nature of PTX code generation as well as pretty generic hardware access via the CUDA Driver API in libgomp, this is primarily dictated by whichever Nvidia GPU devices are supported by libcuda, which GCC doesn't influence. (A related topic also is PR96005 "Add possibility to use newer ptx isa", etc., but that's for a separate discussion thread.)
Well, if we do not update the PTX ISA version, we are stuck with PTX ISA 3.1 features and the PTX for sm_30 and sm_35. Those will work with newer hardware but prevent us from supporting the newer hardware better. Thus, I regard the newer PTX ISA question as tightly coupled with the questions raised in this thread, and I do not think that discussing it in another thread makes sense.
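To illustrate how the -misaversion={3.1,6.3} idea from earlier in this mail could gate features, here is a rough Python sketch of the decision logic (not actual back-end code). It relies on shfl.sync having been introduced with PTX ISA 6.0 and .alias with PTX ISA 6.3. All function names are hypothetical, and -misaversion itself is only a proposal at this point.

```python
# Minimum PTX ISA version that provides each feature.
FEATURE_MIN_ISA = {
    "shfl.sync": (6, 0),
    ".alias": (6, 3),
}

# Values proposed in this thread for the hypothetical -misaversion flag.
SUPPORTED_ISA_VERSIONS = {"3.1": (3, 1), "6.3": (6, 3)}


def parse_isa_version(flag_value):
    """Map the flag's string value to a (major, minor) tuple."""
    try:
        return SUPPORTED_ISA_VERSIONS[flag_value]
    except KeyError:
        raise ValueError("unsupported PTX ISA version: " + flag_value)


def has_feature(isa, feature):
    """True if the selected PTX ISA version provides the feature."""
    return isa >= FEATURE_MIN_ISA[feature]


def shuffle_mnemonic(isa):
    """Pick shfl.sync where available, plain shfl otherwise."""
    return "shfl.sync" if has_feature(isa, "shfl.sync") else "shfl"


def emit_alias(isa, alias, aliasee):
    """Emit an .alias directive, or fail for ISA versions that lack it."""
    if not has_feature(isa, ".alias"):
        raise NotImplementedError(
            ".alias requires PTX ISA 6.3, selected %d.%d" % isa)
    return ".alias %s, %s;" % (alias, aliasee)


if __name__ == "__main__":
    for value in ("3.1", "6.3"):
        print(value, shuffle_mnemonic(parse_isa_version(value)))
```

With 3.1 selected, the sketch falls back to plain shfl and rejects .alias, which corresponds to the warning/error case mentioned above.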
I can confirm that as of present, things are still working fine with many-years-old CUDA 6.5/libcuda 346.46 (2014), with Nvidia Tesla K40c GPU, for example. Until recently (but not anymore), I've been running testing with a Dell Precision M4700 laptop's Nvidia Quadro K1000M GPU (GK107 core). If I remember correctly, that one supports Compute Capability 3.0, which Tom recently removed (default) support for via PR97348 "[nvptx] Make -misa=sm_35 the default". For avoidance of doubt: that's OK as far as I'm concerned; not using this laptop anymore for GCC testing, and, as has been noted, 3.5 now is the minimum version with CUDA 11 anyway.
(I wonder whether that's 11.0 or 11.1.)
(CUDA 11 still supports, but does deprecate, compute capabilities 3.5, 3.7, and 5.0, by the way.)
[...]
(CUDA 8.0 was released in 2016.)
[...]
(2014 CUDA 6.5)
Tobias