On 19.11.20 12:03, Thomas Schwinge wrote:

As far as I can tell, we (GCC) don't currently state the minimum CUDA
version supported: for nvptx target, and especially then for
OpenACC/OpenMP nvptx offloading.

The nvptx target generates ".version 3.1" (= PTX ISA version) code
with ".target sm_30" and ".target sm_35".
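
For illustration, the two directives can be read back from generated PTX text with a few lines of Python. This is only a sketch: the sample header below is hand-written rather than actual compiler output, and the helper name parse_ptx_preamble is made up for this example.

```python
import re

def parse_ptx_preamble(ptx_text):
    """Return (version, target) found in PTX text, e.g. ('3.1', 'sm_35')."""
    version = re.search(r"^\s*\.version\s+(\d+\.\d+)", ptx_text, re.MULTILINE)
    target = re.search(r"^\s*\.target\s+(sm_\d+)", ptx_text, re.MULTILINE)
    return (version.group(1) if version else None,
            target.group(1) if target else None)

# Hand-written sample header (assumption), not actual GCC output.
SAMPLE = """\
.version 3.1
.target sm_35
.address_size 64
"""

print(parse_ptx_preamble(SAMPLE))
```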

If one looks at
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes__ptx-release-history
that means that we need at least CUDA 5 (which introduced PTX ISA 3.1).
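
The lookup can be sketched as a small table. Only the two pairs mentioned in this thread are filled in; anything else should be taken from NVIDIA's release-history table rather than guessed. The names PTX_ISA_TO_MIN_CUDA and min_cuda_for_ptx are made up for this example.

```python
# Only the two pairs named in this thread; see NVIDIA's PTX release
# history for the complete table.
PTX_ISA_TO_MIN_CUDA = {
    (3, 1): (5, 0),    # what the nvptx target emits today
    (6, 3): (10, 0),   # first ISA version with '.alias'
}

def min_cuda_for_ptx(isa_version):
    """Return the first CUDA release (major, minor) that accepts isa_version."""
    try:
        return PTX_ISA_TO_MIN_CUDA[isa_version]
    except KeyError:
        raise ValueError("PTX ISA %d.%d is not in the table" % isa_version)

print(min_cuda_for_ptx((3, 1)))
```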

CUDA 5 also added sm_30/sm_35 support, i.e. support for the sm_30/sm_35 GPUs.

[Hence, without bumping the .version, we cannot support newer sm_XX
and I do not see us adding support for sm_20 or other older hardware.]


In theory, newer CUDA versions still support the oldest GPUs down to sm_10.

There is an (unofficial) list which maps sm_* to GPU names at
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/

Contrary to the official CUDA documentation, that page states that
sm_20 support was dropped in CUDA 10.

And Tom's testing showed that CUDA 11.1 no longer supports sm_30
– at least its "ptxas" (which we use in JIT mode at program startup).
That is the reason that GCC 11 now defaults to sm_35.
(Use -misa=sm_30 if you have an older card.)
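
As a rough sketch of that choice, assuming the device's compute capability is already known: the helper name is made up, and the two returned values are the only ones the nvptx target currently generates code for.

```python
def misa_for_compute_capability(major, minor):
    """Pick a -misa= value for a device, or None if nvptx cannot target it."""
    cc = (major, minor)
    if cc >= (3, 5):
        return "sm_35"   # GCC 11 default; the PTX also runs on newer GPUs
    if cc >= (3, 0):
        return "sm_30"   # must be requested explicitly, and needs a ptxas
                         # that still accepts sm_30
    return None          # older than sm_30: not handled by the nvptx target

print(misa_for_compute_capability(3, 0))
```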

 * * *

In terms of supported features, moving to PTX ISA 6.3 (= CUDA 10)
makes most sense as that adds a lot of new features, in particular
'.alias'. There are a few new features between ISA 3.1 and 6.3,
but they are not really worthwhile.

(The .alias issue mostly pops up for C++ code with constructors
or in special cases. There are some PRs about this.)

Do distribution maintainers have an opinion on this maybe, what they need
to support (for upcoming GCC 11, especially)?  (Matthias, others?)

I think for license reasons, CUDA itself is not shipped by the
distributions but I might be wrong.
In any case, the question is which CUDA versions are installed in
practice and whether requiring an update (e.g. to CUDA 10+) is
sensible and possible.

CUDA 10 was released in September 2018 – thus, it is rather new.

On 19.11.20 12:18, Jakub Jelinek via Gcc wrote:
I think it would be nice not to kill support for older devices
unnecessarily.  If there is a strong reason, sure.  And ditto for older
CUDA versions.
But I'm afraid many people have older HW and don't have anything newer.

I'm testing offloading only from time to time, but the only remaining hw
I have has 3.5 capability (Quadro K6000, GK110GL core).  And I use that for
both upstream development and distro offloading testing.

If I look at the list above, 3.5 is still supported by CUDA 11.1;
hence, CUDA 10's PTX ISA 6.3 including .alias should still work with
sm_30 (w/ CUDA < 11.1) and sm_35.

If we add support for PTX ISA 6.3 (i.e. CUDA >= 10) to use newer
PTX features (both generic and, possibly, those for newer GPUs/sm_XX),
the question is whether we still want to support older PTX ISA/CUDA
or not.

If so, we need an additional -m* flag.

In terms of user-friendliness, adding -misaversion={3.1,6.3} makes
sense – and then adding .alias support (with warning/error with version 3.1)
plus selecting shfl vs. shfl.sync depending on the version.
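
A minimal sketch of how such a flag could drive code generation. Note that -misaversion is only the proposal made here, not an existing GCC option, and both function names are made up. The thresholds are taken from the PTX ISA documentation: shfl.sync exists since ISA 6.0 and .alias since ISA 6.3.

```python
def parse_misaversion(value):
    """Parse the proposed (hypothetical) -misaversion= argument."""
    if value not in ("3.1", "6.3"):
        raise ValueError("unsupported PTX ISA version: " + value)
    major, minor = value.split(".")
    return (int(major), int(minor))

def ptx_features(isa_version):
    """Map a PTX ISA version tuple to the code-generation choices above."""
    return {
        # shfl.sync exists since PTX ISA 6.0; plain shfl is the older form.
        "shuffle": "shfl.sync" if isa_version >= (6, 0) else "shfl",
        # .alias was introduced with PTX ISA 6.3; with 3.1 the compiler
        # would have to warn or error out instead.
        "alias": isa_version >= (6, 3),
    }

print(ptx_features(parse_misaversion("6.3")))
```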


IMHO:

Especially due to the .alias issue, I am very much in favor of supporting
.version 6.3 in GCC 11; about adding -misaversion I am a bit unsure, but
it probably makes sense given that CUDA 10 is still relatively new.


Continued: On 19.11.20 12:03, Thomas Schwinge wrote:

"CUDA" here primarily means libcuda (CUDA Driver API and corresponding
Nvidia Linux kernel driver).
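
To see which CUDA version an installed libcuda corresponds to, one can call cuDriverGetVersion from the CUDA Driver API. The following is only a sketch: the library name "libcuda.so.1" is an assumption about a typical Linux installation, and the helper names are made up.

```python
import ctypes

def decode_cuda_version(raw):
    """Split the integer from cuDriverGetVersion, e.g. 11010 -> (11, 1)."""
    return raw // 1000, (raw % 1000) // 10

def driver_cuda_version(libname="libcuda.so.1"):
    """Return the (major, minor) CUDA version of the driver, or None."""
    try:
        libcuda = ctypes.CDLL(libname)   # assumed library name on Linux
    except OSError:
        return None                      # library not found
    raw = ctypes.c_int(0)
    # CUresult cuDriverGetVersion(int *driverVersion); 0 is CUDA_SUCCESS.
    if libcuda.cuDriverGetVersion(ctypes.byref(raw)) != 0:
        return None
    return decode_cuda_version(raw.value)

print(driver_cuda_version())
```

On a machine without the Nvidia driver this simply prints None.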

A related topic then is which Nvidia GPU devices are supported by GCC.
Due to the generic nature of PTX code generation as well as pretty
generic hardware access via the CUDA Driver API in libgomp, this is
primarily dictated by whichever Nvidia GPU devices are supported by
libcuda, which GCC doesn't influence.

(A related topic also is PR96005 "Add possibility to use newer ptx isa",
etc., but that's for a separate discussion thread.)

Well, if we do not update the PTX ISA version, we are stuck with PTX ISA 3.1
features and the PTX for sm_30 and sm_35. That code will run on newer
hardware, but it prevents us from supporting the newer hardware better.

Thus, I regard the newer ptx isa question as tightly coupled with the
questions raised in this thread – and do not think that discussing it
in another thread makes sense.

I can confirm that as of present, things are still working fine with
many-years-old CUDA 6.5/libcuda 346.46 (2014), with Nvidia Tesla K40c
GPU, for example.

Until recently (but not anymore), I've been running testing with a Dell
Precision M4700 laptop's Nvidia Quadro K1000M GPU (GK107 core).  If I
remember correctly, that one supports Compute Capability 3.0, which Tom
recently removed (default) support for via PR97348 "[nvptx] Make
-misa=sm_35 the default".  For avoidance of doubt: that's OK as far as
I'm concerned; not using this laptop anymore for GCC testing, and, as has
been noted, 3.5 now is the minimum version with CUDA 11 anyway.
(I wonder whether that's 11.0 or 11.1.)

(CUDA 11 still supports, but now deprecates, compute capabilities 3.5,
3.7, and 5.0, by the way.)
[...](CUDA 8.0 has been released in 2016.)
[...](2014 CUDA 6.5)

Tobias

