On 5/12/21 5:50 PM, Tobias Burnus wrote:
> Hi,
>
> On 12.05.21 16:10, Tom de Vries wrote:
>> Add nvptx option -mptx that sets the ptx ISA version. This is currently
>> hardcoded to 3.1.
>> Tested libgomp on x86_64-linux with nvptx accelerator, both with
>> default set to
>> 3.1 and 6.3.
>> Any comments?
>
> :-)
>
> ISA 3.1 = CUDA 5 (supporting sm_10 to sm_{30,35}
> ISA 6.3 = CUDA 10.0 (supporting sm_10 to sm_{70,72,75}
>
> I think it is useful – both to move to new -misa (beyond
> sm_30 and sm_35) which require a newer ISA for some
> features.

Yes, I expect that we want to add sm_70 to take advantage of cas.b16.

> But a lot of new features (like .alias) are
> generic and also very useful.
>
> It also permits to use the .alias feature for
> PR 97102 (see attached patch).
>

Ack.

> There is one typo in the doc:
> 'The default PTX version is sm_3.1.'
> There is a spurious 'sm_'.
>

Fixed, thanks.

> Should there be a fixme/missed optimization
> comment regarding the lane mask for the
> .sync variants? Or is 0xffffffff fine for the
> foreseeable future and the comment is not needed?
>

I think it's fine like this.

> * * *
>
> The other question is how to move forward from there,
> i.e. when to move requiring CUDA 10+ (6.3) by default,
> permitting -mptx=3.1 only as legacy mode?
>

I filed today PR Bug 100565 - "[nvptx] Need configure options for misa
default".

So, my thinking is that once we can set -misa and -mptx defaults using
configure options, changing the default in the source code should have
less of an impact.

Anyway, I think a reasonable way of dealing with this is to follow the
latest stable CUDA release: if that stops supporting something, the
default should move on to accommodate for that.

> And how to test this best in the testsuite? Namely,
> should we iterate through both ISA modes? Or specify
> manually in some tests? Just test the default regularily?
>

I would probably go for some default config that works well for my
hardware and driver and test that, and then once in a while test other
configurations.

> How to handle sm_xx which are not supported by the
> default/specified -misa=sm_...? (Error out?)

I think we're ok for the current matrix. I guess with sm_70 that'll be
different. I'd say the solution there will be dictated by what the
error mode will look like otherwise.

> And when/whether to move to a higher sm_... value by default?
>
> (I have not checked but it seems as sm_70+ is the largest
> step but sm_70 not yet widely used; hence, sticking to
> sm_35 for a while is probably fine.)
>

sm_35 is still supported by cuda 11.3, I'm hoping the same for the next
one (11.4 IIUC).

Thanks,
- Tom