On 09/05/2018 12:19 AM, Cesar Philippidis wrote: > On 09/02/2018 07:57 AM, Cesar Philippidis wrote: >> On 09/01/2018 12:04 PM, Tom de Vries wrote: >>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote: >> >>>> Is this patch OK for trunk? >>>> >>> >>> Well, how did you test this ( >>> https://gcc.gnu.org/contribute.html#patches : "Bootstrapping and >>> testing. State the host and target combinations you used to do proper >>> testing as described above, and the results of your testing.") ? >> >> I tested the standalone nvptx compiler. I'll retest with libgomp with >> -misa=sm_35. Bootstrapping won't help much here, unfortunately. >>>> +++ b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c >>>> @@ -0,0 +1,24 @@ >>>> +/* Test the nvptx atomic instructions for __atomic_fetch_OP for SM_35 >>>> + targets. */ >>>> + >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O2 -misa=sm_35" } */ >>>> + >>>> +int >>>> +main() >>>> +{ >>>> + unsigned long long a = ~0; >>>> + unsigned b = 0xa; >>>> + >>>> + __atomic_fetch_add (&a, b, 0); >>>> + __atomic_fetch_and (&a, b, 0); >>>> + __atomic_fetch_or (&a, b, 0); >>>> + __atomic_fetch_xor (&a, b, 0); >>>> + >>>> + return a; >>>> +} >>>> + >>>> +/* { dg-final { scan-assembler "atom.add.u64" } } */ >>>> +/* { dg-final { scan-assembler "atom.b64.and" } } */ >>>> +/* { dg-final { scan-assembler "atom.b64.or" } } */ >>>> +/* { dg-final { scan-assembler "atom.b64.xor" } } */ >>>> -- 2.17.1 >>>> >>> >>> Hmm, the add.u64 vs b64.and looks odd (and the scan-assembler-not >>> testcase does not use this difference, so that needs to be fixed, or for >>> bonus points, changed into a scan-assembler testcase). >>> >>> The documentation uses "op.type", we should fix the compiler to emit >>> that consistently. Separate patch that fixes that pre-approved. >> >> ACK. I think there are a lot of other cases like that in the BE. >> >>> This is ok (with, as I mentioned above, the SI part split off into a >>> separate patch), on the condition that you test libgomp with >>> -foffload=-misa=sm_35. > > Adding -foffload=misa=sm_35 didn't work because the host gcc doesn't > support the -misa flag.
That doesn't make sense to me. For me this works without any problems. Have you tried a clean build? > When I forced the nvptx BE to set TARGET_SM35 to > always be true, I ran into problems with SM_30 code linking against > SM_35 code. I also cannot reproduce this, works for me. > Therefore, I don't think this patch is ready for trunk yet. >> By the way, is -misa really necessary for atomic_fetch_<logic><mode>? > Looking at the PTX documentation I see > <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#changes-in-ptx-isa-version-3-1>: > > PTX ISA version 3.1 introduces the following new features: > > * Support for sm_35 target architecture. > * Extends atomic and reduction instructions to perform 64-bit {and, or, > xor} operations, and 64-bit integer {min, max} operations. > > Is there a table for which list which GPUs are compatible with which > instructions? Yes, every instruction has a table in the ptx manual, and there's a "PTX ISA Notes" entry. For the atom instruction in ptx isa 3.1 manual, we have "PTX ISA Notes": ... atom.global requires sm_11 or higher. atom.shared requires sm_12 or higher. 64-bit atom.global.{add,cas,exch} require sm_12 or higher. 64-bit atom.shared.{add,cas,exch} require sm_20 or higher. 64-bit atom.{and,or.xor,min,max} require sm_35 or higher. atom.add.f32 requires sm_20 or higher. Use of generic addressing requires sm_20 or higher. ... Thanks, - Tom