tra added inline comments.
================ Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:1057 + +BUILTIN(__nvvm_atom_xchg_global_i, "iiD*i", "n") +TARGET_BUILTIN(__nvvm_atom_cta_xchg_global_i, "iiD*i", "n", SM_60) ---------------- t4c1 wrote: > tra wrote: > > Naghasan wrote: > > > tra wrote: > > > > t4c1 wrote: > > > > > tra wrote: > > > > > > t4c1 wrote: > > > > > > > tra wrote: > > > > > > > > We need to figure out how address-space-specific builtins are > > > > > > > > supposed to work. > > > > > > > > Right now two competing approaches. > > > > > > > > > > > > > > > > This patch declares builtins with generic pointer as an > > > > > > > > argument, but, according to the test, expects to be used with > > > > > > > > the AS-specific pointer. > > > > > > > > It probably does not catch a wrong-AS pointer passed as an > > > > > > > > argument, either. > > > > > > > > It does happen to work, but I think it's mostly due to the fact > > > > > > > > that LLVM intrinsics are overloaded and we happen to end up > > > > > > > > addrspacecast'ing things correctly if the builtin gets the > > > > > > > > right kind of pointer. > > > > > > > > > > > > > > > > The other approach is to declare the pointer with the expected > > > > > > > > AS. E.g: > > > > > > > > > TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", > > > > > > > > > AND(SM_80,PTX70)) > > > > > > > > > > > > > > > > IMO, this is the correct way to do it, but it is also rather > > > > > > > > inconvenient to use from CUDA code as it operates on generic > > > > > > > > pointers. > > > > > > > > > > > > > > > > @jdoerfert - WDYT? > > > > > > > >TARGET_BUILTIN(__nvvm_mbarrier_init_shared, "vWi*3i", "", > > > > > > > >AND(SM_80,PTX70)) > > > > > > > >IMO, this is the correct way to do it, but it is also rather > > > > > > > >inconvenient to use from CUDA code as it operates on generic > > > > > > > >pointers. > > > > > > > > > > > > > > I tried doing this, however it is also completely unusable from > > > > > > > OpenCL code (which is our use case). Trying to use it gives you > > > > > > > errors about how casting pointers to different address spaces - > > > > > > > for example from local to AS3 is not allowed. > > > > > > Hmm. It should've worked. It would need the same explicit cast to a > > > > > > pointer in AS(3) as in your tests, but it would safeguard against > > > > > > attempts to pass it a generic pointer. E.g. > > > > > > https://godbolt.org/z/qE6oxzheM > > > > > > > > > > > > Explicit casting to AS(X) for AS-specific variants is annoying, but > > > > > > it's probably unavoidable at the moment regardless of whether we > > > > > > declare the pointer argument to be AS-specific or not. LLVM will > > > > > > not always be able to infer that a pointer is in particular AS. > > > > > > Using specific AS in the declaration has a minor benefit of > > > > > > safeguarding at compile time against unintentional use of generic > > > > > > pointers. > > > > > > > > > > > > Ideally we may want to convert generic variant of the builtin to > > > > > > AS-specific one, if LLVM does know the AS. We currently do this for > > > > > > loads/stores, but not for other instructions. > > > > > > > > > > > Well, it does not work. See: https://godbolt.org/z/vM6bKncc4. > > > > > Declaring the pointer to be in generic AS is the only way I got it to > > > > > work. If you know a way to call a builtin declared with AS numbers in > > > > > OpenCL, I am happy to do that instead. > > > > Could you help me understand how different address spaces are handled > > > > by OpenCL in clang and LLVM? > > > > What exactly are `__local` or `__private` in OpenCL? Do they map to > > > > LLVM's address spaces? > > > > Clang does complain that we're trying to change AS, but I do not see AS > > > > used in the IR: https://godbolt.org/z/soajoE991 > > > > > > > > AFAICT what happens is: > > > > * OpenCL does use explicit AS for pointers (`__local`, `__private` seem > > > > to pop up in the error messages) > > > > * `__attribute__((address_space(3)))` does not match the AS of > > > > `__local` and that leads to an error. > > > > * generic pointer argument works because we are allowed to cast any > > > > specific AS to generic one. > > > > > > > > In other words, having specific-AS arguemt works as intended, we just > > > > have a mismatch between AS number used by OpenCL and AS number used by > > > > NVPTX and we're not allowed to cast between two specific AS. > > > > > > > > If that's indeed the case, then what we appear to be missing is the > > > > mapping from OpenCL AS to the target-specific AS, which should, > > > > ideally, be done by clang itself. So, if NVPTX-specific builtin > > > > operating on shared pointer is called with a pointer in OpenCL's > > > > equivalent of CUDA's `__shared__`, it should be mapped to AS(3). > > > > > > > > Could you help me understand how different address spaces are handled > > > > by OpenCL in clang and LLVM? > > > > > > clang makes a strong distinction between language and target address > > > spaces since ~3.6 (was more loose before). Each target in clang defines a > > > map between language and target address space (e.g. > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/SPIR.h#L25). > > > The map is used in clang codegen to lower `__local` to the right target > > > address space. > > > > > > > Clang does complain that we're trying to change AS, but I do not see AS > > > > used in the IR: https://godbolt.org/z/soajoE991 > > > > > > Same code but targeting SPIR: https://godbolt.org/z/4E3brzodq (ARM maps > > > all languages address spaces to 0) > > > > > > > having specific-AS arguemt works as intended, we just have a mismatch > > > > between AS number used by OpenCL and AS number used by NVPTX and we're > > > > not allowed to cast between two specific AS > > > > > > OpenCL languages do map to nvptx target address space, it is just form > > > clang's semantic a language asp is strongly distinct from target ones > > > (lang asp simply isn't understood as a number). So you can't mix and > > > match without an explicit cast, even if the lang asp eventually lowers to > > > that target. > > > > > > What is missing is the ability for the target builtin to accept a > > > language asp if it lowers to the target asp. > > > > > What is missing is the ability for the target builtin to accept a > > > language asp if it lowers to the target asp. > > > > Agreed. I think this should be a sensible thing to support. > > Once it's fixed you should be able to both use AS-specific builtins w/o > > explicit pointer casts and have error checks for the wrong pointer types. > > > Ok, I will look into that, but it might take some time. Thank you for looking into this. These patches will help a lot to fill in the gap we currently have in support for atomics on GPU. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112718/new/ https://reviews.llvm.org/D112718 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits