jhuber6 wrote:

> > I'm assuming you're talking about GPU-side constructors? I don't think the 
> > CUDA runtime supports those, but OpenMP runs them when the image is loaded, 
> > so it would handle both independently.
> 
> Yes. I'm thinking of the expectations from a C++ user standpoint, and this is 
> one of the areas where there will be observable differences. First, because 
> there will be subsets of the code that are no longer part of the main 
> GPU-side executable. Second, the side effects of the initializers will be 
> different depending on whether we do link such subsets separately or not. 
> E.g. the initializer call order will change. The global state changes in one 
> subset will not be visible in the other. Weak symbol resolution will produce 
> different results. Etc.

It'll definitely behave differently from a full link, but the idea is that 
this is the desired behaviour if someone goes out of their way to request 
this kind of GPU subset linking.
> 
> > The idea is that users already get C++-like behavior with the new driver 
> > and -fgpu-rdc generally
> 
> Yes. That will set the default expectations that things work just like in 
> C++, which is a great thing. But the introduction of partial subset linking 
> will break some of those "just works" assumptions, and it may be triggered by 
> parts of the build outside of the user's control (e.g. by a third-party 
> library).

This was one of the things I was wondering about, since we could alternatively 
make a new flag for this, separate from `-r`, so it's explicit. Right now I 
just assumed that passing `-r` through the offloading toolchain (via CUDA or 
whatever) was explicit enough, since anyone who wants regular `-r` behaviour 
can just use `clang` or `ld` normally.

> 
> Side note: we do need a good term for this kind of subset linking. "partial 
> linking" already has established meaning and it's not a good fit here as we 
> actually produce a fully linked GPU executable.
> 

Yeah, coming up with a name is difficult. You could just call it device 
linking, since it's more or less just doing the device link step ahead of time 
instead of deferring it until we make the final executable.

> We do need to document how it works. Documenting what does not work, or works 
> differently is also important, IMO. We _do_ need to worry about users and 
> their expectations.

Yes, I should probably update this with some documentation. I'm not sure where 
it would go, though; maybe just on the `clang-linker-wrapper` page.

https://github.com/llvm/llvm-project/pull/80066