On Thu, Dec 02, 2021 at 12:36:30PM +0000, Andrew Stubbs wrote:
> On 30/11/2021 16:54, Jakub Jelinek wrote:
> > > Why does the GCN plugin or runtime need to know those vars?
> > > It needs to know the single array that contains their addresses of 
> > > course...
> 
> With older LLVM there were issues with relocations that made it impossible
> to link the offload_var_table. This is why mkoffload deletes it. I've
> not tried it again recently, so it's possible we could completely rework the
> way these are processed in the plugin, but that's the hard option.
> 
> What it currently does is a symbol lookup for each named variable listed in
> the C wrapper used to embed the kernel. The lookup provided by the AMD
> runtime ignores symbols that are not exported, even if they are present in
> the ELF.

Would be nice to know what the relocation issue is or was, offload_var_table
shouldn't be different from other arrays containing pointers to static vars,
no?
If you delete it and have to do the lookups in the plugin, I understand that
then they need to be public...
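
To illustrate the point about arrays of pointers to static vars, here is a
minimal sketch. All names (var_a, var_b, var_entry, var_table and the two
helpers) are hypothetical, and the address-plus-size entry layout is an
assumption made for illustration, not a copy of what mkoffload or the GCN
backend actually emits. It only shows that a single exported table can reach
static variables without any by-name lookup:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical offloaded variables.  They are static, so a by-name
   lookup that only sees exported symbols would not find them.  */
static int var_a = 1;
static double var_b = 2.0;

/* Hypothetical entry layout (address plus size); an assumption for
   illustration, not taken from real mkoffload output.  */
struct var_entry
{
  void *addr;
  size_t size;
};

/* The one symbol a loader would need to find.  The initializers are
   ordinary address constants, resolved by the usual data relocations.  */
struct var_entry var_table[] = {
  { &var_a, sizeof var_a },
  { &var_b, sizeof var_b },
};

/* Number of entries in the table.  */
size_t
var_table_count (void)
{
  return sizeof var_table / sizeof var_table[0];
}

/* Bounds-checked access to one entry.  */
const struct var_entry *
var_table_entry (size_t i)
{
  assert (i < var_table_count ());
  return &var_table[i];
}
```

With a table like this, only var_table itself has to be visible to the
lookup; var_a and var_b can stay static.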

> The plugin loads each image file as an independent "executable". If there
> are multiple images then there *will be* duplicate symbols (e.g.
> "init_array") but this is not a problem because they're in a different
> context.
> 
> If there's a problem with duplicate symbols *within* a given image then we
> have a bigger problem because offload_var_table is referring to them by
> name. As you say, I presume the LTO stream-in is fixing up such conflicts.

Ah, ok.

> I've tried modifying offload_handle_link_vars but that spot doesn't catch
> the omp_data_sizes variables emitted by libgomp.c-c++-common/target_42.c,
> which was one of the motivating examples.

Why doesn't it catch it?  Is the variable created only post-IPA?
I'd think that it should have been created before IPA, streamed and
therefore I don't understand why you don't see it after streaming LTO in.

> It is true that my current placement visits all the symbols for every
> function, meaning that they are adjusted in an earlier iteration of a pass
> than you might expect. I couldn't find a single place that fixed this
> problem only in the amdgcn compiler and wasn't too late.
> 
> Do you have a suggestion how to not do this for other GPU targets? We can
> add another hook or macro, of course ....

Certainly a target hook.  But I'd really like to understand why you don't
see those earlier.

        Jakub
