On Fri, Nov 29, 2013 at 01:17:56PM +0100, Bernd Schmidt wrote:
> We've been working on similar patches for our OpenACC project. The goal
> is to have functions generated during omp-low that will ultimately
> execute on a ptx target, write them out using LTO infrastructure and
> read them back in using a nvptx-none lto1.

Please see the past threads about this topic, e.g. the "Questions about LTO
infrastructure and pragma omp target" thread on the gcc ml from August till
now, and also "Offloading Support in libgomp" and "Target compilation for
offloading".

It certainly doesn't make sense to invent different infrastructures for
OpenMP offloading and for OpenACC offloading; after all, the current OpenACC
code on gomp-4_0-branch is, I think, meant to use the libgomp APIs.

> 4. We add a vector of target machines to the compiler. Normally this is
> just initialized to the single machine for which the compiler is
> configured, but when e.g. OpenACC with an accelerator is enabled, the
> accelerator machine is added to that list. It should cope fine with
> multiple different accelerator devices.

This was discussed: it would be nice to allow users to choose during linking
(or already during compilation?) which offloading targets code should be
compiled for, and to have a mechanism that uses the original non-target
specific options plus a way to override those for the offloading target.

> 5. There's a new DECL_TARGET which refers to this list of target
> machines. It's set when creating a child function from e.g. "#pragma acc
> parallel"
> 6. ipa_write_summaries iterates over DECL_TARGET machines to write out

Right now on gomp-4_0-branch a special attribute on the decls (VAR_DECL as
well as FUNCTION_DECL) is used for these, but if there are spare bits,
something else could be used instead.

> LTO for each of them. LTO sections for a different target get a separate
> prefix encoding the machine name, e.g. ".gnu.tlto_nvptx_...".

As you want to dump the GIMPLE IL right out of (roughly) the IPA stage, it
should in theory be target independent, so it is undesirable to emit it
several times, once for each offloading target. Instead, just stream it once
and let the link step decide which targets to support.

	Jakub