Questions about LTO infrastructure and pragma omp target

Ilya Verbin Thu, 15 Aug 2013 06:38:01 -0700

Hi All,

I'm trying to figure out how the LTO infrastructure works at a high level,
and I want to make sure that I understand it correctly.  Could you please
help me with that?


1.  Execution flow.  As far as I understand, there are 2 modes of
operation: with and without the LTO plugin.  Below are the execution
flows for each mode.

Without the LTO plugin:

gcc -flto      # Call GCC driver
 |_ cc1        # Compile first source file into asm + intermediate language
 |_ as         # Assemble this asm + IL into a temporary object file
 |_ ...        # Compile and assemble all remaining source files
 |_ collect2   # Call linker driver
     |_ lto-wrapper    # Call lto-wrapper directly from collect2
     |   |_ gcc        # Driver
     |   |   |_ lto1   # Perform WPA and split into partitions
     |   |_ gcc        # Driver
     |   |   |_ lto1   # Perform LTRANS for the first partition
     |   |   |_ as     # Assemble this partition into final object file
     |   |_ ...        # Perform LTRANS for each partition
     |_ collect-ld     # Simple wrapper over ld
         |_ ld         # Perform linking

With the LTO plugin:

gcc -flto      # Call GCC driver
 |_ cc1        # Compile first source file into asm + intermediate language
 |_ as         # Assemble this asm + IL into a temporary object file
 |_ ...        # Compile and assemble all remaining source files
 |_ collect2   # Call linker driver
     |_ collect-ld   # Simple wrapper over ld
         |_ ld with liblto_plugin.so   # Perform LTO and linking
             |_ lto-wrapper    # Is called from liblto_plugin.so
                 |_ gcc        # Driver
                 |   |_ lto1   # Perform WPA and split into partitions
                 |_ gcc        # Driver
                 |   |_ lto1   # Perform LTRANS for the first partition
                 |   |_ as     # Assemble this partition into final object file
                 |_ ...        # Perform LTRANS for each partition

Are they correct?

2.  The second question is about implementing #pragma omp target.
I'm going to reuse the LTO approach in a prototype that will produce 2
binaries, one for the host architecture and one for the target.  The
target binary will contain the functions outlined from omp target regions
and some infrastructure to run them.
To produce 2 binaries we need to run gcc and ld twice.  On the first run,
gcc will generate an object file that contains optimized code for the host
and GIMPLE for the target.  On the second run, gcc will read the GIMPLE and
generate optimized code for the target.
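
To make "functions outlined from omp target regions" concrete, here is a
minimal sketch in C.  The first function is ordinary user source.  The
second is a hand-written stand-in for what an outlined function might look
like.  The names vec_add_args, vec_add_target_fn and
outlined_matches_source are made up for illustration only; the real
outlined function and its argument marshalling would be generated by the
compiler and may well look different.

```c
#include <assert.h>
#include <stddef.h>

/* User source.  With OpenMP offloading enabled, the loop is the omp target
   region.  Without -fopenmp the pragma is ignored and the loop simply runs
   on the host. */
void vec_add(const int *a, const int *b, int *c, int n)
{
    #pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* HYPOTHETICAL argument block: one field per variable used in the region.
   This layout is an assumption, not what GCC actually emits. */
struct vec_add_args {
    const int *a;
    const int *b;
    int *c;
    int n;
};

/* HYPOTHETICAL stand-in for the outlined function.  In the prototype, this
   is the kind of function whose GIMPLE would be kept in the object file
   and compiled for the target on the second run of gcc. */
void vec_add_target_fn(void *data)
{
    struct vec_add_args *args = data;
    assert(args != NULL);
    for (int i = 0; i < args->n; i++)
        args->c[i] = args->a[i] + args->b[i];
}

/* Returns 1 if the user source and the outlined stand-in agree. */
int outlined_matches_source(void)
{
    int a[4] = {1, 2, 3, 4};
    int b[4] = {10, 20, 30, 40};
    int c_src[4] = {0};
    int c_out[4] = {0};
    struct vec_add_args args = { a, b, c_out, 4 };

    vec_add(a, b, c_src, 4);      /* region as written by the user */
    vec_add_target_fn(&args);     /* hand-outlined equivalent */

    for (int i = 0; i < 4; i++)
        if (c_src[i] != c_out[i] || c_src[i] != a[i] + b[i])
            return 0;
    return 1;
}
```

Calling outlined_matches_source() from main should return 1 when built
without -fopenmp, since both versions then run the same loop on the host.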

So the question is: what is the right place for the second run of gcc
and ld?  Should I insert them into liblto_plugin.so?  Or should I create
an entirely new plugin that only calls gcc and ld for the target, without
performing any LTO optimizations for the host?
Suggestions?

----
Thanks,
Ilya Verbin,
Software Engineer
Intel Corporation
