Hi All,

I'm trying to figure out how the LTO infrastructure works at a high level, and I want to make sure that I understand it correctly. Could you please help me with that?

1. Execution flow. As far as I understand, there are two modes of operation: with and without the LTO plugin. Below are the execution flows for each mode.

Without LTO plugin:

gcc -flto               # Call GCC driver
|_ cc1                  # Compile first source file into asm + intermediate language
|_ as                   # Assemble these asm + IL into temporary object file
|_ ...                  # Compile and assemble all remaining source files
|_ collect2             # Call linker driver
   |_ lto-wrapper       # Call lto-wrapper directly from collect2
   |  |_ gcc            # Driver
   |  |  |_ lto1        # Perform WPA and split into partitions
   |  |_ gcc            # Driver
   |  |  |_ lto1        # Perform LTRANS for the first partition
   |  |  |_ as          # Assemble this partition into final object file
   |  |_ ...            # Perform LTRANS for each partition
   |_ collect-ld        # Simple wrapper over ld
      |_ ld             # Perform linking

Using LTO plugin:

gcc -flto                          # Call GCC driver
|_ cc1                             # Compile first source file into asm + intermediate language
|_ as                              # Assemble these asm + IL into temporary object file
|_ ...                             # Compile and assemble all remaining source files
|_ collect2                        # Call linker driver
   |_ collect-ld                   # Simple wrapper over ld
      |_ ld with liblto_plugin.so  # Perform LTO and linking
         |_ lto-wrapper            # Is called from liblto_plugin.so
            |_ gcc                 # Driver
            |  |_ lto1             # Perform WPA and split into partitions
            |_ gcc                 # Driver
            |  |_ lto1             # Perform LTRANS for the first partition
            |  |_ as               # Assemble this partition into final object file
            |_ ...                 # Perform LTRANS for each partition

Are they correct?

2. The second question concerns the implementation of #pragma omp target. I'm going to reuse the LTO approach in a prototype that will produce two binaries, one for the host architecture and one for the target architecture. The target binary will contain the functions outlined from omp target regions and some infrastructure to run them. To produce two binaries we need to run gcc and ld twice. On the first run, gcc will generate an object file that contains optimized code for the host and GIMPLE for the target. On the second run, gcc will read the GIMPLE and generate optimized code for the target.
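
For concreteness, here is a minimal sketch of the kind of source this concerns. The function name sum_squares and its body are made up for illustration and do not come from any existing code. It assumes OpenMP 4.0 syntax for the target construct. When built without -fopenmp the pragma is ignored and the loop simply runs on the host.

```c
#include <assert.h>

/* Hypothetical example; the name and the computation are illustrative only. */
long sum_squares(int n)
{
    long sum = 0;

    assert(n >= 0);

    /* With offloading enabled, this block is the region that would be
       outlined into a separate function and compiled for the target.
       Without -fopenmp the pragma is ignored and the loop runs on the host. */
    #pragma omp target map(tofrom: sum)
    {
        for (int i = 0; i < n; i++)
            sum += (long)i * i;
    }

    return sum;
}
```

In the scheme described above, the first gcc run would emit host code for sum_squares plus GIMPLE for the outlined region, and the second run would turn that GIMPLE into code for the target.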

So the question is: where is the right place for the second run of gcc and ld? Should I insert them into liblto_plugin.so? Or should I create an entirely new plugin that only calls gcc and ld for the target, without performing any LTO optimizations for the host? Any suggestions?

----
Thanks,
Ilya Verbin,
Software Engineer
Intel Corporation