On Wed, Sep 14, 2022 at 11:32:11AM -0600, Sandra Loosemore wrote:
> This patch is part of the ongoing effort to find more SIMD optimization
> opportunities in OpenMP code. Here we are looking for functions that have
> the "omp declare target" attribute that are also suitable candidates for
> automatic SIMD cloning. I've made the filter quite conservative, but maybe
> it could be improved with some further analysis. I added a command-line
> flag to disable this in case it is buggy :-P or leads to excessive code
> bloat without improving performance in some cases, otherwise the SIMD clones
> are generated in the same way and at the same optimization levels as the
> existing simdclone pass.
>
> I had to modify the TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN hook to
> add a boolean argument to control diagnostics, since GCC shouldn't complain
> about types the target doesn't support in cases where the user didn't
> explicitly ask for clones to be created. I tested on
> x86_64-linux-gnu-amdgcn, plain x86_64-linux-gnu, and aarch64-linux-gnu to
> get coverage of all 3 backends that implement this hook. OK for mainline?

declare simd is an ABI relevant declarative directive, while declare target
is not; all the latter does is say whether the function should also (or
only) be compilable on an offloading target.

Creating simd clones under some option for random declare target functions
(note, declare target is a partly auto-discovered property) is perhaps fine
for functions not exported from the translation unit, where it is purely an
optimization. Otherwise it is a significant ABI problem: you export a whole
bunch of new symbols on the definition side and expect those exports to be
available on the use side. If you compile one TU with
-fopenmp-target-simd-clone and another one without it, the program might
not link anymore.

And worse, the decision is based on the exact implementation of the
function (so I assume you can't do that automatically for functions not
defined locally), which means that whether something has simd clones or not
might change over time based on how you change the implementation. Say
libfoo.so exports a declare target function foo, which is initially
implemented without, say, using inline asm (or calling one of the "bad"
functions or using exceptions etc.), but then a bugfix comes and needs to
use inline asm or something else in the implementation. Previously
libfoo.so would export the simd clones, but now it doesn't, so the ABI of
the library changes.

If it is a pure optimization thing and purely keyed on the definition, all
the simd clones should be local to the TU, never exported from it.

	Jakub
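
To make the distinction above concrete, here is a minimal sketch in C. The names scaled, foo_local and sum_foo are hypothetical and only for illustration; foo stands in for the exported libfoo.so function from the example above. It assumes compilation with -fopenmp (without it the pragmas are simply ignored and the code still builds as plain C), and it does not show which clones any particular GCC version actually emits.

```c
#include <stddef.h>

/* Explicit request: "declare simd" is ABI relevant.  Callers that see this
   directive may reference the vector variants, so the definition side is
   expected to keep providing them.  */
#pragma omp declare simd
float scaled(float x)
{
  return 2.0f * x + 1.0f;
}

/* "declare target" only says the function may also be compiled for an
   offloading target.  It promises no vector variants, so any clones
   created implicitly for this exported function would be extra exports
   that depend on how the body happens to be written.  */
#pragma omp declare target
float foo(float x)
{
  return 2.0f * x + 1.0f;
}
#pragma omp end declare target

/* Not exported from the TU: implicit clones of this function would be a
   pure optimization, invisible to any other translation unit.  */
#pragma omp declare target
static float foo_local(float x)
{
  return 0.5f * x;
}
#pragma omp end declare target

/* Use side: a simd loop in which a vectorizer could pick up clones of the
   callees if they exist.  */
float sum_foo(const float *in, size_t n)
{
  float sum = 0.0f;
#pragma omp simd reduction(+:sum)
  for (size_t i = 0; i < n; i++)
    sum += foo(in[i]) + foo_local(in[i]);
  return sum;
}
```

Only scaled carries an explicit promise of vector variants. For foo, implicitly created clones would become part of what the TU exports, which is the linking and library-ABI concern described above; for foo_local they would stay local to the TU.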