https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118633

Bug ID: 118633
Summary: Early optimizations/transformations vs. heterogeneous offloading compilation
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: openacc, openmp
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org, pinskia at gcc dot gnu.org, rguenth at gcc dot gnu.org
Target Milestone: ---

I had wondered about this issue before (years ago, but I don't think a PR already exists), and was reminded of it when reading PR118012 "[avr][13/14/15 Regression] Expensive code (bit extract + extend + neg + and) instead of bit test" and PR118360 "[avr] Expensive shift instead of bit test". These -- if I quickly got that right -- may intend to make more use of target hooks (like costing of multiplication vs. jump) to guide early optimizations/transformations (GENERIC/GIMPLE 'match.pd' etc.). I understand the motivation.

However, somewhat reciprocal to that, for heterogeneous offloading compilation (basically, multi-target GCC), we should rather *not* do any such optimizations/transformations before the offloading code has been split off: what is beneficial for the host (such as x86_64, PPC64, aarch64, RISC-V) may not necessarily be so for heterogeneous device code generation (such as nvptx, GCN). And worse, if we end up with different "patterns" for different hosts, it'll become more and more complex to handle (recognize, possibly un-do) these transformations during offloading compilation.

Do we have a coherent story regarding this issue? Can we defer "all this" until after the offloading code has been split off?

What I would not like to do is make early optimizations/transformations depend on whether offloading compilation is enabled or not, because that may result in different host code being generated (even if offloading is then not actually used). (I think PR95622 already describes one such case?)

If I remember correctly, we already have a few ad-hoc cases where optimizations/transformations are deferred based on whether we're inside an offloading code region. This also has a bit of a bad taste to it? (..., but may be necessary in some cases, I understand.)

I had the idea to add a check for this via an internal flag that records whether the offloading code has been split off, and to error out if "non-allowed" target hooks are used before that point. Does that make any sense?
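
To make the proposed check a bit more concrete, here is a minimal standalone sketch of the idea, not GCC code. All names (`CompilationState`, `HookKind`, `mark_offload_split_done`, `check_hook_allowed`) are hypothetical and only illustrate an internal flag that is set once the offloading code has been split off, plus a guard that errors out if a target-specific hook is queried earlier. How hooks would be classified as "non-allowed" is left open here, as it is in the report.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// NOTE: every name in this sketch is hypothetical; none of them exist in GCC.

// Whether a hook's answer depends on the target being compiled for.
enum class HookKind { TargetNeutral, TargetSpecific };

// Stand-in for per-compilation state holding the proposed internal flag.
struct CompilationState {
  bool offload_split_done = false;
};

// To be called once, at the point where offloading code has been split off
// from the host code.
inline void mark_offload_split_done(CompilationState &st) {
  assert(!st.offload_split_done && "split is expected to happen only once");
  st.offload_split_done = true;
}

// Guard to run at the start of a hook query: errors out if a target-specific
// hook is consulted while host and device code still share one IR.
inline void check_hook_allowed(const CompilationState &st, HookKind kind,
                               const std::string &hook_name) {
  if (kind == HookKind::TargetSpecific && !st.offload_split_done)
    throw std::logic_error("target hook '" + hook_name +
                           "' used before offloading code was split off");
}
```

In this sketch the guard does not depend on whether offloading is enabled, so it would not by itself cause different host code to be generated; it only flags early uses of target-specific hooks.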