[Bug middle-end/118633] New: Early optimizations/transformations vs. heterogeneous offloading compilation

tschwinge at gcc dot gnu.org via Gcc-bugs Thu, 23 Jan 2025 10:16:16 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118633


            Bug ID: 118633
           Summary: Early optimizations/transformations vs. heterogeneous
                    offloading compilation
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: openacc, openmp
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    pinskia at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---

I had wondered about this issue before (years ago, but I don't think a PR
already exists), and was reminded of it when reading PR118012 "[avr][13/14/15
Regression] Expensive code (bit extract + extend + neg + and) instead of bit
test" and PR118360 "[avr] Expensive shift instead of bit test".  These -- if I
quickly got that right -- may intend to add more uses of target hooks (like
costing of multiplication vs. jump) to guide early optimizations/transformations
(GENERIC/GIMPLE 'match.pd' etc.).  I understand the motivation.

However, somewhat at odds with that, for heterogeneous offloading compilation
(basically, multi-target GCC), we should rather *not* do any such
optimizations/transformations before the offloading code has been split off:
what is beneficial for the host (such as x86_64, PPC64, aarch64, RISC-V) may
not necessarily be so for heterogeneous device code generation (such as nvptx,
GCN).  And worse, if we end up with different "patterns" for different hosts,
it'll become more and more complex to handle (recognize, possibly undo) these
transformations during offloading compilation.

Do we have a coherent story regarding this issue?  Can we defer "all this"
until after the offloading code has been split off?

What I would not like to do is make early optimizations/transformations depend
on whether offloading compilation is enabled or not, because that may result in
different host code being generated (even if offloading is then not actually
used).  (I think PR95622 already describes one such case?)

If I remember correctly, we already have a few ad-hoc cases where
optimizations/transformations are deferred based on whether we're inside an
offloading code region.  This also has a bit of a bad taste to it?  (..., but
may be necessary in some cases, I understand.)

I had the idea to add a check for this via an internal flag that records
whether the offloading code has been split off, and to error out if
"non-allowed" target hooks are used before that point.  Does that make any
sense?
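
To make the idea of such a check a bit more concrete, here is a minimal,
standalone C++ sketch.  It is not GCC code: 'OffloadPhase', 'HookGuard',
'check_target_hook' and 'mul_cost' are all hypothetical names invented for
illustration, and the exception only stands in for whatever "error out"
would mean inside the compiler.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical phase marker; not an actual GCC data structure.
enum class OffloadPhase { BeforeSplit, AfterSplit };

// Hypothetical guard holding the "internal flag" described above.
class HookGuard {
public:
  // Record that the offloading code has been split off from the host code.
  void mark_split_done() {
    assert(phase_ == OffloadPhase::BeforeSplit && "split recorded twice");
    phase_ = OffloadPhase::AfterSplit;
  }

  bool split_done() const { return phase_ == OffloadPhase::AfterSplit; }

  // To be called on entry to any target hook that is "non-allowed" before
  // the split.  Throws if the hook is used too early.
  void check_target_hook(const std::string &hook_name) const {
    if (!split_done())
      throw std::logic_error("target hook '" + hook_name +
                             "' used before offloading code was split off");
  }

private:
  OffloadPhase phase_ = OffloadPhase::BeforeSplit;
};

// Hypothetical target-specific cost hook guarded by the check.
int mul_cost(const HookGuard &guard) {
  guard.check_target_hook("mul_cost");
  return 4;  // placeholder cost, for illustration only
}
```

With this sketch, calling 'mul_cost' before 'mark_split_done' has been called
throws, and calling it afterwards returns the placeholder cost.  Which hooks
would count as "non-allowed" is exactly the open question above; the sketch
does not answer it.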
