https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118633
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
IMO we can't have both: "early" optimized offload code and optimization that is suited to the offload target. Instead, we should somehow work towards "offloading" (aka outlining) the relevant parts before early optimizations, which may affect the offload target adversely, are run on code that at that point appears to be targeted at the host.

I'll note that we, for example, apply the host target's inlining target hooks when deciding on early inlining.

The mentioned match.pd (much like the code in fold-const.cc) should always be seen as canonicalization (and partial constant folding). These are not tied to a specific target, but of course people contributing usually have "common" targets in mind as well.