https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110082
Bug ID: 110082
Summary: Coverage analysis vs. offloading compilation
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: openacc, openmp, wrong-code
Severity: normal
Priority: P3
Component: gcov-profile
Assignee: unassigned at gcc dot gnu.org
Reporter: tschwinge at gcc dot gnu.org
CC: hubicka at gcc dot gnu.org, jakub at gcc dot gnu.org,
marxin at gcc dot gnu.org, rguenth at gcc dot gnu.org
Target Milestone: ---
Target: amdgcn-amdhsa, nvptx-none
Via a customer report, we've determined that offloading compilation fails in
combination with '-fprofile-arcs' (as implied by '--coverage', that is,
coverage analysis):
<built-in>: error: variable ‘__gcov0.main._omp_fn.0’ has been referenced in
offloaded code but hasn’t been marked to be included in the offloaded code
lto1: fatal error: errors during merging of translation units
compilation terminated.
nvptx mkoffload: fatal error:
build-gcc/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
[...]
Per my quick look: during early host compilation, 'pass_ipa_tree_profile', via
'gcc/tree-profile.cc:gimple_gen_edge_profiler', adds statements such as
'__atomic_fetch_add_8 (&__gcov0.main._omp_fn.0[0], 1, 0);', as visible in
'*.069i.profile'. (Or different statements, in case the target cannot
"utilize atomic update operations", or 'PROFILE_UPDATE_SINGLE' is used.)
That happens as part of the IPA passes, so before the offloading code is split
off (as the error evidently confirms).
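To illustrate the two kinds of counter update mentioned above, here is a
minimal C sketch. It is not what GCC emits verbatim: 'update_atomic' and
'update_single' are hypothetical names, and a caller-supplied 64-bit integer
stands in for one slot of the '__gcov0.main._omp_fn.0' array. The atomic
variant uses the generic '__atomic_fetch_add' builtin with relaxed memory
order, which I take to correspond to the trailing '0' argument in the statement
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Atomic form, modeled on '__atomic_fetch_add_8 (&counter, 1, 0)';
   memory order 0 is __ATOMIC_RELAXED in GCC.  */
void update_atomic (int64_t *counter)
{
  assert (counter != NULL);
  __atomic_fetch_add (counter, 1, __ATOMIC_RELAXED);
}

/* Non-atomic form (assumed shape): a plain load, add, and store.  */
void update_single (int64_t *counter)
{
  assert (counter != NULL);
  *counter = *counter + 1;
}
```

Either way, the instrumented function ends up referencing a counter variable,
which is what the device-side compilation then trips over.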
As we've got no mechanism implemented currently to move any device-side
coverage data from the device execution back to the host, and integrate it with
the host-side coverage data, we propose to not do coverage analysis for
offloading compilation.
My idea is to abstract the "increment the edge execution count" operations into
some new GIMPLE/IFN code (?), and then later, once the offloading code has been
split off, lower them to the current form (host-side) or to a no-op
(device-side).
I'd appreciate a quick review of whether that approach makes sense.
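As a rough model of that idea (plain C, not GIMPLE, and not a proposed
implementation), the deferred lowering could be pictured as below. Everything
here is hypothetical and for illustration only: 'enum compile_side',
'lower_edge_increment', and the use of a function call to stand in for the
abstract operation. In the real thing the decision would be made at compile
time, after the split-off, rather than at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: which side the code is being compiled for, known only
   once the offloading code has been split off.  */
enum compile_side { SIDE_HOST, SIDE_DEVICE };

/* Hypothetical model of lowering one abstract "increment the edge
   execution count" operation.  */
void lower_edge_increment (enum compile_side side, int64_t *counter)
{
  assert (counter != NULL);
  if (side == SIDE_HOST)
    /* Host-side: the current form, an atomic counter update.  */
    __atomic_fetch_add (counter, 1, __ATOMIC_RELAXED);
  /* Device-side: no-op.  In a real lowering the counter would not be
     referenced at all; it is only a parameter here to keep the model
     in one function.  */
}
```

The point of the model is only that the device-side result contains no
reference to any '__gcov0.*' variable, so nothing would need to be marked for
inclusion in the offloaded code.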
---
Over the past decade, we've seen and dealt with a few such issues already, but
more certainly exist in other scenarios where GCC does "early" (before the
offloading code split-off) code transformations that device compilation may
choke on, and that thus have to be handled explicitly. A full review of GCC's
"early" transformations doesn't seem feasible, so we shall continue addressing
these incrementally, as encountered.
For another example (that I shall soon be working on), see
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544#c9>.