On 9/1/20 1:41 PM, Tobias Burnus wrote:
> Hi Tom, hello all,
>
> it turned out that the testcase fails on PowerPC (but not x86_64)
> as the nvptx lto complains: unresolved symbol
> __sync_val_compare_and_swap_16
>
> The testcase uses int128 – and that's the culprit, but I have no idea
> why it only fails with PowerPC and not with x86-64.
>

Well, I'm guessing the explanation is here in omp-expand.c:
...
/* Expand an GIMPLE_OMP_ATOMIC statement.  We try to expand
   using expand_omp_atomic_fetch_op.  If it failed, we try to
   call expand_omp_atomic_pipeline, and if it fails too, the
   ultimate fallback is wrapping the operation in a mutex
   (expand_omp_atomic_mutex).  REGION is the atomic region built
   by build_omp_regions_1().  */

static void
expand_omp_atomic (struct omp_region *region)
...

In the x86_64 case, when doing:
...
$ gcc-11 src/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c -fdump-tree-all-details -fopenmp
...
we get:
...
  <bb 33> :
  D.3382 = .omp_data_i->res;
  GOMP_atomic_start ();
  D.3383 = MEM[(__int128 * {ref-all})D.3382];

  <bb 34> :
  D.3384 = (_Bool) D.3383;
  if (D.3384 != 0)
    goto <bb 35>; [INV]
  else
    goto <bb 37>; [INV]

  <bb 38> :
  MEM[(__int128 * {ref-all})D.3382] = iftmp.80;
  GOMP_atomic_end ();
...
which means we're triggering the "expand_omp_atomic_mutex" case for x86_64.

Apparently we're triggering the "expand_omp_atomic_pipeline" for powerpc.

> Unless someone sees a good way to implement __sync_val_compare_and_swap_16,

Hmm, one could implement it in the compiler using calls to
GOMP_atomic_start/GOMP_atomic_end, but it feels somewhat hacky.

Thanks,
- Tom
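
To make the lock-based idea concrete, here is a minimal sketch of a 16-byte
compare-and-swap guarded by a lock. It is an illustration under assumptions,
not code from GCC or libgomp: cas16_with_lock, u128 and atomic_lock are
hypothetical names, and a pthread mutex stands in for the
GOMP_atomic_start/GOMP_atomic_end pair so that the example builds on its own.

```c
#include <pthread.h>

typedef unsigned __int128 u128;

/* Hypothetical stand-in for the global lock taken by
   GOMP_atomic_start/GOMP_atomic_end; not libgomp's actual code.  */
static pthread_mutex_t atomic_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lock-based sketch with the same contract as
   __sync_val_compare_and_swap_16: store NEWVAL to *PTR only if *PTR
   equals OLDVAL, and return the value seen before the operation.  */
static u128
cas16_with_lock (volatile u128 *ptr, u128 oldval, u128 newval)
{
  u128 seen;

  pthread_mutex_lock (&atomic_lock);    /* plays the role of GOMP_atomic_start */
  seen = *ptr;
  if (seen == oldval)
    *ptr = newval;
  pthread_mutex_unlock (&atomic_lock);  /* plays the role of GOMP_atomic_end */

  return seen;
}
```

Such a function is only atomic with respect to other accesses that take the
same lock, which is part of why the approach feels hacky.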