https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114445
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The master branch has been updated by Tobias Burnus <bur...@gcc.gnu.org>: https://gcc.gnu.org/g:da5803c794d16deb461c93588461856fbf6e54ac commit r16-3925-gda5803c794d16deb461c93588461856fbf6e54ac Author: Tobias Burnus <tbur...@baylibre.com> Date: Wed Sep 17 08:47:36 2025 +0200 libgomp: Init hash table for 'indirect'-clause of 'declare target' on the host [PR114445, PR119857] Especially with unified-shared memory and especially with C++'s virtual functions, it is not uncommon to have on the device a function pointer that points to the host function - but has an associated device. If the pointed-to function is (explicitly or implicitly) 'declare target' with the 'indirect' clause, it is added to the lookup table. Before this commit, the conversion of the lookup table into a lookup hash table happened every time a device kernel was launched on the first team - albeit if already converted, the function immediately returned. Ignoring the overhead, there was also a race: If multiple teams were launched, it could happen that another team of the same target region already tried to use the lookup table which it was still being created. Likewise when lauching a kernel with 'nowait' and directly afterward another kernel, there could be a race of creating the table. With this commit, the creating of the kernel has been moved to the host-plugin's GOMP_OFFLOAD_load_image. The previous code stored a pointer to the host/device pointer array, which makes it hard when creating the hash table on the host (data is needed for finding the slot) - but accessing it on the device (where the lookup has to work as well). As the hash-table implementation (only) supports integral value as payload (0 and 1 having special meaning), the solution was to move to an uint128_t variable to store both the host and device address. As the host-side library is typically dynamically linked and the device-side one statically, there is the problem of backward compatibility. The current implementation permits both older binaries and newer libgomp and newer binaries with older libgomp. I could imagine us breaking the latter eventually, but for now there is up and downward compatibility. (Obviously, the race is only fixed if new + new is combined.) Code wise, on the device exist GOMP_INDIRECT_ADDR_MAP which was updated to point to the host/device-address array. Now additionally GOMP_INDIRECT_ADDR_HMAP exists, which contains the hash-table map. If the latter exists, libgomp only updates it and the former remains a NULL pointer; it is also untouched if there are no indirect functions. Being NULL therefore avoids the call to the device-side build_indirect_map. The code also currently supports to have no hash and a linear walk. I think that remained from testing; due to the backward-compat feature, it can actually be turned of on either side. libgomp/ChangeLog: PR libgomp/119857 PR libgomp/114445 * config/accel/target-indirect.c: Change to use uint128_t instead of a struct as data structure and add GOMP_INDIRECT_ADDR_HMAP as host-accessible variable. (struct indirect_map_t): Remove. (USE_HASHTAB_LOOKUP, INDIRECT_DEV_ADDR, INDIRECT_HOST_ADDR, SET_INDIRECT_HOST_ADDR, SET_INDIRECT_ADDRS): Define. (htab_free): Use __builtin_unreachable. (htab_hash, htab_eq, GOMP_target_map_indirect_ptr, build_indirect_map): Update for new representation and new pointer-to-hash variable. * config/gcn/team.c (gomp_gcn_enter_kernel): Only call build_indirect_map when GOMP_INDIRECT_ADDR_MAP. * config/nvptx/team.c (gomp_nvptx_main): Likewise. * libgomp-plugin.h (GOMP_INDIRECT_ADDR_HMAP): Define. * plugin/plugin-gcn.c: Conditionally include build-target-indirect-htab.h. (USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define. (create_target_indirect_map): New prototype. (GOMP_OFFLOAD_load_image): Update to create the device's indirect-function hash table on the host. * plugin/plugin-nvptx.c: Conditionally include build-target-indirect-htab.h. (USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define. (create_target_indirect_map): New prototype. (GOMP_OFFLOAD_load_image): Update to create the device's indirect-function hash table on the host. * plugin/build-target-indirect-htab.h: New file.