On Fri, 26 Aug 2022, Tobias Burnus wrote:
> @Tom and Alexander: Better suggestions are welcome for the busy loop in
> libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
> its value.

I think you can do that without polling by using the PTX 'brkpt' instruction
on the device and the CUDA Debugger API on the host (but you'd have to be
careful about interactions with the real debugger).

What did the standardization process for this feature look like? How did it
pass if it's not efficiently implementable for the major offloading targets?

Alexander