On Fri, 26 Aug 2022, Tobias Burnus wrote:

> @Tom and Alexander: Better suggestions are welcome for the busy loop in
> libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
> its value.

I think that to do that without polling you can use the PTX 'brkpt' instruction
on the device and the CUDA Debugger API on the host (but you'd have to be
careful about interactions with the real debugger).

What did the standardization process for this feature look like, and how did it
pass if it's not efficiently implementable for the major offloading targets?

Alexander
