https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544
--- Comment #15 from Andrew Stubbs <ams at gcc dot gnu.org> ---

BTW, if you're calling "new" in the offload kernel then you're probably "doing it wrong", even when we do implement full C++ support.

Offload kernels are for hot code, executed many times, and memory allocation is inherently slow. On AMDGCN, "malloc" uses Newlib's heap support and gets serialized via a global lock. Likewise for "free". On NVPTX, the implementation is provided by the PTX finalizer, so may be better optimized, but I still don't recommend it.

I'm assuming you're using printf for debug and testing only, so that's fine, but it definitely has no place in hot code either.
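
To illustrate the allocation point above: one common way to avoid "new" or "malloc" inside the kernel is to allocate the scratch buffer once on the host and map it to the device. The following is only a minimal sketch using OpenMP target offloading; the function name "scaled_sum" and the buffer names are hypothetical and not taken from the reporter's test case, and whether this pattern fits the original code is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: multiplies each element by `factor` and returns the sum.
// The scratch buffer is allocated once on the host, so the offloaded loop
// itself never calls new/malloc/free.
double scaled_sum(const std::vector<double>& in, double factor) {
    const std::size_t n = in.size();
    std::vector<double> scratch(n);   // host-side allocation, outside the kernel
    const double* src = in.data();
    double* tmp = scratch.data();
    double total = 0.0;

    // map(alloc:) reserves device storage for tmp without copying it in or out.
    #pragma omp target teams distribute parallel for \
        map(to: src[0:n]) map(alloc: tmp[0:n]) map(tofrom: total) reduction(+: total)
    for (std::size_t i = 0; i < n; ++i) {
        tmp[i] = src[i] * factor;     // uses the pre-mapped buffer
        total += tmp[i];
    }
    return total;
}
```

Built without -fopenmp the pragma is ignored and the loop simply runs on the host, which is handy for checking results before offloading. For example, calling scaled_sum on {1.0, 2.0, 3.0} with a factor of 2.0 returns 12.0.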