jdoerfert added a comment.

In D102107#3014759 <https://reviews.llvm.org/D102107#3014759>, @JonChesterfield wrote:

> In D102107#3014743 <https://reviews.llvm.org/D102107#3014743>, @pdhaliwal wrote:
>
>> I got this after changing __kmpc_impl_malloc to return 0xdeadbeef. So, this
>> confirms that missing malloc implementation is the root cause.
>>
>>> Memory access fault by GPU node-4 (Agent handle: 0x1bc5000) on address
>>> 0xdeadb000. Reason: Page not present or supervisor privilege.
>
> Nice! In that case I think the way to go is to audit the (probably few)
> places where kmpc_impl_malloc are called and add a check for whether the
> return value is 0. With that in place we can reland this and get more
> graceful failure (at a guess we should fall back to the host when gpu memory
> is exhausted? or maybe just print a 'out of gpu heap memory' style message
> and abort, don't know).

We should only fail to remove the `__kmpc_alloc_shared` calls at O0. Since we need `__kmpc_alloc_shared` for all non-trivial codes, they would always fail on AMDGPU.

That said, why is the shared memory stack not catching this? It's a 64-byte stack for the main thread and we are looking at a 24-byte allocation for `declare_mapper_target.cpp`. Can you determine why the first two conditionals in `__kmpc_alloc_shared` don't catch this and return proper memory?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102107/new/

https://reviews.llvm.org/D102107
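
To make the question about the shared memory stack concrete, here is a rough host-side sketch of a fixed-size stack that serves small requests first and only then falls back to a heap allocator whose result is checked for null. It is not the DeviceRTL code: `SharedStack`, `allocShared`, and `noDeviceMalloc` are hypothetical names, and the 8-byte rounding is an assumption. Only the 64-byte stack size and the 24-byte request come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-thread shared-memory stack.
struct SharedStack {
  static constexpr std::size_t Capacity = 64; // size mentioned in the review
  alignas(8) unsigned char Data[Capacity];
  std::size_t Used = 0;

  // Returns a slot from the stack if the rounded-up request fits,
  // nullptr otherwise. The 8-byte rounding is an assumption.
  void *push(std::size_t Bytes) {
    assert(Used <= Capacity);
    std::size_t Aligned = (Bytes + 7) & ~static_cast<std::size_t>(7);
    if (Aligned > Capacity - Used)
      return nullptr;
    void *Ptr = Data + Used;
    Used += Aligned;
    return Ptr;
  }
};

// Hypothetical fallback that models a target without a working malloc.
inline void *noDeviceMalloc(std::size_t) { return nullptr; }

using FallbackFn = void *(*)(std::size_t);

// Try the stack first; use the fallback allocator only when the stack is
// full. A null result is passed on so the caller can fail gracefully
// instead of dereferencing a bad pointer.
inline void *allocShared(SharedStack &Stack, std::size_t Bytes,
                         FallbackFn Fallback) {
  if (void *Ptr = Stack.push(Bytes))
    return Ptr;
  return Fallback(Bytes); // may be nullptr; callers must check
}
```

In this sketch a 24-byte request on an empty 64-byte stack is served from the stack and never reaches the fallback, which is the behavior the comment expects. Only a third 24-byte request would hit the fallback and return null.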