yxsamliu wrote:

> > > @jhuber6 can you comment on "lot of overhead" and if that matters? Also,
> > > not sure why the HSA library dependence is a problem. This seems to be
> > > exposing amdgpu-arch to more maintenance overhead.
> >
> > Sometimes the driver will hang and since this is used inside of `clang` to
> > support `--offload-arch=native` I've had cases where the compiler hangs
> > forever, so I added a timeout to keep it from doing that in the past. This
> > removes that possibility entirely. I have also had reports from cluster
> > users that it becomes very slow when others are stressing the GPU. It's
> > faster and since this will be installed on every single LLVM build, not
> > everyone has ROCm so it would be nice for this to work. I think it's fair
> > to do this as the fast-path on Linux systems and then fall-back to HIP if
> > something goes terribly wrong.
>
> I don't really understand why cluster users are compiling on a system where
> the GPUs are being stressed, and I still don't see why it's a good idea to
> break layering for this case. Also, I wasn't aware that the "native" offload
> arch is supported by ROCm.

The issue also happens on machines with a single GPU if amdgpu-arch is executed multiple times within a short period, due to some limitation of the driver. --offload-arch=native calls amdgpu-arch to get the actual GPU archs.

https://github.com/llvm/llvm-project/pull/116651
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits