[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

Yaxun Liu via cfe-commits Mon, 18 Nov 2024 09:54:39 -0800

yxsamliu wrote:

> > > @jhuber6 can you comment on "lot of overhead" and if that matters? Also, 
> > > not sure why the HSA library dependence is a problem. This seems to be 
> > > exposing amdgpu-arch to more maintenance overhead.
> > 
> > 
> > Sometimes the driver will hang and since this is used inside of `clang` to 
> > support `--offload-arch=native` I've had cases where the compiler hangs 
> > forever, so I added a timeout to keep it from doing that in the past. This 
> > removes that possibility entirely. I have also had reports from cluster 
> > users that it becomes very slow when others are stressing the GPU. It's 
> > faster and since this will be installed on every single LLVM build, not 
> > everyone has ROCm so it would be nice for this to work. I think it's fair 
> > to do this as the fast path on Linux systems and then fall back to HIP if 
> > something goes terribly wrong.
> 
> I don't really understand why cluster users are compiling on a system where 
> the GPUs are being stressed, and I still don't see why it's a good idea to 
> break layering for this case. Also, I wasn't aware that the "native" offload 
> arch is supported by ROCm.


The issue also happens on machines with a single GPU if amdgpu-arch is 
executed multiple times in quick succession, due to a limitation of the 
driver. `--offload-arch=native` calls amdgpu-arch to get the actual GPU archs.
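
For context on the sysfs approach named in the PR title, here is a rough sketch of how GPU archs might be read without calling into HSA. It is not the code from the PR. The sysfs path, the `gfx_target_version` property name, and the major/minor/stepping decoding are assumptions based on the Linux KFD topology layout, and all function names are hypothetical.

```python
from pathlib import Path
from typing import List

# Assumed location of the KFD topology on Linux; not stated in this thread.
KFD_NODES = Path("/sys/devices/virtual/kfd/kfd/topology/nodes")


def decode_gfx_target_version(version: int) -> str:
    """Assumed encoding: major * 10000 + minor * 100 + stepping."""
    major = version // 10000
    minor = (version // 100) % 100
    step = version % 100
    # The stepping is written in hex, so 90010 becomes gfx90a.
    return f"gfx{major}{minor}{step:x}"


def parse_gfx_target_version(properties_text: str) -> int:
    """Return the gfx_target_version value, or 0 if the key is absent."""
    for line in properties_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "gfx_target_version":
            return int(fields[1])
    return 0


def list_native_archs(nodes_dir: Path = KFD_NODES) -> List[str]:
    """Collect one arch string per GPU node; returns [] if sysfs is missing."""
    archs: List[str] = []
    if not nodes_dir.is_dir():
        return archs
    for node in sorted(nodes_dir.iterdir()):
        try:
            text = (node / "properties").read_text()
        except OSError:
            continue  # unreadable node: skip rather than block
        version = parse_gfx_target_version(text)
        if version:  # nodes reporting 0 are treated as non-GPU
            archs.append(decode_gfx_target_version(version))
    return archs


if __name__ == "__main__":
    for arch in list_native_archs():
        print(arch)
```

Because this only reads small text files, it does not depend on the driver answering a runtime query, which is the property the timeout discussion above is about.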

https://github.com/llvm/llvm-project/pull/116651
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
