JonChesterfield wrote:

Oh. I now see there was a bunch of discussion about this, will add some context.

The driver has a hard limit on how many processes can have it open at a time. 
clang calls this utility to ask which GPU to compile for by default. Put those 
together and a parallel build on a vaguely modern desktop immediately blows 
through that process limit and fails, so the user has to deliberately build 
code slowly, or specify the GPU by hand, to work around our tooling falling 
over.

The limit on the number of processes is generally reasonable - launching 
hundreds of processes that all open the GPU and allocate queues and launch 
kernels is generally a disaster, and having the kernel return an equivalent to 
"too many open" is great. However, in the specific case where we are only 
asking for trivial information, which doesn't need to allocate a queue or do 
anything whatsoever with the GPU, this is a spurious and annoying limitation.

I suppose it could be "fixed" in the driver - some sort of reference count 
which is incremented when you do something non-trivial instead of on open - but 
I'd expect the kernel people to tell us to stop opening hundreds of processes 
when none of them do any work.
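
To make the reference count idea concrete, here is a toy user-space model of 
the two accounting policies. It is only a sketch: `DeviceModel`, `queryArch` 
and `createQueue` are hypothetical names invented for illustration, and nothing 
here reflects how the real driver is structured.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model only; hypothetical names, not real driver code.
class DeviceModel {
public:
  // countOnOpen == true  : a slot is taken as soon as the device is opened.
  // countOnOpen == false : a slot is taken on the first non-trivial operation.
  DeviceModel(std::size_t limit, bool countOnOpen)
      : Limit(limit), CountOnOpen(countOnOpen) {}

  // Returns a handle id, or nullopt as the "too many open" equivalent.
  std::optional<std::size_t> open() {
    if (CountOnOpen && !acquireSlot())
      return std::nullopt;
    Handles.push_back({true, CountOnOpen});
    return Handles.size() - 1;
  }

  // Trivial query (e.g. "which arch is this?"): never takes a slot itself.
  bool queryArch(std::size_t h) const { return valid(h); }

  // Non-trivial work: takes a slot on first use if the handle has none yet.
  bool createQueue(std::size_t h) {
    if (!valid(h))
      return false;
    if (!Handles[h].HoldsSlot) {
      if (!acquireSlot())
        return false;
      Handles[h].HoldsSlot = true;
    }
    return true;
  }

  // Closing a handle gives back its slot, if it ever took one.
  void close(std::size_t h) {
    assert(valid(h) && "closing an unknown or already closed handle");
    if (Handles[h].HoldsSlot)
      --Active;
    Handles[h] = {false, false};
  }

  // Number of slots currently counted against the limit.
  std::size_t active() const { return Active; }

private:
  struct Handle {
    bool Open;
    bool HoldsSlot;
  };

  bool valid(std::size_t h) const {
    return h < Handles.size() && Handles[h].Open;
  }

  bool acquireSlot() {
    if (Active >= Limit)
      return false;
    ++Active;
    return true;
  }

  std::size_t Limit;
  bool CountOnOpen;
  std::size_t Active = 0;
  std::vector<Handle> Handles;
};
```

With `countOnOpen` set, the open that follows `limit` successful opens fails 
even if every process only wants to call `queryArch`. With it cleared, any 
number of processes can open and query, and only the ones that go on to call 
`createQueue` count against the limit.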

One clumsy outstanding thing is that this should now be some code in a header 
that clang includes, so that instead of shelling out to a subprocess to handle 
arch=native, clang just looks up the information directly.

@b-sumner does the additional context make the design choice clear?

https://github.com/llvm/llvm-project/pull/116651
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
