On Wed, 21 Oct 2015, Jakub Jelinek wrote:
> > time (libcudadevrt.a), and imposes overhead at run time.  The last point
> > might
>
> But if this is the case, that is really serious issue.  Is that really
> something that isn't available in a shared library?
> E.g. with my distro GCC maintainer hat on, I'd really like to tweak the
> libgomp PTX plugin, so that it compiles against a stub cuda.h header and
> doesn't link against libcuda*.so at all, but instead dlopens it, to avoid
> hard dependencies on the non-free CUDA stuff and more importantly any link
> time dependencies on that.  If libcudadevrt is not
> available as shared library, this wouldn't of course work.  Would be nice to
> talk to NVidia about this...

It's a library of device (PTX) code, not host code, so dynamic linking does
not apply.

> > libgomp.c/thread-limit-2.c: fails to link due to 'usleep' unavailable on
> > NVPTX.  Note, the test does not run anything on the device because the
> > target region has 'if (0)' clause.
>
> As optimization, perhaps we could avoid adding the "omp target entrypoint"
> attribute for the body of if(0) target region, that one always goes to host
> fallback, so no offloaded code is needed.
>
> As for other tests, XFAILing them always is undesirable, supposedly we could
> add a dejagnu target check whether the default target goes to PTX (if we
> don't have it already) and use that to xfail?

Yes, that's what I meant; such a check is already implemented for OpenACC.

> Of course that doesn't help the thread-limit-2.c testcase.

Why not?

Alexander
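
For reference, the thread-limit-2.c situation discussed above can be reduced to something like the sketch below.  This is a hypothetical reduction, not the actual testcase: the function name and the 'done' variable are made up for illustration.  The 'if (0)' clause means the region always takes the host fallback at run time, yet, as described above, the region body is still compiled for the offload target, so the 'usleep' reference has to resolve there too.

```c
#define _DEFAULT_SOURCE   /* expose usleep() under strict -std= modes */
#include <assert.h>
#include <unistd.h>

/* Hypothetical reduction, not the real thread-limit-2.c.  */
int
sleep_in_target_region (void)
{
  int done = 0;

  /* if (0) forces the host fallback at run time, but the region body is
     still outlined for the offload target, so the usleep reference must
     also resolve when the NVPTX image is linked.  */
  #pragma omp target if (0) map (tofrom: done)
  {
    usleep (1000);
    done = 1;
  }

  return done;
}
```

On the host, with or without -fopenmp, a call such as 'assert (sleep_in_target_region () == 1);' should hold, since the body only ever runs there.  Only the link step against an NVPTX offload target is expected to be affected.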