Did you build UCX with CUDA support (--with-cuda) ?
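
If not, a rebuild along these lines usually sorts it out. This is only a
sketch, assuming CUDA lives in /usr/local/cuda and gdrcopy under /usr;
adjust the paths and install prefixes for your system:

    # Rebuild UCX with CUDA (and gdrcopy) support
    cd ucx-1.6.0
    ./configure --prefix=$HOME/ucx-cuda \
        --with-cuda=/usr/local/cuda --with-gdrcopy=/usr
    make -j8 && make install

    # Confirm UCX now exposes the CUDA transports (cuda_copy, cuda_ipc, gdr_copy)
    $HOME/ucx-cuda/bin/ucx_info -d | grep -i cuda

    # Rebuild Open MPI against that UCX, with CUDA support enabled
    cd ../openmpi-4.0.1
    ./configure --prefix=$HOME/ompi-cuda \
        --with-cuda=/usr/local/cuda --with-ucx=$HOME/ucx-cuda
    make -j8 && make install

    # Run the benchmark through the UCX PML, with device buffers on both ranks
    $HOME/ompi-cuda/bin/mpirun -np 2 --mca pml ucx ./osu_latency D D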

Josh

On Thu, Sep 5, 2019 at 8:45 PM AFernandez via users <
users@lists.open-mpi.org> wrote:

> Hello OpenMPI Team,
>
> I'm trying to use CUDA-aware OpenMPI, but the system simply ignores the
> GPU and the code runs on the CPUs. I've tried different programs but will
> focus on the OSU benchmarks (collective and pt2pt communications). Here is
> some information about the system configuration:
>
> -OFED v4.17-1-rc2 (the NIC is virtualized but I also tried a Mellanox card
> with MOFED a few days ago and found the same issue)
>
> -CUDA v10.1
>
> -gdrcopy v1.3
>
> -UCX 1.6.0
>
> -OpenMPI 4.0.1
>
> Everything looks good (CUDA programs work fine, and MPI programs run on
> the CPUs without any problem), and ompi_info reports what I was expecting
> (though maybe I'm missing something):
>
>
> mca:opal:base:param:opal_built_with_cuda_support:synonym:name:mpi_built_with_cuda_support
>
> mca:mpi:base:param:mpi_built_with_cuda_support:value:true
>
> mca:mpi:base:param:mpi_built_with_cuda_support:source:default
>
> mca:mpi:base:param:mpi_built_with_cuda_support:status:read-only
>
> mca:mpi:base:param:mpi_built_with_cuda_support:level:4
>
> mca:mpi:base:param:mpi_built_with_cuda_support:help:Whether CUDA GPU
> buffer support is built into library or not
>
> mca:mpi:base:param:mpi_built_with_cuda_support:enumerator:value:0:false
>
> mca:mpi:base:param:mpi_built_with_cuda_support:enumerator:value:1:true
>
> mca:mpi:base:param:mpi_built_with_cuda_support:deprecated:no
>
> mca:mpi:base:param:mpi_built_with_cuda_support:type:bool
>
>
> mca:mpi:base:param:mpi_built_with_cuda_support:synonym_of:name:opal_built_with_cuda_support
>
> mca:mpi:base:param:mpi_built_with_cuda_support:disabled:false
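>
> Those lines come from the parsable ompi_info output; for reference, a
> filter along these lines reproduces them:
>
>     ompi_info --all --parsable | grep built_with_cuda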
>
> The available BTLs are the usual self, openib, tcp & vader, plus smcuda,
> uct & usnic. The full output from ompi_info is attached. If I add the flag
> '--mca opal_cuda_verbose 10', it doesn't print anything, which seems
> consistent with the GPU not being used. Adding '--mca btl smcuda' makes no
> difference. I have also tried telling the benchmark to use host and device
> buffers (e.g. mpirun -np 2 ./osu_latency D H), but I get the same result.
> I am probably missing something, but I'm not sure where else to look or
> what else to try.
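>
> For reference, the runs look like this (trimmed to the relevant flags; the
> D/H arguments tell osu_latency to use a device or host buffer on each
> rank):
>
>     # baseline: device buffer on rank 0, host buffer on rank 1
>     mpirun -np 2 ./osu_latency D H
>     # same run with CUDA verbosity requested; it prints nothing extra
>     mpirun -np 2 --mca opal_cuda_verbose 10 ./osu_latency D H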
>
> Thank you,
>
> AFernandez
>
>