Hi,

There is no functional openib BTL in the Open MPI 4.x series. Point-to-point communication over the InfiniBand interconnect is provided by the UCX PML. To get GPUDirect RDMA, UCX must have been configured with --with-cuda and --with-gdrcopy.

Regards,
Florent Germain
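For reference, a minimal build-and-run sketch along those lines might look like the following; the installation prefixes and the application name are placeholders to adapt to your site:

    # Build UCX with CUDA and gdrcopy support
    ./configure --prefix=/opt/ucx --with-cuda=/usr/local/cuda --with-gdrcopy=/opt/gdrcopy
    make -j && make install

    # Build Open MPI against that UCX, with CUDA support as well
    ./configure --prefix=/opt/openmpi --with-ucx=/opt/ucx --with-cuda=/usr/local/cuda
    make -j && make install

    # Check what the installed libraries were actually built with
    ucx_info -v
    ompi_info --parsable --all | grep mpi_built_with_cuda_support:value

    # Run, forcing the UCX PML (./your_cuda_app is a placeholder)
    mpirun --mca pml ucx -np 2 ./your_cuda_app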
-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Oskar Lappi via users
Sent: Wednesday, 1 July 2020 00:24
To: users@lists.open-mpi.org
Cc: Oskar Lappi <oskar.la...@abo.fi>
Subject: [OMPI users] openib BTL vs UCX. Which do I need to use GPUDirect RDMA?

Hi,

I'm trying to troubleshoot a problem: we don't seem to be getting the bandwidth we'd expect from our distributed CUDA program, where we're using Open MPI to pass data between GPUs in an HPC cluster. I thought I found a possible root cause, but now I'm unsure of how to fix it, since the documentation provides conflicting information.

Running ompi_info --all | grep "MCA btl" gives me the following output:

    MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.0.2)
    MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.0.2)
    MCA btl: smcuda (MCA v2.1.0, API v3.1.0, Component v4.0.2)
    MCA btl: self (MCA v2.1.0, API v3.1.0, Component v4.0.2)

According to this FAQ entry, https://www.open-mpi.org/faq/?category=runcuda, the openib BTL is a prerequisite for GPUDirect RDMA. However, I'm also reading that UCX is the preferred way to do RDMA and that it has CUDA support.

Can anyone tell me what a proper configuration for GPUDirect RDMA over InfiniBand looks like?

Best regards,
Oskar Lappi