If this is a QLogic system, why not try PSM2 (--mca pml cm --mca mtl psm2)? I'm not sure how good UCX support is on these systems, and psm2 is the vendor's library. I'm not sure what the right link to the current version is, but I found this one: https://github.com/cornelisnetworks/opa-psm2

-Nathan
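P.S. As an untested sketch (assuming your Open MPI build includes the psm2 MTL), the osu_bw run would become:

bash-5.1$ mpirun --mca pml cm --mca mtl psm2 osu_bw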
On Sep 30, 2024, at 10:18 AM, Patrick Begou via users <users@lists.open-mpi.org> wrote:

Hi,

I'm working on refreshing an old cluster with AlmaLinux 9 (instead of CentOS 6 😕) and building a fresh Open MPI 5.0.5 environment. I've reached the step where Open MPI begins to work with UCX 1.17 and PMIx 5.0.3, but not completely. The nodes use a QLogic QDR HBA with a managed QLogic switch (40 Gb/s), plus 1 Gb/s Ethernet, and I have limited knowledge of the software stack that UCX now requires for this hardware.

This is the output of the osu_bw test between 2 nodes (in a Slurm context):

bash-5.1$ mpirun --mca pml ucx --mca osc ucx --mca scoll ucx --mca atomic ucx osu_bw
# OSU MPI Bandwidth Test v7.4
# Datatype: MPI_CHAR.
# Size      Bandwidth (MB/s)
1                       0.30
2                       0.59
4                       1.16
8                       2.33
16                      4.78
32                      9.46
64                     18.80
128                    36.21
256                    69.61
512                   142.48
1024                  256.41
2048                  498.27
4096                  719.19
8192                 1010.86
16384                1416.17
32768                1935.44
65536                2509.17
131072               2786.79
262144               2401.26
524288                500.32
1048576               854.12
2097152              3114.28
4194304              1830.78

The options come from https://docs.open-mpi.org/en/main/tuning-apps/networking/ib-and-roce.html; without them, it uses the slow 1 Gb/s Ethernet interface.
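As a side note, UCX can also be restricted to the IB port explicitly via UCX_NET_DEVICES. This is an untested sketch that assumes the QLogic HCA appears as qib0 (ucx_info -d lists the actual device names):

bash-5.1$ export UCX_NET_DEVICES=qib0:1
bash-5.1$ mpirun --mca pml ucx osu_bw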
Running the osu_bibw test is worse: as soon as the message size increases, bandwidth collapses as if some congestion occurs.

# OSU MPI Bi-Directional Bandwidth Test v7.4
# Datatype: MPI_CHAR.
# Size      Bandwidth (MB/s)
1                       0.52
2                       1.04
4                       2.08
8                       4.18
16                      8.37
32                     16.76
64                     33.11
128                    65.93
256                   130.89
512                   248.77
1024                  492.23
2048                 1024.23
4096                 1622.98
8192                 2352.29
16384                1724.83
32768                2309.67
65536                2538.13
131072               2586.15
262144                 95.93
524288                 42.83
1048576                63.14
2097152                78.81
4194304               129.66
1) I've built UCX 1.17.0 with the GCC 11.4 provided by the OS, as I need a thread-safe version (suggested by Gilles Gouaillardet when I was building UCX for Open MPI 4.0.4 on another cluster with HDR100 and had some performance troubles):

../ucx/contrib/configure-release --enable-mt
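As an aside, ucx_info can show which devices and transports UCX actually detects on a node (a quick check, assuming the UCX packages put it on the PATH):

bash-5.1$ ucx_info -d | grep -i -e transport -e device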
2) I've built a fresh PMIx 5.0.3 with the GCC 11.4 provided by the OS, without specific options:

prefix=/usr build_srpm=yes build_multiple=yes ./buildrpm.sh ../../pmix-5.0.3.tar.bz2
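To double-check which PMIx ends up installed (assuming the RPMs put pmix_info on the PATH):

bash-5.1$ pmix_info | head

This should report version 5.0.3.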
3) Slurm is built with PMIx and UCX, with the GCC 11.4 provided by the OS.
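On the Slurm side, the list of available MPI plugin types can be checked with (assuming a standard Slurm install):

bash-5.1$ srun --mpi=list

pmix should appear among the reported types.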
4) Then I've built Open MPI with a fresh install of GCC 14.2 (to have a correct version of the Fortran module). Configure command line:

'--enable-mpirun-prefix-by-default' '--prefix=/opt/GCC14/OpenMPI/5.0.5' '--enable-mpi1-compatibility' '--with-slurm'
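For completeness, an untested variant that would point configure explicitly at the external UCX and PMIx installs (the paths below are placeholders, not my real ones):

./configure --prefix=/opt/GCC14/OpenMPI/5.0.5 --enable-mpi1-compatibility \
            --with-slurm --with-ucx=/path/to/ucx --with-pmix=/path/to/pmix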
PATH and LD_LIBRARY_PATH are set via the Environment Modules tool.

Using the old deployment of this cluster (same QLogic HBA and IB switch), based on Open MPI 3.1.3rc1 with openib and GCC 7.3, it works fine. Configure command line: '--prefix=/share/apps/GCC73/openmpi/31-patch' '--enable-mpirun-prefix-by-default' '--disable-dlopen' '--enable-mpi-cxx' '--without-slurm' '--enable-mpi-thread-multiple'

# OSU MPI Bi-Directional Bandwidth Test v7.4
# Datatype: MPI_CHAR.
# Size      Bandwidth (MB/s)
1                       1.93
....
1048576              6034.23
2097152              6028.31
4194304              6033.63
The basic AlmaLinux packages deployed to manage the InfiniBand network are:
- kernel-lt => required for the ib_qib module, which is not available with AlmaLinux 9
- kernel-lt-devel
- infiniband-diags
- libibumad
- rdma-core
- ib_qib-ibverbs

My thread-safe UCX packages deployed:
- ucx-threadsafe-1.17.0-1.el9.x86_64
- ucx-threadsafe-devel-1.17.0-1.el9.x86_64
- ucx-threadsafe-ib-1.17.0-1.el9.x86_64
- ucx-threadsafe-rdmacm-1.17.0-1.el9.x86_64
- ucx-threadsafe-cma-1.17.0-1.el9.x86_64
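For what it's worth, infiniband-diags gives a quick way to confirm the qib link itself is up at QDR:

bash-5.1$ ibstat

The port shown there should read State: Active and Rate: 40 for a healthy QDR link.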
Maybe I'm wrong there too. Thanks all for your help.

Patrick
