Sure. The code I'm using is the latest version of Wombat
(https://bitbucket.org/pmendygral/wombat-public/wiki/Home); I'm running
an unreleased, updated version, as I know the devs. I'm setting
OMP_NUM_THREADS=12 and the command line is:

mpirun -np 16 --hostfile hosts ./wombat

where the host file lists 4 machines, so 4 ranks per machine and 12
threads per rank (each node has 48 Intel Cascade Lake cores). I've also
tried the Slurm launcher equivalent:

srun -n 16 -c 12 --mpi=pmix ./wombat

which also hangs.
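For completeness, the fully explicit mapping I'd expect to be
equivalent (assuming 4.0.x accepts combining the ppr and PE map-by
modifiers; PE=12 gives each rank 12 cores) would be:

mpirun -np 16 --hostfile hosts -x OMP_NUM_THREADS=12 --map-by ppr:4:node:PE=12 --bind-to core ./wombat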
Either way, it works if I constrain the run to one or two nodes, but
anything larger hangs. As for the network hardware:
[root@holy7c02101 ~]# ibstat
CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.25.6000
        Hardware version: 0
        Node GUID: 0xb8599f0300158f20
        System image GUID: 0xb8599f0300158f20
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 100
                Base lid: 808
                LMC: 1
                SM lid: 584
                Capability mask: 0x2651e848
                Port GUID: 0xb8599f0300158f20
                Link layer: InfiniBand

[root@holy7c02101 ~]# lspci | grep Mellanox
58:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
As for the IB RDMA kernel stack, we are using the default drivers that
ship with CentOS 7.6.1810, i.e. rdma-core 17.2-3.
I will note that I successfully ran an old version of Wombat on all
30,000 cores of this system earlier this week using OpenMPI 3.1.3 and
regular IB Verbs with no problem, though that was pure MPI ranks with
no threads. So the fabric itself is healthy and in good shape; it seems
to be this specific edge case, the latest OpenMPI with UCX and threads,
that is causing the hangs. To be sure, the latest version of Wombat
(and, I believe, the public version as well) uses many of the
state-of-the-art MPI RMA direct calls, so it's definitely pushing the
envelope in ways our typical user base here will not. Still, it would
be good to iron out this kink so that if users do hit it we have a
solution.
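To give a sense of the pattern without shipping Wombat itself, here is
an illustrative C sketch (my own, not Wombat's actual source) of the
combination in play: every OpenMP thread on every rank issuing
passive-target RMA concurrently.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* Request full thread support; bail out if the library can't give it. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not provided\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* One window slot per thread on every rank. */
    int nthreads = omp_get_max_threads();
    double *base;
    MPI_Win win;
    MPI_Win_allocate((MPI_Aint)nthreads * sizeof(double), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &base, &win);

    /* Passive-target epoch shared by all threads. */
    MPI_Win_lock_all(0, win);
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int target = (rank + 1) % nranks;
        double val = 100.0 * rank + tid;

        /* Concurrent puts and flushes from every thread. */
        MPI_Put(&val, 1, MPI_DOUBLE, target, (MPI_Aint)tid,
                1, MPI_DOUBLE, win);
        MPI_Win_flush(target, win);
    }
    MPI_Win_unlock_all(win);

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

Built with "mpicc -fopenmp repro.c -o repro" and launched with the same
mpirun/srun lines as above, this should exercise the same
threads-plus-RMA path, though I haven't verified that it reproduces the
hang.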
As noted, UCX is very new to us, so it is entirely possible that we are
missing something in its interaction with OpenMPI. Our MPI is compiled
per this spec file:
https://github.com/fasrc/helmod/blob/master/rpmbuild/SPECS/centos7/openmpi-4.0.1-fasrc01.spec
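One thing we still plan to try is pinning the UCX components explicitly
and turning up UCX's logging. Assuming I have the component names right
(pml ucx and osc ucx being the UCX-backed paths in 4.0.x), that would
be:

mpirun -np 16 --hostfile hosts --mca pml ucx --mca osc ucx -x UCX_LOG_LEVEL=debug ./wombat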
I will note that when I originally built this MPI, it was against the
default version of UCX that comes with EPEL (1.5.1). We only built
1.6.0 ourselves because the EPEL package is not built with MT enabled,
which seems strange to me, as I don't see any reason not to enable MT.
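For the record, our 1.6.0 build was along these lines (the prefix is
illustrative; --enable-mt and --with-ucx are the actual switches):

./contrib/configure-release --prefix=/opt/ucx-1.6.0 --enable-mt
make -j && make install

with OpenMPI then configured using --with-ucx=/opt/ucx-1.6.0.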
Anyway, that's the deeper context.
-Paul Edmon-
On 8/23/2019 5:49 PM, Joshua Ladd via users wrote:
Paul,
Can you provide a repro and command line, please? Also, what network
hardware are you using?
Josh
On Fri, Aug 23, 2019 at 3:35 PM Paul Edmon via users
<users@lists.open-mpi.org> wrote:
I have a code using MPI_THREAD_MULTIPLE along with MPI-RMA that I'm
running with OpenMPI 4.0.1. Since 4.0.1 requires UCX, I have it
installed with MT on (a 1.6.0 build). The thing is that the code keeps
stalling out when I go above a couple of nodes. UCX is new to our
environment; previously we just used the regular IB Verbs with no
problem. My guess is that there is either some option in OpenMPI I am
missing or some variable in UCX I am not setting. Any insight on what
could be causing the stalls?
-Paul Edmon-
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users