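I’ve begun getting this annoyingly generic warning, too. It appears to be coming from the openib BTL (the help file named in the message is help-mpi-btl-openib.txt). Note that openib is a BTL component, not an MTL, so the way to disable it is --mca btl ^openib on the mpirun line; with that, the warning goes away.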
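
In case it helps, here is a minimal sketch of the two usual ways to set that. The process count and the solver arguments in the comment are placeholders, not taken from your actual job; the environment-variable form relies on Open MPI reading MCA parameters from variables named OMPI_MCA_<param>.

```shell
# Exclude the openib BTL for every mpirun launched from this shell.
# Quoting keeps the shell from treating '^' specially.
export OMPI_MCA_btl='^openib'

# Equivalent one-off form on the command line
# (-np 20 and the solver arguments are placeholders):
#   mpirun --mca btl '^openib' -np 20 snappyHexMesh -parallel
```

The exported variable is handy if the job is started from a script you would rather not edit; the command-line form makes the choice visible for a single run.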
Sent from my iPad

> On Mar 13, 2021, at 3:28 PM, Bob Beattie via users <users@lists.open-mpi.org> wrote:
>
> Hi everyone,
>
> To be honest, as an MPI / IB noob, I don't know if this falls under OpenMPI or Mellanox....
>
> Am running a small cluster of HP DL380 G6/G7 machines.
> Each runs Ubuntu server 20.04 and has a Mellanox ConnectX-3 card, connected by an IS dumb switch.
> When I begin my MPI program (snappyHexMesh for OpenFOAM) I get an error reported.
> The error doesn't stop my programs or appear to cause any problems, so this request for help is more about delving into the why.
>
> OMPI is compiled from source using v4.0.3; which is the default version for Ubuntu 20.04
> This compiles and works. I did this because I wanted to understand the compilation process whilst using a known working OMPI version.
>
> The Infiniband part is the Mellanox MLNXOFED installer v4.9-0.1.7.0 and I install that with --dkms --without-fw-update --hpc --with-nfsrdma
>
> The actual error reported is:
> Warning: There was an error initialising an OpenFabrics device.
> Local host: of1
> Local device: mlx4_0
>
> Then shortly after:
> [of1:1015399] 19 more processes have sent help message help-mpi-btl-openib.txt / error in device init
> [of1:1015399] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Adding this MCA parameter to the mpirun line simply gives me 20 or so copies of the first warning.
>
> Any ideas anyone ?
> Cheers,
> Bob.