Thanks Ralph, will do.
Cheers,
Mark
On Wed, 18 Oct 2017, r...@open-mpi.org wrote:
Put “oob=tcp” in your default MCA param file
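As a minimal sketch of that setting, assuming the per-user parameter file at $HOME/.openmpi/mca-params.conf (the system-wide equivalent is normally <prefix>/etc/openmpi-mca-params.conf, where <prefix> stands for wherever Open MPI is installed on your system):

```
# Select the TCP out-of-band (oob) component so the "ud" component is not used
oob = tcp
```

For a single run, the same selection can be made with "mpirun --mca oob tcp ..." or by exporting OMPI_MCA_oob=tcp in the environment.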
On Oct 18, 2017, at 9:00 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
Hi,
We're intermittently seeing messages (below) about failing to register memory
with Open MPI 2.0.2 on CentOS 7 with Mellanox FDR ConnectX-3 and the vanilla IB
stack as shipped by CentOS.
We're not using any mlx4_core module tweaks at the moment. On earlier machines
we used to set registered memory as per the FAQ, but neither log_num_mtt nor
num_mtt seems to exist these days (according to
/sys/module/mlx4_*/parameters/*), which makes it somewhat difficult to follow
the FAQ.
The output of 'ulimit -l' shows as unlimited for every rank.
Does anyone have any advice, please?
Thanks,
Mark
-------------------------------------------------------------------------
Failed to register memory region (MR):
Hostname: dc1s0b1c
Address: ec5000
Length: 20480
Error: Cannot allocate memory
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that there are UD-capable Verbs devices on your
system, but none of them were able to be setup properly. This may
indicate a problem on this system.
You job will continue, but Open MPI will ignore the "ud" oob component
in this run.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users