Thanks Ralph, will do.

Cheers,

Mark

On Wed, 18 Oct 2017, r...@open-mpi.org wrote:

Put “oob=tcp” in your default MCA param file
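
(A minimal sketch of what that could look like, assuming the per-user
parameter file at $HOME/.openmpi/mca-params.conf. A system-wide install
would normally use $prefix/etc/openmpi-mca-params.conf instead, where
$prefix is the Open MPI installation directory. The comment lines are
illustrative and not part of the original advice.)

```
# $HOME/.openmpi/mca-params.conf  (assumed per-user location)
# Restrict the out-of-band (oob) framework to the TCP component,
# so the "ud" component is not selected at startup.
oob = tcp
```

For a single run, the same setting can likely be passed on the command
line as "mpirun --mca oob tcp ..." or through the environment variable
OMPI_MCA_oob=tcp.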

On Oct 18, 2017, at 9:00 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:

Hi,

We're intermittently seeing messages (below) about failing to register memory 
with Open MPI 2.0.2 on CentOS 7 / Mellanox FDR ConnectX-3 and the vanilla IB 
stack as shipped by CentOS.

We're not using any mlx4_core module tweaks at the moment. On earlier machines 
we used to set registered memory as per the FAQ, but neither log_num_mtt nor 
num_mtt seems to exist these days (according to 
/sys/module/mlx4_*/parameters/*), which makes it difficult to follow the FAQ.

The output of 'ulimit -l' shows as unlimited for every rank.

Does anyone have any advice, please?

Thanks,

Mark

-------------------------------------------------------------------------
Failed to register memory region (MR):

Hostname: dc1s0b1c
Address:  ec5000
Length:   20480
Error:    Cannot allocate memory
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that there are UD-capable Verbs devices on your
system, but none of them were able to be setup properly.  This may
indicate a problem on this system.

You job will continue, but Open MPI will ignore the "ud" oob component
in this run.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

