Put “oob=tcp” in your default MCA param file

> On Oct 18, 2017, at 9:00 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
>
> Hi,
>
> We're intermittently seeing messages (below) about failing to register memory
> with openmpi 2.0.2 on centos7 / Mellanox FDR Connect-X 3 and the vanilla IB
> stack as shipped by centos.
>
> We're not using any mlx4_core module tweaks at the moment. On earlier
> machines we used to set registered memory as per the FAQ, but neither
> log_num_mtt nor num_mtt seem to exist these days (according to
> /sys/module/mlx4_*/parameters/*), which makes it somewhat difficult to follow
> the FAQ.
>
> The output of 'ulimit -l' shows as unlimited for every rank.
>
> Does anyone have any advice, please?
>
> Thanks,
>
> Mark
>
> -------------------------------------------------------------------------
> Failed to register memory region (MR):
>
> Hostname: dc1s0b1c
> Address: ec5000
> Length: 20480
> Error: Cannot allocate memory
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> Open MPI has detected that there are UD-capable Verbs devices on your
> system, but none of them were able to be setup properly. This may
> indicate a problem on this system.
>
> You job will continue, but Open MPI will ignore the "ud" oob component
> in this run.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
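
For anyone following along, the suggestion at the top of this message could be applied roughly as below. This is only a sketch, assuming the per-user parameter file at $HOME/.openmpi/mca-params.conf is the one you want to change; a system-wide install may instead use etc/openmpi-mca-params.conf under the Open MPI prefix, which is not touched here. It selects the TCP out-of-band component so the "ud" one is not tried.

```shell
# Assumption: the per-user MCA parameter file is the right place for this site.
conf="${HOME}/.openmpi/mca-params.conf"

# Create the directory if it does not exist yet.
mkdir -p "$(dirname "$conf")"

# Append the setting only when the file has no "oob" line already,
# so an existing choice is not silently duplicated or overridden.
grep -q '^oob[[:space:]]*=' "$conf" 2>/dev/null || echo 'oob = tcp' >> "$conf"
```

For a one-off test without editing any file, the same parameter can be passed per run with "mpirun --mca oob tcp ..." or by exporting OMPI_MCA_oob=tcp in the job environment.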