Dear Open-MPI experts,

I have updated my little cluster from Scientific Linux 6.5 to 6.6.
The update included extensive changes to the InfiniBand drivers and a
newer Open MPI version (1.8.1). Now I'm getting this message on all nodes
with more than 32 GB of RAM:


WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:              pax98
  Registerable memory:     32768 MiB
  Total memory:            49106 MiB

Your MPI job will continue, but may be behave poorly and/or hang.


The issue is similar to the one described in a previous thread about
Ubuntu nodes:
http://www.open-mpi.org/community/lists/users/2014/08/25090.php
However, the InfiniBand driver is different here. The parameters
log_num_mtt and log_mtts_per_seg both still exist, but they cannot be
changed and have the same values on all configurations:
[pax52] /root # cat /sys/module/mlx4_core/parameters/log_num_mtt
0
[pax52] /root # cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
3

The kernel changelog says that Red Hat has included this commit:
mlx4: Scale size of MTT table with system RAM (Doug Ledford)
so everything should be fine, since the buffers scale automatically.
However, as far as I can see, the wrong value calculated by
calculate_max_reg() is still used in the code, so I don't think I can
simply ignore the warning. Also, a user has reported a problem with a
job, but I cannot confirm that this is the cause.
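
As a rough cross-check of the numbers above, the Open MPI FAQ item linked
in the warning gives the registerable memory for mlx4 as
(2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE. The sketch below
applies that formula to the values read from sysfs. It assumes a 4 KiB
page size and, as a guess that has not been checked against the Open MPI
source, that a log_num_mtt of 0 is replaced by a default of 20. Under
those assumptions the result matches the 32768 MiB in the warning. The
function name and the fallback constant are made up for illustration.

```python
# Estimate mlx4 registerable memory from the module parameters.
# Formula from the Open MPI FAQ:
#   max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE

# ASSUMPTION: when log_num_mtt reads 0, a default of 20 is used instead.
# This fallback is a guess that reproduces the value in the warning.
ASSUMED_DEFAULT_LOG_NUM_MTT = 20


def registerable_mib(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Return the estimated registerable memory in MiB."""
    if log_num_mtt == 0:
        log_num_mtt = ASSUMED_DEFAULT_LOG_NUM_MTT
    max_reg_bytes = (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size
    return max_reg_bytes // (1024 * 1024)


if __name__ == "__main__":
    reg = registerable_mib(log_num_mtt=0, log_mtts_per_seg=3)
    total = 49106  # "Total memory" reported for pax98, in MiB
    print(f"registerable: {reg} MiB, total: {total} MiB")
    # ASSUMPTION: the warning appears when registerable < total.
    print("warning expected" if reg < total else "no warning expected")
```

If the fallback assumption holds, this would also explain why only nodes
with more than 32 GB of RAM show the warning.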

My workaround was to simply load the mlx5_core kernel module, as the
presence of this module is used by calculate_max_reg() to detect OFED 2.0.
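
To see on a given node whether the workaround is in effect, something
like the following minimal sketch could be used. It relies on loaded
kernel modules appearing as directories under /sys/module. Whether
calculate_max_reg() tests exactly this path is an assumption. The helper
name module_loaded and its sysfs_root parameter are hypothetical and only
serve the illustration.

```python
import os


def module_loaded(name, sysfs_root="/sys/module"):
    """Return True if a directory for the kernel module exists in sysfs."""
    return os.path.isdir(os.path.join(sysfs_root, name))


if __name__ == "__main__":
    # Report the state of both Mellanox core modules on this node.
    for mod in ("mlx4_core", "mlx5_core"):
        state = "loaded" if module_loaded(mod) else "not loaded"
        print(f"{mod}: {state}")
```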

Regards, Götz Waschk
