Hi!

After solving my last problem with the help of this list (thanks
again :) I have run into another problem, this time regarding memory
allocation for the openib component.
If I try to run an arbitrary MPI program, e.g. with
    $ orterun -np 2 --bynode --host node01,node02 \
          --prefix /usr/local ./mpptest -gnuplot
the following error appears:
    [0,1,1][btl_openib.c:787:mca_btl_openib_module_init] error creating
        high priority cq for mthca0 errno says Cannot allocate memory
Apparently, the error occurs only on node02 and not on the local node01,
although both are configured identically. The hosts were cloned using
SystemImager, and the problem is symmetric: whichever node I launch
from, it always fails on the remote host.
The FAQ (see
http://www.open-mpi.org/faq/?category=infiniband#ib-locked-pages)
attributes this to insufficient user permissions for locking memory. So
I adjusted /etc/security/limits.conf and set the hard and soft limits to
unlimited, but the error remains. The limits are applied correctly, as
the command
    $ orterun -np 2 --bynode --host node01,node02 \
          --prefix /usr/local /bin/bash -c ulimit -l
    unlimited
    unlimited
indicates.
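
One caveat about the check above: bash -c takes its command as a single
string, so in the unquoted form "-l" becomes $0 of the new shell and
ulimit runs without options, in which case bash reports the file size
limit (-f) rather than the locked-memory limit. A minimal sketch of a
quoted check that prints both the soft and the hard locked-memory limit
(the helper name memlock_limits is hypothetical, only for illustration):

```shell
# memlock_limits is a hypothetical helper name, not part of Open MPI.
memlock_limits() {
    # Single quotes keep the whole command as ONE argument to bash -c,
    # so -S/-H/-l really reach ulimit instead of becoming $0, $1, ...
    bash -c 'printf "%s soft=%s hard=%s\n" "$(uname -n)" "$(ulimit -S -l)" "$(ulimit -H -l)"'
}

# Prints one line: node name, then soft and hard limit (KiB or "unlimited").
memlock_limits
```

Across both nodes, the same quoting would be applied to the earlier
command, i.e. ending it in /bin/bash -c 'ulimit -l'. Whether orterun
passes the quoted string through unchanged to the remote host is an
assumption here and should be verified.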

The libraries involved are OpenMPI 1.0.2-a7 with libibverbs-1.0-rc5 and
libmthca-1.0-rc5 on Debian sarge with kernel 2.6.15 (from
www.backports.org). There are 8 GB of RAM and 16 GB of swap available.
While the program is running, less than 1 GB is used. The CQ size is at
its default (1000).

Thanks,
  Emanuel
