Joshua,

I am using a job scheduling system, so ulimit -v is set by me. Nevertheless, the 
ulimit -l is always about half the value of ulimit -v. This is a bit strange; I am 
not sure whether this might be an issue (31GB and 15.6GB are decent values).

For completeness, here is the output of ulimit -a from one of the nodes:

core file size          (blocks, -c) 1
data seg size           (kbytes, -d) 32768000
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515032
max locked memory       (kbytes, -l) 16460684
max memory size         (kbytes, -m) 56047808
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) 2400
max user processes              (-u) 16308
virtual memory          (kbytes, -v) 32768000
file locks                      (-x) unlimited
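
For reference, the kbytes figures above can be converted to GiB and compared with a short calculation. This is only a minimal sketch in Python: the two values are copied from the -l and -v lines of the output, and the helper name kib_to_gib is illustrative.

```python
# Values copied from the ulimit output above (both reported in kbytes).
MAX_LOCKED_KIB = 16460684   # max locked memory (-l)
VIRTUAL_KIB = 32768000      # virtual memory (-v)


def kib_to_gib(kib):
    """Convert a ulimit value in kbytes (1024-byte units) to GiB."""
    return kib / (1024 * 1024)


locked_gib = kib_to_gib(MAX_LOCKED_KIB)
virtual_gib = kib_to_gib(VIRTUAL_KIB)
ratio = MAX_LOCKED_KIB / VIRTUAL_KIB

print(f"locked:  {locked_gib:.2f} GiB")
print(f"virtual: {virtual_gib:.2f} GiB")
print(f"locked / virtual = {ratio:.3f}")
```

The ratio comes out slightly above one half, so the locked-memory limit is close to, but not exactly, half of the virtual-memory limit.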

Best Regards
Alex

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Joshua Ladd
Sent: Friday, June 20, 2014 9:15 PM
To: Open MPI Users
Subject: Re: [OMPI users] btl_openib_connect_oob.c:867:rml_recv_cb error after 
Infini-band stack update.

Aleksandar,
Please ensure your system administrator follows the guidelines outlined in the 
link printed in the error message:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
Best,
Josh

On Fri, Jun 20, 2014 at 2:56 PM, Ivanov, Aleksandar (INR) 
<aleksandar.iva...@kit.edu<mailto:aleksandar.iva...@kit.edu>> wrote:
Hi,

Unfortunately, I was not the one who updated the machine; however, I can ask my 
colleagues for a specific list of the modifications made. If I understand you 
correctly, you are referring to the “ulimit” parameters. They are properly set; 
in fact, we use JMS as the job scheduler, so the “ulimit -v” value is set by the 
user. In my case I used 31GB per MPI process.
The stack size is set to unlimited.




From: users 
[mailto:users-boun...@open-mpi.org<mailto:users-boun...@open-mpi.org>] On 
Behalf Of Ralph Castain
Sent: Friday, June 20, 2014 8:42 PM
To: Open MPI Users
Subject: Re: [OMPI users] btl_openib_connect_oob.c:867:rml_recv_cb error after 
Infini-band stack update.

What was updated? If the OS, did you remember to set the memory registration 
limits to max?
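
One way to see which locked-memory (registration) limit a process actually inherits is to query it from inside that process. The following is a minimal sketch in Python using the standard resource module; the helper name describe_limit is illustrative and is not part of Open MPI.

```python
import resource


def describe_limit(value):
    """Format an rlimit value (in bytes) as 'unlimited' or a GiB figure."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**3:.2f} GiB"


# RLIMIT_MEMLOCK is the limit that `ulimit -l` reports (there in kbytes).
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("max locked memory (soft):", describe_limit(soft))
print("max locked memory (hard):", describe_limit(hard))
```

Running this inside a batch job, rather than in a login shell, should show the limit that processes started by the scheduler would see, which may differ from the interactive value.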


On Jun 20, 2014, at 11:25 AM, Ivanov, Aleksandar (INR) 
<aleksandar.iva...@kit.edu<mailto:aleksandar.iva...@kit.edu>> wrote:


Dear Sir or Madam,

I am using the openmpi 1.6.5 library compiled with IFORT / ICC 13.1.5. Since a 
recent update of our machine I have been getting MPI errors. The code crashes 
after completing approx. 24 % of the total job. The same code and input were 
run before on the same machine and no such problems were ever observed. The 
actual error message is attached.
I presume that the update may have introduced an incompatibility between the 
InfiniBand stack and the openmpi library. I think that the suggested 
“out of memory problem” should not be causing the malfunction, since the 
application uses only 1GB of the total 32 GB available.

I would appreciate your help and any ideas on how to resolve this issue.

Thank you in advance

Best Regards

Aleksandar Ivanov




<openmpi.log>
_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/06/24685.php


_______________________________________________
users mailing list
us...@open-mpi.org<mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/06/24687.php
