Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Jeff Squyres
On Sep 12, 2011, at 12:39 PM, Shamis, Pavel wrote: > OMPI Developers: > > Maybe we should consider disabling the use of per-peer queue pairs by > default. Do they buy us anything? For what it is worth, we have stopped > using them on all of our large systems here at LANL. > > It is cons-and-

Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Blosch, Edwin L
er 12, 2011 11:39 AM To: Open MPI Users Subject: EXTERNAL: Re: [OMPI users] qp memory allocation problem An alternative solution to the problem is to update your memory limits. Please see below: http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages Apparently your memory limit is low and

Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Shamis, Pavel
An alternative solution to the problem is to update your memory limits. Please see below: http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages Apparently your memory limit is low and the driver fails to create QPs. What happens when you add the following to your mpirun command? -mca btl

Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Teng Ma
I ran into a similar problem, possibly related to QP memory allocation. I ran an allgather across 768 processes with a 1 MB message size, binding by node (forcing every edge of Tuned's ring algorithm to go through IB links). The IMB test hung for more than 3 hours without any output. I don't know whe

Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Nathan Hjelm
I also recommend checking the log_mtts_per_seg parameter to the mlx4 module. This parameter controls how much memory can be registered for use by the mlx4 driver, and it should be in the range 1-5 (or 0-7, depending on the version of the mlx4 driver). I recommend the parameter be set such that y

Re: [OMPI users] qp memory allocation problem

2011-09-12 Thread Samuel K. Gutierrez
Hi, This problem can be caused by a variety of things, but I suspect our default queue pair (QP) parameters aren't helping the situation :-). What happens when you add the following to your mpirun command? -mca btl_openib_receive_queues S,4096,128:S,12288,128:S,65536,12 OMPI Developers: Mayb
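For readers unfamiliar with the btl_openib_receive_queues value quoted in the message above, here is a rough Python sketch that splits it into its colon-separated queue entries. It assumes that the first field of each entry is the queue type (S for shared receive queue), the second is the buffer size in bytes, and the third is the number of buffers; that reading is an assumption on my part, not something stated in this thread. The helper names parse_receive_queues and posted_receive_bytes are hypothetical, and the byte total is only back-of-the-envelope arithmetic, not a measurement of what the driver actually allocates.

```python
def parse_receive_queues(spec):
    """Split a btl_openib_receive_queues value into one dict per queue entry.

    Assumed layout of each entry (not confirmed by the thread):
    <type>,<buffer size in bytes>,<number of buffers>[,<further optional fields>]
    """
    queues = []
    for entry in spec.split(":"):
        fields = entry.split(",")
        numbers = [int(f) for f in fields[1:]]
        queues.append({
            "type": fields[0],
            "buffer_size": numbers[0],
            "num_buffers": numbers[1],
            "extra": numbers[2:],  # any trailing fields are kept uninterpreted
        })
    return queues


def posted_receive_bytes(queues):
    """Sum buffer_size * num_buffers over all entries (rough estimate only)."""
    return sum(q["buffer_size"] * q["num_buffers"] for q in queues)


# The value suggested in the message above.
spec = "S,4096,128:S,12288,128:S,65536,12"
queues = parse_receive_queues(spec)
for q in queues:
    print(q["type"], q["buffer_size"], q["num_buffers"])
print("approximate receive buffer bytes:", posted_receive_bytes(queues))
```

Under those assumptions, the suggested value describes three shared receive queues and no per-peer queues, which is consistent with the question raised in the thread about disabling per-peer queue pairs by default.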

[OMPI users] qp memory allocation problem

2011-09-12 Thread Blosch, Edwin L
I am getting this error message below and I don't know what it means or how to fix it. It only happens when I run on a large number of processes, e.g. 960. Things work fine on 480, and I don't think the application has a bug. Any help is appreciated... [c1n01][[30697,1],3][connect/btl_openi