On 13-Jan-12 12:23 AM, Nathan Hjelm wrote:
> I would start by adjusting btl_openib_receive_queues. The default uses
> a per-peer QP, which can eat up a lot of memory. I recommend using no
> per-peer queues and several shared receive queues instead.
> We use S,4096,1024:S,12288,512:S,65536,512

And here's the FAQ entry that explains the logic behind this voodoo option:

http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues
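If it helps, here is roughly how you would pass Nathan's suggested value
(the process count and executable name below are just placeholders):

  mpirun -np 64 \
      --mca btl_openib_receive_queues "S,4096,1024:S,12288,512:S,65536,512" \
      ./your_app

or, to make it the default for every run, put the same value in
$HOME/.openmpi/mca-params.conf:

  btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512

Each S,<size>,<num>[,...] entry describes one shared receive queue
(buffer size in bytes, then the number of buffers), so you can shrink the
buffer counts further if memory is still tight; see the FAQ entry above
for the exact field meanings. You can check the current default with
"ompi_info --param btl openib".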

-- YK

> -Nathan
> 
> On Thu, 12 Jan 2012, V. Ram wrote:
> 
>> Open MPI IB Gurus,
>>
>> I have some slightly older InfiniBand-equipped nodes which have less
>> RAM than we'd like, and on which we tend to run jobs that can span
>> 16-32 nodes of this type. The jobs themselves tend to run on the
>> heavy side in terms of their own memory requirements.
>>
>> When we used to run with an older Intel MPI, these jobs managed to run
>> within the available RAM without paging out to disk. Now, using Open MPI
>> 1.5.3, we can end up paging to disk or even running out of memory for
>> the same codes, the exact same jobs, and the same node distributions.
>>
>> I suspect that I can reduce overall memory consumption by tuning the
>> IB-related memory that Open MPI consumes. I've looked at the FAQ:
>> http://www.open-mpi.org/faq/?category=openfabrics#limiting-registered-memory-usage
>> but I'm still not certain where I should start. Again, this is all for
>> 1.5.3 (we are willing to update to 1.5.4 or 1.5.5 when released, if it
>> would help).
>>
>> 1. It looks like there are several independent IB BTL MCA parameters to
>> try adjusting:
>>    i.   mpool_rdma_rcache_size_limit
>>    ii.  btl_openib_free_list_max
>>    iii. btl_openib_max_send_size
>>    iv.  btl_openib_eager_rdma_num
>>    v.   btl_openib_max_eager_rdma
>>    vi.  btl_openib_eager_limit
>> Have I missed any other parameters that impact InfiniBand-related memory
>> usage? These parameters are listed as affecting registered memory. Are
>> there parameters that affect unregistered IB-related memory consumption
>> on the part of Open MPI itself?
>>
>> 2. Where should I start with this? For example, is it worth trying to
>> adjust any of the eager parameters, or does the bulk of the memory
>> requirement come from mpool_rdma_rcache_size_limit?
>>
>> 3. Are there any gross/overall "master" parameters that will set limits,
>> but keep the various buffers in intelligent proportion to one another,
>> or will I need to manually adjust each set of buffers independently? If
>> the latter, are there any guidelines on the relative proportions between
>> buffers, or overall recommendations?
>>
>> Thank you very much.
>>
>> -- 
>> http://www.fastmail.fm - A fast, anti-spam email service.
>>
