On 13-Jan-12 12:23 AM, Nathan Hjelm wrote:

> I would start by adjusting btl_openib_receive_queues. The default
> uses a per-peer QP, which can eat up a lot of memory. I recommend
> using no per-peer QP and several shared receive queues. We use:
>
> S,4096,1024:S,12288,512:S,65536,512
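For example (a sketch; the application name and process count are
placeholders), that value can be passed on the mpirun command line, or
made persistent in the per-user MCA parameter file:

    mpirun --mca btl_openib_receive_queues \
        S,4096,1024:S,12288,512:S,65536,512 \
        -np 64 ./your_app

    # or in $HOME/.openmpi/mca-params.conf:
    btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512

Each S,<bytes>,<count> entry defines one shared receive queue of
<count> buffers of <bytes> each, so the registered receive-buffer
memory is roughly the sum of bytes x count over the queues (here about
4 MiB + 6 MiB + 32 MiB = 42 MiB), shared by all peers instead of
growing with the number of per-peer QPs.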
And here's the FAQ entry that explains the logic behind this voodoo
option:

http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues

-- YK

> -Nathan
>
> On Thu, 12 Jan 2012, V. Ram wrote:
>
>> Open MPI IB Gurus,
>>
>> I have some slightly older InfiniBand-equipped nodes which have less
>> RAM than we'd like, and on which we tend to run jobs that span 16-32
>> nodes of this type. The jobs themselves are also on the heavy side
>> in terms of their own memory requirements.
>>
>> When we ran these jobs under an older Intel MPI, they fit within the
>> available RAM without paging out to disk. Now, using Open MPI 1.5.3,
>> we can end up paging to disk, or even running out of memory, for the
>> same codes and the exact same jobs and node distributions.
>>
>> I suspect I can reduce overall memory consumption by tuning the
>> IB-related memory that Open MPI consumes. I've looked at the FAQ:
>>
>> http://www.open-mpi.org/faq/?category=openfabrics#limiting-registered-memory-usage
>>
>> but I'm still not certain where I should start. Again, this is all
>> for 1.5.3 (we are willing to update to 1.5.4, or to 1.5.5 when
>> released, if that would help).
>>
>> 1. It looks like there are several independent IB BTL MCA parameters
>> to try adjusting: i. mpool_rdma_rcache_size_limit, ii.
>> btl_openib_free_list_max, iii. btl_openib_max_send_size, iv.
>> btl_openib_eager_rdma_num, v. btl_openib_max_eager_rdma, vi.
>> btl_openib_eager_limit. Have I missed any other parameters that
>> affect InfiniBand-related memory usage? These parameters are listed
>> as affecting registered memory. Are there parameters that affect
>> unregistered IB-related memory consumption on the part of Open MPI
>> itself?
>>
>> 2. Where should I start? For example, is it worth adjusting any of
>> the eager parameters, or does the bulk of the memory consumption
>> come from mpool_rdma_rcache_size_limit?
>>
>> 3. Are there any gross/overall "master" parameters that will set
>> limits but keep the various buffers in intelligent proportion to one
>> another, or will I need to adjust each set of buffers independently?
>> If the latter, are there any guidelines on the relative proportions
>> between buffers, or overall recommendations?
>>
>> Thank you very much.
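A practical first step (a sketch; the trial value and the run command
are only illustrations): list the compiled-in defaults for the
parameters in question 1 with ompi_info, then trial candidate values
per-run with --mca before committing them to a config file:

    # Defaults for the openib BTL parameters (eager limits, free
    # lists, max send size, receive queues, ...):
    ompi_info --param btl openib

    # Defaults for the rdma mpool, including
    # mpool_rdma_rcache_size_limit:
    ompi_info --param mpool rdma

    # Trial a smaller eager limit for a single run:
    mpirun --mca btl_openib_eager_limit 4096 -np 64 ./your_app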