Re: [OMPI users] Are there ways to reduce the memory used by OpenMPI?

Jeff Squyres Sat, 3 Oct 2009 07:08:29 -0400

On Oct 1, 2009, at 2:56 PM, Blosch, Edwin L wrote:

Are there any tuning parameters that I can use to reduce the amount of memory used by OpenMPI? I would very much like to use OpenMPI instead of MVAPICH, but I’m on a cluster where memory usage is the most important consideration. Here are three results which capture the problem:
With the “leave_pinned” behavior turned on, I get good performance (19.528, lower is better)
mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile

FWIW, there have been a lot of improvements in Open MPI since the 1.2 series (including some memory reduction work) -- is it possible for you to upgrade to the latest 1.3 release?

/var/spool/torque/aux/7972.fwnaeglingio -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 --mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0
Compute rate (processor-microseconds/cell/cycle):   19.528
Total memory usage:    38155.3477 MB (38.1553 GB)
Turning off the leave_pinned behavior, I get considerably slower performance (28.788), but the memory usage is unchanged (still 38 GB)
mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/7972.fwnaeglingio -np 28 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0
Compute rate (processor-microseconds/cell/cycle):   28.788
Total memory usage:    38335.7656 MB (38.3358 GB)

I would guess that you are continually re-using the same communication buffers -- doing so will definitely be better with mpi_leave_pinned=1. Note, too, that mpi_leave_pinned is on by default for OpenFabrics networks in the Open MPI 1.3 series.
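
To illustrate why buffer re-use matters here, the following is a toy model only, not Open MPI's actual registration code. It assumes a hypothetical function `count_registrations` and a hypothetical buffer name, and it simply counts how often a buffer would have to be registered (pinned) when registrations are kept versus dropped after each transfer.

```python
def count_registrations(buffers, leave_pinned):
    """Count registrations for a sequence of sends in a toy model.

    `buffers` is the sequence of buffer identifiers used for each send.
    With leave_pinned=True a registered buffer stays in the cache;
    with leave_pinned=False every send registers its buffer again.
    """
    cache = set()
    registrations = 0
    for buf in buffers:
        if buf not in cache:
            registrations += 1
            if leave_pinned:
                cache.add(buf)  # keep the registration for later sends
    return registrations


# Hypothetical workload: the same buffer is sent 100 times.
sends = ["halo_buffer"] * 100
print("leave_pinned on: ", count_registrations(sends, leave_pinned=True))
print("leave_pinned off:", count_registrations(sends, leave_pinned=False))
```

In this simplified picture the re-used buffer is registered once when registrations are kept and once per send otherwise. Real behavior also depends on message size and the transfer protocol chosen, which the model ignores.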

Using MVAPICH, the performance is in the middle (23.6), but the memory usage is reduced by 5 to 6 GB out of 38 GB, a significant decrease to me.
/usr/mpi/intel/mvapich-1.1.0/bin/mpirun_rsh -ssh -np 28 -hostfile /var/spool/torque/aux/7972.fwnaeglingio LD_LIBRARY_PATH="/usr/mpi/intel/mvapich-1.1.0/lib/shared:/usr/mpi/intel/openmpi-1.2.8/lib64:/appserv/intel/fce/10.1.008/lib:/appserv/intel/cce/10.1.008/lib" MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_mvapich -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0
Compute rate (processor-microseconds/cell/cycle):   23.608
Total memory usage:    32753.0586 MB (32.7531 GB)
I didn’t see anything in the FAQ that discusses memory usage other than the impact of the “leave_pinned” option, which apparently does not affect the memory usage in my case. But I figure there must be a justification why OpenMPI would use 6 GB more than MVAPICH on the same case.

Try the 1.3 series; we do have a bunch of knobs in there for memory usage -- there were significant changes/advancements in the 1.3 series with regard to how OpenFabrics buffers are registered. Get a baseline on that memory usage, and then let's see what you want to do from there.
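
When comparing a new baseline against the numbers already posted, a small script can make the gap explicit. This is a minimal sketch: the totals are copied from the three runs quoted in this message, the helper name `extra_memory_mb` is made up for illustration, and the per-process figure assumes the memory is spread evenly over the 28 processes, which may not hold.

```python
# Total memory usage in MB, copied from the three runs quoted above.
runs = {
    "openmpi-1.2.8, leave_pinned on": 38155.3477,
    "openmpi-1.2.8, leave_pinned off": 38335.7656,
    "mvapich-1.1.0": 32753.0586,
}
NPROCS = 28  # -np 28 in every command line


def extra_memory_mb(total_mb, baseline_mb):
    """Return how much more memory (MB) a run used than the baseline run."""
    return total_mb - baseline_mb


baseline = runs["mvapich-1.1.0"]
for name, total in runs.items():
    extra = extra_memory_mb(total, baseline)
    # Per-process value assumes an even split across processes.
    print(f"{name}: {total:.1f} MB total, {extra:+.1f} MB vs MVAPICH, "
          f"{extra / NPROCS:+.1f} MB per process")
```

On the posted numbers the Open MPI 1.2.8 runs come out roughly 5.4 to 5.6 GB above the MVAPICH run in total, or a little under 200 MB per process under the even-split assumption. A total from a 1.3 run can be added to `runs` to see how much of that gap remains.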


--
Jeff Squyres
jsquy...@cisco.com
