Hi Gus,

For single-node runs, don't bother specifying the btl; Open MPI should select
the best option on its own.
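
For example (just a sketch; it assumes xhpl and HPL.dat sit in the working
directory):

   mpiexec -np 8 ./xhpl                      # let Open MPI pick the btl
   mpiexec -np 8 -mca btl sm,self ./xhpl     # or pin it to shared memory + self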

Beyond that, the "80% of total RAM" recommendation is misleading: base your N
on MemFree rather than MemTotal, since the IB stack can reserve quite a bit of
memory.  Also verify that your /etc/security/limits.conf allows sufficient
memory locking (try unlimited).
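
Something along these lines (the "@hpc" group below is only a placeholder for
whatever group your jobs actually run under):

   # /etc/security/limits.conf -- allow enough locked (registered) memory for IB
   @hpc    soft    memlock    unlimited
   @hpc    hard    memlock    unlimited

   # verify from inside the job environment
   ulimit -l

   # size N from free memory rather than total:
   #   N ~= sqrt( 0.80 * MemFree_in_bytes / 8 )
   grep MemFree /proc/meminfo

With the IB stack holding registered memory, sizing from MemFree will usually
land you somewhat below the N=40,000 you get from MemTotal.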

Finally, P should be smaller than (or equal to) Q, and grids as close to
square as possible are recommended: for 8 cores that means P=2, Q=4 rather
than the 8x1 grid in your output.
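
If memory serves, the relevant stanza in HPL.dat looks roughly like this (the
trailing comments are the stock ones from the template file):

   1            # of process grids (P x Q)
   2            Ps
   4            Qs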

With Shanghai, Open MPI, and GotoBLAS, expect single-node efficiency of at
least 85% given decent tuning.  If the memory distribution across processes
continues to look strange, there are more things to check.

Thanks, Jacob

> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Gus Correa
> Sent: Friday, May 01, 2009 12:17 PM
> To: Open MPI Users
> Subject: [OMPI users] HPL with OpenMPI: Do I have a memory leak?
> 
> Hi OpenMPI and HPC experts
> 
> This may or may not be the right forum to post this,
> and I am sorry to bother those that think it is not.
> 
> I am trying to run the HPL benchmark on our cluster,
> compiling it with Gnu and linking to
> GotoBLAS (1.26) and OpenMPI (1.3.1),
> both also Gnu-compiled.
> 
> I have got failures that suggest a memory leak when the
> problem size is large, but still within the memory limits
> recommended by HPL.
> The problem only happens when "openib" is among the OpenMPI
> MCA parameters (and the problem size is large).
> Any help is appreciated.
> 
> Here is a description of what happens.
> 
> For starters I am trying HPL on a single node, to get a feeling for
> the right parameters (N & NB, P & Q, etc.) on a dual-socket quad-core
> AMD Opteron 2376 "Shanghai" node.
> 
> The HPL recommendation is to use close to 80% of your physical memory,
> to reach top Gigaflop performance.
> Our physical memory on a node is 16GB, and this gives a problem size
> N=40,000 to keep the 80% memory use.
> I tried several block sizes, somewhat correlated to the size of the
> processor cache:  NB=64 80 96 128 ...
> 
> When I run HPL with N=20,000 or smaller all works fine,
> and the HPL run completes, regardless of whether "openib"
> is present or not on my MCA parameters.
> 
> However, when I move to N=40,000, or even N=35,000,
> the run starts OK with NB=64,
> but as NB is switched to larger values
> the total memory use increases in jumps (as shown by Ganglia),
> and becomes uneven across the processors (as shown by "top").
> The problem happens if "openib" is among the MCA parameters,
> but doesn't happen if I remove "openib" from the MCA list and use
> only "sm,self".
> 
> For N=35,000, when NB reaches 96 memory use is already above the
> physical limit
> (16GB), having increased from 12.5GB to over 17GB.
> For N=40,000 the problem happens even earlier, with NB=80.
> At this point memory swapping kicks in,
> and eventually the run dies with memory allocation errors:
> 
> ================================================================================
> T/V                N    NB     P     Q               Time             Gflops
> --------------------------------------------------------------------------------
> WR01L2L4       35000   128     8     1             539.66          5.297e+01
> --------------------------------------------------------------------------------
> ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0043992 ...... PASSED
> HPL ERROR from process # 0, on line 172 of function HPL_pdtest:
>  >>> [7,0] Memory allocation failed for A, x and b. Skip. <<<
> ...
> 
> ***
> 
> The corresponding code snippet in HPL_pdtest.c is this,
> although the leak is probably somewhere else:
> 
> /*
>   * Allocate dynamic memory
>   */
>     vptr = (void*)malloc( ( (size_t)(ALGO->align) +
>                             (size_t)(mat.ld+1) * (size_t)(mat.nq) ) *
>                           sizeof(double) );
>     info[0] = (vptr == NULL); info[1] = myrow; info[2] = mycol;
>     (void) HPL_all_reduce( (void *)(info), 3, HPL_INT, HPL_max,
>                            GRID->all_comm );
>     if( info[0] != 0 )
>     {
>        if( ( myrow == 0 ) && ( mycol == 0 ) )
>           HPL_pwarn( TEST->outfp, __LINE__, "HPL_pdtest",
>                      "[%d,%d] %s", info[1], info[2],
>                      "Memory allocation failed for A, x and b. Skip." );
>        (TEST->kskip)++;
>        return;
>     }
> 
> ***
> 
> I found this continued increase in memory use rather strange,
> and suggestive of a memory leak in one of the codes being used.
> 
> Everything (OpenMPI, GotoBLAS, and HPL)
> was compiled using Gnu only (gcc, gfortran, g++).
> 
> I haven't changed anything on the compiler's memory model,
> i.e., I haven't used or changed the "-mcmodel" flag of gcc
> (I don't know if the Makefiles on HPL, GotoBLAS, and OpenMPI use it.)
> 
> No additional load is present on the node,
> other than the OS (Linux CentOS 5.2), HPL is running alone.
> 
> The cluster has Infiniband.
> However, I am running on a single node.
> 
> The surprising thing is that if I run on shared memory only
> (-mca btl sm,self) there is no memory problem,
> the memory use is stable at about 13.9GB,
> and the run completes.
> So, there is a workaround for running on a single node.
> (Actually shared memory is presumably the way to go on a single node.)
> 
> However, if I introduce IB (-mca btl openib,sm,self)
> among the MCA btl parameters, then memory use blows up.
> 
> This is bad news for me, because I want to extend the experiment
> to run HPL also across the whole cluster using IB,
> which is actually the ultimate goal of HPL, of course!
> It also suggests that the problem is somehow related to Infiniband,
> maybe hidden under OpenMPI.
> 
> Here is the mpiexec command I use (with and without openib):
> 
> /path/to/openmpi/bin/mpiexec \
>          -prefix /the/run/directory \
>          -np 8 \
>          -mca btl [openib,]sm,self \
>          xhpl
> 
> 
> Any help, insights, suggestions, reports of previous experiences,
> are much appreciated.
> 
> Thank you,
> Gus Correa
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
