Charles,

UCX has a higher priority than ob1, which is why it is selected by default 
when it is available.
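
For reference, a quick way to double check which PML is actually selected, 
and to force ob1 for an A/B comparison, is something like the following (a 
minimal sketch; replace the application and rank count with your own):

    # show the UCX PML parameters, including its selection priority
    ompi_info --param pml ucx --level 9

    # print PML selection at startup so you can confirm which one won
    mpirun --mca pml_base_verbose 10 -np 2 ./your_app

    # force the ob1 PML instead of UCX
    mpirun --mca pml ob1 -np 2 ./your_app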


If you can provide simple instructions on how to build and test one of the 
apps that exhibit the memory leak, that would greatly help us and the UCX 
folks reproduce, troubleshoot and diagnose this issue.
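
For example, the configure line you used plus something along these lines 
would be enough (a rough sketch only; the application name, input file and 
rank count below are placeholders for whatever you actually run):

    # exact run command that triggers the leak (app/input/ranks are placeholders)
    mpirun -np 64 ./arepo param.txt &
    pid=$!

    # sample the resident set size of the local ranks while the job runs;
    # a steady, unbounded climb is the behaviour we want to reproduce
    while kill -0 "$pid" 2>/dev/null; do
        ps -o pid,rss,comm -C arepo
        sleep 30
    done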


Cheers,

Gilles

----- Original Message -----
> 
> > On Oct 5, 2018, at 11:31 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> > 
> > are you saying that even if you
> > 
> >     mpirun --mca pml ob1 ...
> > 
> > (e.g. force the ob1 component of the pml framework) the memory leak is
> > still present ?
> 
> No, I do not mean to say that - at least not in the current incarnation.
> Running with the following parameters avoids the leak…
> 
>     export OMPI_MCA_pml="ob1"
>     export OMPI_MCA_btl_openib_eager_limit=1048576
>     export OMPI_MCA_btl_openib_max_send_size=1048576
> 
> as does building OpenMPI without UCX support (i.e. --without-ucx).
> 
> However, building _with_ UCX support (including the current github source)
> and running with the following parameters produces the leak (note that no
> PML was explicitly requested).
> 
>    export OMPI_MCA_oob_tcp_listen_mode="listen_thread"
>    export OMPI_MCA_btl_openib_eager_limit=1048576
>    export OMPI_MCA_btl_openib_max_send_size=1048576
>    export OMPI_MCA_btl="self,vader,openib"
> 
> The eager_limit and send_size limits are needed with this app to prevent a
> deadlock that I’ve posted about previously.
> 
> Also, explicitly requesting the UCX PML with,
> 
>  export OMPI_MCA_pml="ucx"
> 
> produces the leak.
> 
> I’m continuing to try to find exactly what I’m doing wrong to produce this
> behavior but have been unable to arrive at a solution other than excluding
> UCX, which seems like a bad idea since Jeff (Squyres) pointed out that it
> is the Mellanox-recommended way to run on Mellanox hardware.  Interestingly,
> using the UCX PML framework avoids the deadlock that results when running
> with the default parameters and not limiting the message sizes - another
> reason we’d like to be able to use it.
> 
> I can read your mind at this point - “Wow, these guys have really horked
> their cluster”.  Could be.  But we run thousands of jobs every day,
> including many other OpenMPI jobs (vasp, gromacs, raxml, lammps, namd, etc).
> Also, the users of the Arepo and Gadget code are currently running with
> MVAPICH2 without issue.  I installed it specifically to get them past these
> OpenMPI problems.  We don’t normally build anything with MPICH/MVAPICH/IMPI
> since we have never had any real reason to - until now.
> 
> That may have to be the solution, but the memory leak is so readily
> reproducible that I thought I’d ask about it.  Since it appears that others
> are not seeing this issue, I’ll continue to try to figure it out and if I
> do, I’ll be sure to post back.
> 
> > As a side note, we strongly recommend avoiding
> > configure --with-FOO=/usr
> > Instead,
> > configure --with-FOO
> > should be used (otherwise you will end up with -I/usr/include
> > -L/usr/lib64, and that could silently hide third party libraries
> > installed in a non standard directory). If --with-FOO fails for you,
> > then this is a bug and we would appreciate a report.
> 
> Noted and logged.  We’ve been using --with-FOO=/usr for a long time (since
> the 1.x days).  There was a reason we started doing it, but I’ve long since
> forgotten what it was; I think it was to _avoid_ what you describe - not
> cause it.  Regardless, I’ll heed your warning, remove it from future builds,
> and file a bug if there are any problems.
> 
> However, I did post about a similar problem previously when configuring
> against an external PMIx library.  The configure script produces (or did)
> a "-L/usr/lib" instead of a "-L/usr/lib64", resulting in unresolved PMIx
> routines when linking.  That was with OpenMPI 2.1.2.  We now include a
> lib -> lib64 symlink in our /opt/pmix/x.y.z directories, so I haven’t
> looked to see if that was fixed for 3.x or not.
> 
> I should have also mentioned in my previous post that HPC_CUDA_DIR=NO,
> meaning that CUDA support has been excluded from these builds (in case
> anyone was wondering).
> 
> Thanks for the feedback,
> 
> Charlie
> 
> > 
> > Cheers,
> > 
> > Gilles
> > On Fri, Oct 5, 2018 at 6:42 AM Charles A Taylor <chas...@ufl.edu> wrote:
> >> 
> >> 
> >> We are seeing a gaping memory leak when running OpenMPI 3.1.x (or 2.1.2,
> >> for that matter) built with UCX support.  The leak shows up whether the
> >> “ucx” PML is specified for the run or not.  The applications in question
> >> are arepo and gizmo, but I have no reason to believe that others are not
> >> affected as well.
> >> 
> >> Basically the MPI processes grow without bound until SLURM kills the job
> >> or the host memory is exhausted.
> >> If I configure and build with "--without-ucx" the problem goes away.
> >> 
> >> I didn’t see anything about this on the UCX github site so I thought I’d
> >> ask here.  Anyone else seeing the same or similar?
> >> 
> >> What version of UCX is OpenMPI 3.1.x tested against?
> >> 
> >> Regards,
> >> 
> >> Charlie Taylor
> >> UF Research Computing
> >> 
> >> Details:
> >> —————————————
> >> RHEL7.5
> >> OpenMPI 3.1.2 (and any other version I’ve tried).
> >> ucx 1.2.2-1.el7 (RH native)
> >> RH native IB stack
> >> Mellanox FDR/EDR IB fabric
> >> Intel Parallel Studio 2018.1.163
> >> 
> >> Configuration Options:
> >> —————————————————
> >> CFG_OPTS=""
> >> CFG_OPTS="$CFG_OPTS C=icc CXX=icpc FC=ifort FFLAGS=\"-O2 -g -warn -
m64\" LDFLAGS=\"\" "
> >> CFG_OPTS="$CFG_OPTS --enable-static"
> >> CFG_OPTS="$CFG_OPTS --enable-orterun-prefix-by-default"
> >> CFG_OPTS="$CFG_OPTS --with-slurm=/opt/slurm"
> >> CFG_OPTS="$CFG_OPTS --with-pmix=/opt/pmix/2.1.1"
> >> CFG_OPTS="$CFG_OPTS --with-pmi=/opt/slurm"
> >> CFG_OPTS="$CFG_OPTS --with-libevent=external"
> >> CFG_OPTS="$CFG_OPTS --with-hwloc=external"
> >> CFG_OPTS="$CFG_OPTS --with-verbs=/usr"
> >> CFG_OPTS="$CFG_OPTS --with-libfabric=/usr"
> >> CFG_OPTS="$CFG_OPTS --with-ucx=/usr"
> >> CFG_OPTS="$CFG_OPTS --with-verbs-libdir=/usr/lib64"
> >> CFG_OPTS="$CFG_OPTS --with-mxm=no"
> >> CFG_OPTS="$CFG_OPTS --with-cuda=${HPC_CUDA_DIR}"
> >> CFG_OPTS="$CFG_OPTS --enable-openib-udcm"
> >> CFG_OPTS="$CFG_OPTS --enable-openib-rdmacm"
> >> CFG_OPTS="$CFG_OPTS --disable-pmix-dstore"
> >> 
> >> rpmbuild --ba \
> >>         --define '_name openmpi' \
> >>         --define "_version $OMPI_VER" \
> >>         --define "_release ${RELEASE}" \
> >>         --define "_prefix $PREFIX" \
> >>         --define '_mandir %{_prefix}/share/man' \
> >>         --define '_defaultdocdir %{_prefix}' \
> >>         --define 'mflags -j 8' \
> >>         --define 'use_default_rpm_opt_flags 1' \
> >>         --define 'use_check_files 0' \
> >>         --define 'install_shell_scripts 1' \
> >>         --define 'shell_scripts_basename mpivars' \
> >>         --define "configure_options $CFG_OPTS " \
> >>         openmpi-${OMPI_VER}.spec 2>&1 | tee rpmbuild.log