Charles, are you saying that even if you run with

mpirun --mca pml ob1 ...

(i.e., force the ob1 component of the pml framework), the memory leak is still present?

As a side note, we strongly recommend avoiding configure --with-FOO=/usr; configure --with-FOO should be used instead (otherwise you will end up with -I/usr/include -L/usr/lib64, which could silently hide third-party libraries installed in a non-standard directory). If --with-FOO fails for you, then that is a bug and we would appreciate a report.

Cheers,

Gilles

On Fri, Oct 5, 2018 at 6:42 AM Charles A Taylor <chas...@ufl.edu> wrote:
>
> We are seeing a gaping memory leak when running OpenMPI 3.1.x (or 2.1.2, for
> that matter) built with UCX support. The leak shows up whether the “ucx” PML
> is specified for the run or not. The applications in question are arepo and
> gizmo, but I have no reason to believe that others are not affected as well.
>
> Basically the MPI processes grow without bound until SLURM kills the job or
> the host memory is exhausted.
> If I configure and build with “--without-ucx” the problem goes away.
>
> I didn’t see anything about this on the UCX github site so I thought I’d ask
> here. Anyone else seeing the same or similar?
>
> What version of UCX is OpenMPI 3.1.x tested against?
>
> Regards,
>
> Charlie Taylor
> UF Research Computing
>
> Details:
> —————————————
> RHEL7.5
> OpenMPI 3.1.2 (and any other version I’ve tried).
> ucx 1.2.2-1.el7 (RH native)
> RH native IB stack
> Mellanox FDR/EDR IB fabric
> Intel Parallel Studio 2018.1.163
>
> Configuration Options:
> —————————————————
> CFG_OPTS=""
> CFG_OPTS="$CFG_OPTS C=icc CXX=icpc FC=ifort FFLAGS=\"-O2 -g -warn -m64\" LDFLAGS=\"\" "
> CFG_OPTS="$CFG_OPTS --enable-static"
> CFG_OPTS="$CFG_OPTS --enable-orterun-prefix-by-default"
> CFG_OPTS="$CFG_OPTS --with-slurm=/opt/slurm"
> CFG_OPTS="$CFG_OPTS --with-pmix=/opt/pmix/2.1.1"
> CFG_OPTS="$CFG_OPTS --with-pmi=/opt/slurm"
> CFG_OPTS="$CFG_OPTS --with-libevent=external"
> CFG_OPTS="$CFG_OPTS --with-hwloc=external"
> CFG_OPTS="$CFG_OPTS --with-verbs=/usr"
> CFG_OPTS="$CFG_OPTS --with-libfabric=/usr"
> CFG_OPTS="$CFG_OPTS --with-ucx=/usr"
> CFG_OPTS="$CFG_OPTS --with-verbs-libdir=/usr/lib64"
> CFG_OPTS="$CFG_OPTS --with-mxm=no"
> CFG_OPTS="$CFG_OPTS --with-cuda=${HPC_CUDA_DIR}"
> CFG_OPTS="$CFG_OPTS --enable-openib-udcm"
> CFG_OPTS="$CFG_OPTS --enable-openib-rdmacm"
> CFG_OPTS="$CFG_OPTS --disable-pmix-dstore"
>
> rpmbuild --ba \
>   --define '_name openmpi' \
>   --define "_version $OMPI_VER" \
>   --define "_release ${RELEASE}" \
>   --define "_prefix $PREFIX" \
>   --define '_mandir %{_prefix}/share/man' \
>   --define '_defaultdocdir %{_prefix}' \
>   --define 'mflags -j 8' \
>   --define 'use_default_rpm_opt_flags 1' \
>   --define 'use_check_files 0' \
>   --define 'install_shell_scripts 1' \
>   --define 'shell_scripts_basename mpivars' \
>   --define "configure_options $CFG_OPTS " \
>   openmpi-${OMPI_VER}.spec 2>&1 | tee rpmbuild.log
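For anyone trying Gilles' first suggestion, a minimal sketch of the test; the application binary (./my_app) and process count are placeholders, not from the thread:

# Force the ob1 PML so the UCX PML is taken out of the data path;
# pml_base_verbose makes Open MPI log which PML component it actually selected,
# so you can confirm ob1 (and not ucx) was used while watching memory growth.
mpirun --mca pml ob1 --mca pml_base_verbose 10 -np 4 ./my_app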
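And a sketch of the configure style Gilles recommends, shown only for the three /usr prefixes in Charles' options (all other flags unchanged); whether --with-verbs-libdir is still needed is an assumption, not something stated in the thread:

# Bare --with-FOO lets configure pick up the system-installed packages without
# injecting -I/usr/include and -L/usr/lib64 into every compile and link line.
CFG_OPTS="$CFG_OPTS --with-verbs"
CFG_OPTS="$CFG_OPTS --with-libfabric"
CFG_OPTS="$CFG_OPTS --with-ucx"
# (presumably --with-verbs-libdir=/usr/lib64 is then unnecessary as well)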