David,

Thanks for the reply.  I believe dlopen and Rmpi don't get along because
Rmpi uses fork; that's a vague recollection from several years ago.  R is
pretty important for us.  I also believe that leaving dlopen enabled hits
our NFS server harder with I/O requests for the modules.

-- bennet



On Tue, Nov 14, 2017 at 11:53 AM, David Lee Braun <dlbr...@umich.edu> wrote:

> Hi Bennet,
>
> what is the issue you have with dlopen?  and what options do you use
> with mpirun's --bind-to?
>
> i think the only change i make to my openmpi compile is to add
> '--with-cuda=...' and '--with-pmi=...'
>
> D
>
> On 11/14/2017 10:01 AM, Bennet Fauber wrote:
> > We are trying SLURM for the first time, and prior to this I've always
> > built OMPI with Torque support.  I was hoping that someone with more
> > experience than I with both OMPI and SLURM might provide a bit of
> > up-front advice.
> >
> > My situation is that we are running CentOS 7.3 (soon to be 7.4) and use
> > Mellanox cards of several generations, though my systems team tells me
> > the driver version is the same everywhere.
> >
> > 82:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0
> > 5GT/s - IB QDR / 10GigE] (rev b0)
> >
> > 16:00.0 Network controller: Mellanox Technologies MT27500 Family
> > [ConnectX-3]
> >
> > We have mixed NFSv3 shared directories and a Lustre filesystem (DDN).
> >
> > In the past we had issues with `dlopen`, and we've had much grief with
> > OMPI placing jobs on processors correctly, we think because we used
> > cpusets at one point, use cgroups now, and jobs share nodes with other
> > jobs.  My previous build options were
> >
> > export CONFIGURE_FLAGS='--disable-dlopen --enable-shared'
> > export COMPILERS='CC=gcc CXX=g++ FC=gfortran F77=gfortran'
> > export COMP_NAME='gcc-4.8.5'
> > export PREFIX=/shared/nfs/directory
> >
> > ./configure \
> >     --prefix=${PREFIX} \
> >     --mandir=${PREFIX}/share/man \
> >     --with-tm \
> >     --with-verbs \
> >     $CONFIGURE_FLAGS \
> >     $COMPILERS
> >
> > Additionally, we have typically included the following lines in our
> > $PREFIX/etc/openmpi-mca-params.conf
> >
> >     orte_hetero_nodes=1
> >     hwloc_base_binding_policy=none
> >
> > Those may be there for purely historical reasons.  So far as I know,
> > there is no deterministic test recorded anywhere that would detect
> > whether they are still needed.
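> >
> > A cheap, reversible check (a sketch only, assuming a small shared-node
> > test job is available): comment the two settings out and rerun with
> > mpirun's --report-bindings option, then compare the reported bindings
> > with and without them:
> >
> >     # $PREFIX/etc/openmpi-mca-params.conf -- disabled for the test
> >     # orte_hetero_nodes=1
> >     # hwloc_base_binding_policy=none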
> >
> > For this new resource manager, I am thinking that the compiler flags
> > stay the same but that the configure line changes to
> >
> > ./configure \
> >     --prefix=${PREFIX} \
> >     --mandir=${PREFIX}/share/man \
> >     --with-slurm \
> >     --with-pmi=/usr/include/slurm \
> >     --with-verbs \
> >     $CONFIGURE_FLAGS \
> >     $COMPILERS
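> >
> > (One caveat, which I have not verified on our systems: my understanding
> > is that configure's --with-pmi wants an installation prefix rather than
> > an include directory, with --with-pmi-libdir available for when the PMI
> > libraries live elsewhere, i.e. something more like
> >
> >     --with-pmi=/usr \
> >     --with-pmi-libdir=/usr/lib64 \
> >
> > with the paths adjusted to wherever the SLURM PMI files actually are.)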
> >
> > I am curious, what file system support does --lustre-support enable?
> >
> > I will be installing three versions of OMPI to start:  1.10.7, 2.1.2,
> > and 3.0.0.  Are there changes to the configure line that are a priori
> > known to be needed?
> >
> > There are references on the FAQ and other installation notes that lead
> > me to believe they are a bit out of date, so I am asking preemptively
> > here.  Apologies if that is an incorrect assessment.
> >
> > Thanks,  -- bennet
> >
> >
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
> >
>
> --
> David Lee Braun
> Manager of Computational Facilities
> for Dr Charles L. Brooks, III Ph.D.
> 930 N. University Ave
> Chemistry 2006
> (734) 615-1450
>