David,

Thanks for the reply. I believe dlopen and Rmpi don't get along because
Rmpi uses fork. That's a vague recollection from several years ago. R is
pretty important for us.

I believe that leaving dlopen enabled also hits our NFS server harder
with I/O requests for the modules.
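(For anyone hitting the same combination, one way to confirm the fork suspicion at runtime is Open MPI's `mpi_warn_on_fork` MCA parameter, which prints a warning when an MPI process calls fork(). This is only a sketch; the R script name below is a hypothetical placeholder, and sites launch Rmpi jobs in different ways:)

```shell
# Sketch only: mpi_warn_on_fork makes Open MPI print a warning when an
# MPI process calls fork(), which can confirm whether Rmpi is the
# fork() culprit. "your_rmpi_script.R" is a placeholder file name.
mpirun --mca mpi_warn_on_fork 1 Rscript your_rmpi_script.R
```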
-- bennet

On Tue, Nov 14, 2017 at 11:53 AM, David Lee Braun <dlbr...@umich.edu> wrote:
> Hi Bennet,
>
> What is the issue you have with dlopen? And what options do you use
> with mpi --bind?
>
> I think the only change I make to my openmpi compile is to add
> '--with-cuda=...' and '--with-pmi=...'
>
> D
>
> On 11/14/2017 10:01 AM, Bennet Fauber wrote:
> > We are trying SLURM for the first time, and prior to this I've always
> > built OMPI with Torque support. I was hoping that someone with more
> > experience than I with both OMPI and SLURM might provide a bit of
> > up-front advice.
> >
> > My situation is that we are running CentOS 7.3 (soon to be 7.4). We use
> > Mellanox cards of several generations, but my systems team tells me the
> > driver version is the same everywhere.
> >
> >     82:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI
> >     PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
> >
> >     16:00.0 Network controller: Mellanox Technologies MT27500 Family
> >     [ConnectX-3]
> >
> > We have mixed NFSv3 shared directories and a Lustre filesystem (DDN).
> >
> > In the past, we had issues with using `dlopen`, and we've had much grief
> > with OMPI placing jobs on processors correctly, we think because we used
> > cpusets at one point and use cgroups now, and jobs share nodes with
> > other jobs. My previous build options were
> >
> >     export CONFIGURE_FLAGS='--disable-dlopen --enable-shared'
> >     export COMPILERS='CC=gcc CXX=g++ FC=gfortran F77=gfortran'
> >     export COMP_NAME='gcc-4.8.5'
> >     export PREFIX=/shared/nfs/directory
> >
> >     ./configure \
> >         --prefix=${PREFIX} \
> >         --mandir=${PREFIX}/share/man \
> >         --with-tm \
> >         --with-verbs \
> >         $CONFIGURE_FLAGS \
> >         $COMPILERS
> >
> > Additionally, we have typically included the following lines in our
> > $PREFIX/etc/openmpi-mca-params.conf
> >
> >     orte_hetero_nodes=1
> >     hwloc_base_binding_policy=none
> >
> > Those may be there for purely historical reasons.
> > So far as I know, there is no deterministic test recorded anywhere
> > that would detect whether those are still needed.
> >
> > For this new resource manager, I am thinking that the compiler flags
> > stay the same, but the configure line be changed to
> >
> >     ./configure \
> >         --prefix=${PREFIX} \
> >         --mandir=${PREFIX}/share/man \
> >         --with-slurm \
> >         --with-pmi=/usr/include/slurm \
> >         --with-verbs \
> >         $CONFIGURE_FLAGS \
> >         $COMPILERS
> >
> > I am curious: what file system support does --lustre-support enable?
> >
> > I will be installing three versions of OMPI to start: 1.10.7, 2.1.2,
> > and 3.0.0. Are there changes to the configure line that are a priori
> > known to be needed?
> >
> > There are references on the FAQ and other installation notes that lead
> > me to believe they are a bit out of date, so I am asking preemptively
> > here. Apologies if that is an incorrect assessment.
> >
> > Thanks,
> > -- bennet
> >
> > _______________________________________________
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
>
> --
> David Lee Braun
> Manager of Computational Facilities
> for Dr Charles L. Brooks, III Ph.D.
> 930 N. University Ave
> Chemistry 2006
> (734) 615-1450
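(Once a build like the proposed one finishes, a quick sanity check can confirm the Slurm/PMI support actually got compiled in. This is a hedged sketch, assuming `ompi_info` from the new build and Slurm's `srun` are on PATH; it prints nothing and exits cleanly on machines without either:)

```shell
# Hedged sketch: confirm Slurm/PMI support in a fresh Open MPI build.
# ompi_info echoes the configure command line and the component list,
# so grepping for slurm/pmi shows whether --with-slurm/--with-pmi stuck.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --all | grep -i -e 'command line' -e slurm -e pmi
fi
# On the Slurm side, list which PMI flavors srun can offer to MPI jobs.
if command -v srun >/dev/null 2>&1; then
    srun --mpi=list
fi
```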