Hi Bennet, Three things...
1. OpenMPI 2.x requires PMIx in lieu of pmi1/pmi2.

2. You will need Slurm 16.05 or greater built with --with-pmix.

2a. You will need PMIx 1.1.5, which you can get from GitHub
    (https://github.com/pmix/tarballs).

3. Then, to launch your MPI tasks on the allocated resources:

        srun --mpi=pmix ./hello-mpi
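Roughly, the whole sequence looks something like the sketch below. Treat it as a sketch, not a recipe: the install prefixes (/opt/pmix/1.1.5, /opt/slurm), the source locations, and the -j counts are placeholders you will need to adjust for your site, and Slurm's pmix plugin will only show up after slurmctld and slurmd have been restarted on the new build.

        # 1. Build and install PMIx 1.1.5 (prefix is just an example)
        tar xf pmix-1.1.5.tar.gz && cd pmix-1.1.5
        ./configure --prefix=/opt/pmix/1.1.5
        make -j4 && make install

        # 2. Rebuild Slurm (16.05 or newer) against that PMIx
        cd /path/to/slurm-source
        ./configure --prefix=/opt/slurm --with-pmix=/opt/pmix/1.1.5
        make -j4 && make install
        # ...then restart slurmctld and slurmd across the cluster

        # 3. Confirm the plugin is visible, and launch with it
        srun --mpi=list              # should now include mpi/pmix
        srun --mpi=pmix ./hello-mpi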
I'm replying to the list because

a) this information is harder to find than you might think, and
b) someone/anyone can correct me if I'm giving a bum steer.

Hope this helps,

Charlie Taylor
University of Florida

> On Nov 16, 2017, at 10:34 AM, Bennet Fauber <ben...@umich.edu> wrote:
>
> I think that OpenMPI is supposed to support SLURM integration such that
>
>     srun ./hello-mpi
>
> should work? I built OMPI 2.1.2 with
>
>     export CONFIGURE_FLAGS='--disable-dlopen --enable-shared'
>     export COMPILERS='CC=gcc CXX=g++ FC=gfortran F77=gfortran'
>
>     CMD="./configure \
>         --prefix=${PREFIX} \
>         --mandir=${PREFIX}/share/man \
>         --with-slurm \
>         --with-pmi \
>         --with-lustre \
>         --with-verbs \
>         $CONFIGURE_FLAGS \
>         $COMPILERS"
>
> I have a simple hello-mpi.c (source included below), which compiles
> and runs with mpirun, both on the login node and in a job. However,
> when I try to use srun in place of mpirun, I instead get a hung job,
> which upon cancellation produces this output:
>
>     [bn2.stage.arc-ts.umich.edu:116377] PMI_Init [pmix_s1.c:162:s1_init]: PMI is not initialized
>     [bn1.stage.arc-ts.umich.edu:36866] PMI_Init [pmix_s1.c:162:s1_init]: PMI is not initialized
>     [warn] opal_libevent2022_event_active: event has no event_base set.
>     [warn] opal_libevent2022_event_active: event has no event_base set.
>     slurmstepd: error: *** STEP 86.0 ON bn1 CANCELLED AT 2017-11-16T10:03:24 ***
>     srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
>     slurmstepd: error: *** JOB 86 ON bn1 CANCELLED AT 2017-11-16T10:03:24 ***
>
> The SLURM web page suggests that OMPI 2.x and later support PMIx and
> that one should use `srun --mpi=pmix`; however, that no longer seems
> to be an option, and using the `openmpi` type isn't working (neither
> is pmi2).
>
>     [bennet@beta-build hello]$ srun --mpi=list
>     srun: MPI types are...
>     srun: mpi/pmi2
>     srun: mpi/lam
>     srun: mpi/openmpi
>     srun: mpi/mpich1_shmem
>     srun: mpi/none
>     srun: mpi/mvapich
>     srun: mpi/mpich1_p4
>     srun: mpi/mpichgm
>     srun: mpi/mpichmx
>
> To get the Intel PMI to work with srun, I have to set
>
>     I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
>
> Is there a comparable environment variable that must be set to enable
> `srun` to work?
>
> Am I missing a build option or misspecifying one?
>
> -- bennet
>
>
> Source of hello-mpi.c
> ==========================================
> #include <stdio.h>
> #include <stdlib.h>
> #include "mpi.h"
>
> int main(int argc, char **argv){
>
>     int rank;           /* rank of process */
>     int numprocs;       /* size of COMM_WORLD */
>     int namelen;
>     int tag = 10;       /* expected tag */
>     int message;        /* Recv'd message */
>     char processor_name[MPI_MAX_PROCESSOR_NAME];
>     MPI_Status status;  /* status of recv */
>
>     /* call Init, size, and rank */
>     MPI_Init(&argc, &argv);
>     MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Get_processor_name(processor_name, &namelen);
>
>     printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);
>
>     if(rank != 0){
>         MPI_Recv(&message,        /* buffer for message  */
>                  1,               /* MAX count to recv   */
>                  MPI_INT,         /* type to recv        */
>                  0,               /* recv from 0 only    */
>                  tag,             /* tag of message      */
>                  MPI_COMM_WORLD,  /* communicator to use */
>                  &status);        /* status object       */
>         printf("Hello from process %d!\n", rank);
>     }
>     else{
>         /* rank 0 ONLY executes this */
>         printf("MPI_COMM_WORLD is %d processes big!\n", numprocs);
>         int x;
>         for(x = 1; x < numprocs; x++){
>             MPI_Send(&x,              /* send x to process x */
>                      1,               /* number to send      */
>                      MPI_INT,         /* type to send        */
>                      x,               /* rank to send to     */
>                      tag,             /* tag for message     */
>                      MPI_COMM_WORLD); /* communicator to use */
>         }
>     } /* end else */
>
>     /* always call at end */
>     MPI_Finalize();
>
>     return 0;
> }
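One more thought on the configure line quoted above: if you go the PMIx route, I believe you also want Open MPI itself built against the same PMIx that Slurm was built with, rather than only --with-pmi, since mixing PMIx versions between the two is an easy way to get components that don't talk to each other. The sketch below is a starting point only: the PMIx path is the same placeholder as before, and it is worth checking `./configure --help` on your exact 2.x tarball to confirm how it spells the option.

        ./configure \
            --prefix=${PREFIX} \
            --mandir=${PREFIX}/share/man \
            --with-slurm \
            --with-pmix=/opt/pmix/1.1.5 \
            --with-lustre \
            --with-verbs \
            $CONFIGURE_FLAGS \
            $COMPILERS

With Slurm's pmix plugin in place you shouldn't need an I_MPI_PMI_LIBRARY-style environment variable at all; `srun --mpi=pmix ./hello-mpi` inside an allocation should be enough.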
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users